
jens.hoffrichter at gmail
Aug 27, 2008, 9:43 AM
Retry time not reached for any host after a long failure period
Hello all,

I'm having a pretty weird problem with exim that I can't find a solution for. Similar things have popped up a couple of times on the mailing list, but nothing matched the exact symptoms I'm experiencing.

I'm doing a general overhaul of a larger mail system, with several smtp in servers and 5 cyrus pop3/imap backend servers. Up to now, the delivery of mail to the backends was handled by a local "deliver" process, which used a replicated cyrus murder database to determine the backend to deliver to, and sent the mail via lmtpproxyd to the responsible backend. For various reasons, we need to change that and have exim deliver directly to the backends using lmtp: we want to gradually replace the smtp in servers with new hardware and newer distributions, and the cyrus on the new in server just isn't compatible with the old backend servers.

When I switch exim to delivering directly via lmtp, everything seems to work fine, except that some messages are bounced immediately, without exim even having tried to deliver them to the backend.
Here is a relevant excerpt from the logfile, with some data anonymized:

    2008-08-27 01:00:02 1KY7W6-00049z-6S <= root [at] webserver10 H=(webserver10.xxxxxx) [xxx.xxx.174.206] P=esmtp S=1249 id=20080826230001.D220E469D70 [at] webserver10
    2008-08-27 01:00:02 1KY7W6-00049z-6S ** xx123456 [at] lilzmailbe04 <user [at] yyyy> R=loadbalancer_final T=remote_lmtp_delivery: retry time not reached for any host after a long failure period
    2008-08-27 01:00:02 1KY7W6-00049z-6S ** xx456789 [at] lilzmailbe02 <user [at] yyyy> R=loadbalancer_final T=remote_lmtp_delivery: retry time not reached for any host after a long failure period
    2008-08-27 01:00:02 1KY7W6-0004A5-F7 <= <> R=1KY7W6-00049z-6S U=exim P=local S=2363
    2008-08-27 01:00:02 1KY7W6-00049z-6S Completed
    2008-08-27 01:00:02 1KY7W6-0004A5-F7 => root [at] webserver10 R=outgoing_route T=remote_smtp H=smtp.liwest.at [212.33.55.20] X=TLSv1:AES256-SHA:256
    2008-08-27 01:00:02 1KY7W6-0004A5-F7 Completed

The relevant routers and transport from the config file (address_data is filled from an ldap query in an earlier router, and is correct):

    loadbalancer:
      driver = redirect
      redirect_router = loadbalancer_final
      local_part_suffix = +*
      local_part_suffix_optional
      condition = ${lookup{${extract {mailHost} {$address_data}}}lsearch{/etc/exim/backends}}
      data = ${extract {uid} {$address_data}}@${extract {mailHost} {$address_data}}

    loadbalancer_final:
      driver = accept
      condition = ${if def:address_data{yes}{no} }
      transport = remote_lmtp_delivery

    begin transports

    remote_lmtp_delivery:
      driver = smtp
      protocol = lmtp
      port = 2003
      hosts = $domain
      gethostbyname = true

I've cut out the parts of the config that are not used in this example; there is another router between the two, and more transports, of course.

In the backends file there is a line like "lilzmailbe01.liwest.at : OK" for each backend, so the condition matches.
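For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch that groups the "**" bounce lines by message ID, so it is easy to see how many recipients of each message were hit. It assumes log lines shaped like the excerpt above; the function name `retry_bounces` and the 6-6-2 message ID pattern are my own assumptions based on the sample IDs, not anything exim provides.

```python
import re
from collections import defaultdict

# Phrase copied from the log excerpt above.
RETRY_PHRASE = "retry time not reached for any host"

# "<date> <time> <message-id> ** <rest>", as in the excerpt.
# The 6-6-2 ID shape is an assumption taken from the sample IDs.
BOUNCE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} "
    r"(?P<msgid>[0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}) "
    r"\*\* (?P<rest>.*)$"
)


def retry_bounces(lines):
    """Return {message_id: [recipient, ...]} for bounces with RETRY_PHRASE."""
    bounced = defaultdict(list)
    for line in lines:
        match = BOUNCE_RE.match(line.rstrip("\n"))
        if match is None or RETRY_PHRASE not in match.group("rest"):
            continue
        # Everything before the router field (" R=") is the recipient part.
        recipient = match.group("rest").split(" R=", 1)[0]
        bounced[match.group("msgid")].append(recipient)
    return dict(bounced)
```

The function accepts any iterable of lines, so an open log file can be passed in directly (the path to the main log differs between installations). A message ID that maps to more than one recipient is a multi-recipient message that was bounced this way.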
I wanted to use ${filter on a list, but some of the exim installations are so old that it isn't implemented there yet, so I had to fall back to the file variant.

The really strange thing is that delivery to both of the hosts mentioned above worked fine before and after; just that message (and a couple more) got bounced. One thing I noticed, though, is that it only happened to mails which had more than one RCPT TO: (mainly newsletters delivered to the mail system).

The exim version on the particular host where this happened is 4.66.

I'm really at a loss as to what is happening here, and I can't figure out where the problem is. I hope someone has an idea; any input is greatly appreciated.

Thanks to all in advance,
Jens

--
## List details at http://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/