
Mailing List Archive: exim: users

Delays, timeouts, and conflict

 

 



jethro.binks at strath

Jun 19, 2012, 5:37 AM

Post #1 of 6 (303 views)
Delays, timeouts, and conflict

I've been having an ongoing discussion with a supplier about their
inability to send us email.

Here's a summary of one of the transactions:

2012-05-31T18:47:44+01:00 SMTP connection from [216.70.64.64]:48322
I=[130.159.16.101]:25 (TCP/IP connection count = 79)

2012-05-31T18:48:31+01:00 Noting recipient:
rcpt=<...@strath.ac.uk> host=216.70.64.64
hostname=mailout28.mail01.mtsvc.net helo=n08.mail01.mtsvc.net
sender=<...>

(Extra log message we add when recipient sent)

2012-05-31T18:49:21+01:00 SMTP connection from mailout28.mail01.mtsvc.net
(n08.mail01.mtsvc.net) [216.70.64.64]:48322 I=[130.159.16.101]:25 lost
while reading message data (header)


We impose various delays in the SMTP transactions. To my reading of these
log messages, the remote side is dropping the connection some time after
sending the first RCPT, but before the DATA phase is complete.
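
The gaps between the three log lines above can be worked out directly from their timestamps. A minimal sketch; the event labels are just descriptive names chosen here, and only the timestamps come from the log:

```python
from datetime import datetime

# Timestamps copied from the log excerpt; the labels are illustrative only.
events = [
    ("connect", "2012-05-31T18:47:44+01:00"),
    ("first RCPT noted", "2012-05-31T18:48:31+01:00"),
    ("connection lost", "2012-05-31T18:49:21+01:00"),
]

def elapsed_seconds(events):
    """Return (label, seconds since first event, seconds since previous event)."""
    times = [datetime.fromisoformat(ts) for _, ts in events]
    rows = []
    for i, (label, _) in enumerate(events):
        since_start = (times[i] - times[0]).total_seconds()
        since_prev = (times[i] - times[i - 1]).total_seconds() if i else 0.0
        rows.append((label, since_start, since_prev))
    return rows

for label, since_start, since_prev in elapsed_seconds(events):
    print(f"{label:18s} +{since_start:.0f}s (step {since_prev:.0f}s)")
```

So the recipient was noted 47s after the connection opened, and the connection was lost 50s after that, 97s in total.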

The sending server is apparently Exim, according to the support staff for
the providers (mtsvc.net): "Unfortunately, we have no solution when
sending email to "strath.ac.uk" email servers from your (gs) Grid-Service.
I have not seen this problem happen to any other customers on the (gs)
Grid-Service and when connecting to the server using basic telnet
commands, the response time from their server is very slow. I understand
they implemented policies to combat spam, however our Exim configurations
are standard and as I stated working for thousands and thousands of users
on our shared email product.".

I have:

DELAY_SMTP_RCPT_FIRST = 30s
DELAY_SMTP_DATA = 30s

which are implemented near the start of the relevant ACLs (hosts in dnswl
are exempt from the DATA delay)

So I have a couple of questions really:

1. is my reading of the reason for the failure to transfer correct, or
could there be another cause that requires deeper investigation?

2. do other sites see issues from this particular set of sources,
particularly if you are imposing delays?

The fine manual found for me the settings which the remote host might have
adjusted (smtp transport, "data_timeout", "command_timeout", etc), so I
guess I could ask for them to report what those values are set to, if they
will be willing to disclose them. It may be that they are not aware that
these parameters are adjusted from the defaults, but I'm wary that I've
overlooked something.
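
As a rough sanity check, the receiver-side delays can be compared with the sender-side timeouts. The sketch below assumes the remote smtp transport really is at the defaults documented in the Exim specification (5m for both "command_timeout" and "data_timeout"); the actual values on the remote host are unknown, and the helper names are made up for illustration:

```python
import re

# Seconds per unit in Exim-style time values (w, d, h, m, s).
_UNITS = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}

def exim_time_to_seconds(text):
    """Convert an Exim-style time string such as '30s' or '1h30m' to seconds."""
    parts = re.findall(r"(\d+)([wdhms])", text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"not an Exim time value: {text!r}")
    return sum(int(n) * _UNITS[u] for n, u in parts)

# Receiver-side delays quoted in the post.
delays = {"DELAY_SMTP_RCPT_FIRST": "30s", "DELAY_SMTP_DATA": "30s"}

# ASSUMPTION: documented smtp transport defaults, not the remote host's real settings.
assumed_sender_timeouts = {"command_timeout": "5m", "data_timeout": "5m"}

def delay_fits(delay, timeout):
    """True if the imposed delay is shorter than the sender's timeout."""
    return exim_time_to_seconds(delay) < exim_time_to_seconds(timeout)

for name, delay in delays.items():
    for option, timeout in assumed_sender_timeouts.items():
        print(name, delay, "<", option, timeout, "->", delay_fits(delay, timeout))
```

If the remote side really is at those defaults, a 30s delay should be nowhere near enough to trigger a timeout on its own.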

Anyway, my choices appear to be to convince them that their Exim is
ill-configured and push them in the direction of how to fix it, if that's
what it is, or make a special exemption for the sending hosts. This
latter course of action I always consider a last resort, at least when I
have some sort of standards-based reason to support me for refusing to do
so. Which is why I need a sanity check on my logic!

Jethro.

. . . . . . . . . . . . . . . . . . . . . . . . .
Jethro R Binks, Network Manager,
Information Services Directorate, University Of Strathclyde, Glasgow, UK

The University of Strathclyde is a charitable body, registered in
Scotland, number SC015263.

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


wbh at conducive

Jun 19, 2012, 7:18 AM

Post #2 of 6 (290 views)
Re: Delays, timeouts, and conflict [In reply to]

Jethro R Binks wrote:

>
> I have:
>
> DELAY_SMTP_RCPT_FIRST = 30s
> DELAY_SMTP_DATA = 30s
>

Having started with progressive delays that ran to multiple minutes,
we eventually found that a mere 13s delay was enough: well over 90% of
all 'bots that were going to abandon of their own volition AT ALL did
so around the 11-12s mark.

Cheap and cheerful to see if halving your settings to 15s fixes your
problem w/o further ado or increase in garbage.

Oh - and we didn't apply ANY delay unless something already had
presented dodgy creds or smelt of phish or such, so our delays were
embedded in various acl test clauses, never in MAIN.

Dunno why one would want to load up one's own server resources 100% of
the time without looking for a reason to trigger that. OTOH, I can't
grok the rationale for greylimping, either.

Bill
--
韓家標



jgh at wizmail

Jun 19, 2012, 7:19 AM

Post #3 of 6 (291 views)
Re: Delays, timeouts, and conflict [In reply to]

On 19/06/2012 13:37, Jethro R Binks wrote:
> Anyway, my choices appear to be to convince them that their Exim is
> ill-configured and push them in the direction of how to fix it, if that's
> what it is, or make a special exemption for the sending hosts. This
> latter course of action I always consider a last resort, at least when I
> have some sort of standards-based reason to support me for refusing to do
> so. Which is why I need a sanity check on my logic!

From your logs the connection is being dropped less than two minutes in.
If their logs show similar (can you find out?) they're well outside
the suggestion of the RFCs....
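
For reference, a small sketch comparing the observed connection lifetime with the minimum per-command client timeouts that RFC 5321 section 4.5.3.2 says a sender SHOULD allow. The figures are transcribed from the RFC and are worth checking against its text; the function name is made up for illustration:

```python
# Minimum client-side timeouts from RFC 5321 section 4.5.3.2, in seconds.
RFC5321_MIN_TIMEOUTS = {
    "MAIL": 300,
    "RCPT": 300,
    "DATA initiation": 120,
    "data block": 180,
    "DATA termination": 600,
}

def phases_cut_short(observed_wait_seconds):
    """List the phases whose RFC minimum exceeds the observed wait."""
    return [
        phase
        for phase, minimum in RFC5321_MIN_TIMEOUTS.items()
        if observed_wait_seconds < minimum
    ]

# The whole connection in the log lasted 97s, so no single wait was longer.
print(phases_cut_short(97))
```

Since the entire session lasted 97s, whichever wait the sender gave up on was shorter than every one of these minimums.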
--
Cheers,
Jeremy



D.H.Davis at bath

Jun 19, 2012, 7:24 AM

Post #4 of 6 (291 views)
Re: Delays, timeouts, and conflict [In reply to]

On Tue, 19 Jun 2012, Jethro R Binks wrote:

> From: Jethro R Binks <jethro.binks [at] strath>
> To: exim-users [at] exim
> Date: Tue, 19 Jun 2012 13:37:08
> Subject: [exim] Delays, timeouts, and conflict
>
> I've been having an ongoing discussion with a supplier about their
> inability to send us email.
>
> Here's a summary of one of the transactions:

...

*Extremely* wild stab in the dark here. Try reducing the MTU on
your mail relay(s) slightly. Say from 1500 to something like 1400
to 1450.

Saw something like what you describe several years ago with getting
email to the NHS. Long delays, timeouts, dropped connections
etc. It was suggested to me that something in the connection
route had broken path MTU discovery. Reducing the MTU was a great
improvement.
--
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
D.H.Davis [at] bath Phone: +44 1225 386101



graeme at graemef

Jun 19, 2012, 7:44 AM

Post #5 of 6 (291 views)
Re: Delays, timeouts, and conflict [In reply to]

On Tue, 2012-06-19 at 15:24 +0100, Dennis Davis wrote:
> *Extremely* wild stab in the dark here. Try reducing the MTU on
> your mail relay(s) slightly. Say from 1500 to something like 1400
> to 1450.

Very good call. Prior to DATA, most transaction packets are far smaller
than the MTU of most links in the chain. Post DATA, for messages where
the total data size exceeds 1500 or so bytes, they are likely to be
larger. If there's a device inline which is breaking PMTUd, or a poorly
configured or maintained firewall (in the case of the NHS that was my
experience) then sliding your MTU down might well help.
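
The size argument can be sketched numerically. This assumes IPv4 and TCP headers of 20 bytes each with no options (TCP timestamps, for example, would shave off a few more bytes); the payload sizes are made-up examples, not measurements from this thread:

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options

def mss(mtu):
    """Largest TCP payload that fits in one packet at the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER

def segments_needed(payload_bytes, mtu=1500):
    """Number of TCP segments needed to carry the payload."""
    return max(1, math.ceil(payload_bytes / mss(mtu)))

def has_full_size_packet(payload_bytes, mtu=1500):
    """True if at least one packet will be as large as the MTU."""
    return payload_bytes >= mss(mtu)

# A short SMTP command fits in one small packet; a 5000-byte message does not.
for label, size in [("RCPT command", 40), ("message body", 5000)]:
    print(label, segments_needed(size), has_full_size_packet(size))
```

Only the DATA phase produces full-size packets, which is where a broken PMTUd path would start to bite. Lowering the MTU on the receiving host also lowers the MSS it advertises, so the sender builds smaller segments.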

Graeme




jethro.binks at strath

Jun 21, 2012, 2:41 AM

Post #6 of 6 (283 views)
Re: Delays, timeouts, and conflict [In reply to]

On Tue, 19 Jun 2012, Graeme Fowler wrote:

> On Tue, 2012-06-19 at 15:24 +0100, Dennis Davis wrote:
> > *Extremely* wild stab in the dark here. Try reducing the MTU on
> > your mail relay(s) slightly. Say from 1500 to something like 1400
> > to 1450.
>
> Very good call. Prior to DATA, most transaction packets are far smaller
> than the MTU of most links in the chain. Post DATA, for messages where
> the total data size exceeds 1500 or so bytes, they are likely to be
> larger. If there's a device inline which is breaking PMTUd, or a poorly
> configured or maintained firewall (in the case of the NHS that was my
> experience) then sliding your MTU down might well help.

It is a good call, and I remember that incident.

I also recently read a post from TF about an investigation he did ... here
we are:

http://fanf.livejournal.com/74849.html

Spookily familiar (and I liked the second comment).

I did some ping -D with large packets to one of their outbound mailservers
and was successful up to a size of 1472, matching the 1500 interface MTU.
In the meantime, I've asked if they can get their provider to supply any
timeout settings on their smtp transport, although they claim to be using
"standard" Exim settings.
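
The 1472 figure follows from the header overhead: the ping size is the ICMP payload, to which an 8-byte ICMP header and a 20-byte IPv4 header (assuming no IP options) are added. A quick check, with function names chosen here for illustration:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header

def max_ping_payload(mtu):
    """Largest ping payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER

def packet_size(ping_payload):
    """Total IPv4 packet size for a given ping payload."""
    return ping_payload + ICMP_HEADER + IPV4_HEADER

print(max_ping_payload(1500))  # largest payload at a 1500-byte MTU
print(packet_size(1472))       # on-the-wire size of the largest successful ping
```

The same arithmetic gives the payload to test with after lowering the MTU, e.g. 1372 for an MTU of 1400.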

If they really are, then I'll either look at reducing the delay time or
tweaking the interface MTU a bit.

Thanks for the suggestions,

Jethro.


