Mailing List Archive: exim: users

SMTP command timeout on connection - how to troubleshoot

 

 



scott at qth

Feb 17, 2012, 1:11 PM

Post #1 of 22 (4260 views)
Permalink
SMTP command timeout on connection - how to troubleshoot

One particular remote mail server seems to be having problems delivering
mail to one of my local servers running Exim. The server reports that it's
running "EdgeWave mag2700" if this matters.

I have in my Exim config:

smtp_receive_timeout = 120s
log_selector = +subject +arguments +received_recipients

Here is a log snippet, redacted:

2012-02-17 14:22:52 H=sbox.example.net [1.2.3.4] Warning: Sender rate 1.3 / 1h
2012-02-17 14:22:53 1RyUKT-0003th-B6 <= jspringsteen [at] example H=sbox.example.net [1.2.3.4] P=esmtp S=43627 id=00e901ccedb1$f272dcf0$d75896d0$@example.net T="test 1A" for jthein [at] myuser
2012-02-17 14:24:53 SMTP command timeout on connection from sbox.example.net [1.2.3.4]
2012-02-17 14:24:53 H=sbox.example.net [1.2.3.4] Warning: "Connection Ratelimit - sbox.example.net [1.2.3.4] because of notquit: command-timeout (1.3/1h max:1.2)"

Note that even though it says "connection ratelimit", I have this server
set to disregard ratelimits, for troubleshooting purposes, so we are
continuing to accept connections from them, with no rate limit.

You can see that the remote server connects, provides the email headers,
then seems to do nothing for 120 seconds, at which time we disconnect them.

What are my options for troubleshooting? Of course the remote mail server
admin feels that it's OUR problem. It's hard to refute that, when I have
2977 other timeout messages in my log (this 1.2.3.4 server accounts for 146
of them). Of course, probably a lot of those are spammers.
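
To put counts like these on a firmer footing, the timeout lines can be tallied per remote host. What follows is only a minimal Python sketch, not an Exim tool: the name `count_timeouts` is made up for illustration, and the pattern assumes the usual one-event-per-line mainlog format seen in the snippet above (unwrapped), with or without spaces inside the brackets.

```python
import re
from collections import Counter

# Matches e.g. "... SMTP command timeout on connection from host.example [1.2.3.4]"
# The host name is optional because Exim logs only "[ip]" when it has no name.
TIMEOUT_RE = re.compile(
    r"SMTP command timeout on connection from "
    r"(?:(?P<host>\S+) )?\[\s*(?P<ip>[^\]\s]+)\s*\]"
)


def count_timeouts(lines):
    """Return a Counter keyed by (host, ip) for every command-timeout line."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            host = match.group("host") or "(no rDNS)"
            counts[(host, match.group("ip"))] += 1
    return counts
```

Feeding it an open log file, for example `count_timeouts(open("/var/log/exim/mainlog"))` (the path is an assumption and varies by distribution), and then calling `.most_common(20)` on the result lists the worst offenders first, which makes it easier to separate real mail servers from likely zombies.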

- Scott
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


jgh at wizmail

Feb 17, 2012, 2:05 PM

Post #2 of 22 (4218 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

On 2012-02-17 21:11, Scott Neader wrote:
> EdgeWave mag2700

You don't say what your system is running though.
Can you run wireshark, and grab a failing smtp conversation?
--
Jeremy



scott at qth

Feb 17, 2012, 2:08 PM

Post #3 of 22 (4221 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

I was able to get the remote mail server admin to send me a packet capture
in .pcap format (if anyone wants to see it, I'd be glad to share, nothing
confidential in the cap).

What I see is that our Exim server sends the "250 OK id=xxxxxxx" message
just fine, and within a few ms, their server sends an ACK packet.

Here's the funny part... 120 seconds later, my Exim server sends a "421
my.servername.net: SMTP command timeout - closing connection" packet, and
their server sends the ACK for that also, then my server sends the "FIN,
ACK", and their server sends the ACK, and the connection is closed.

Any ideas what is going on?

- Scott



scott at qth

Feb 17, 2012, 2:12 PM

Post #4 of 22 (4218 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

I am running Exim 4.69.

I will see if I can get a cap from my side. Not sure if the cap from the
remote side is useful, but I have some of those.

- Scott



graeme at graemef

Feb 17, 2012, 2:20 PM

Post #5 of 22 (4218 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

Make sure there are no Cisco PIX or ASA devices with "smtp fixup" or "inspect smtp" switched on between you and the remote site.

Graeme




wbh at conducive

Feb 18, 2012, 2:51 AM

Post #6 of 22 (4207 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

Scott Neader wrote:
> I was able to get the remote mail server admin to send me a packet capture
> in .pcap format (if anyone wants to see it, I'd be glad to share, nothing
> confidential in the cap).
>
> What I see is that our Exim server sends the "250 OK id=xxxxxxx" message
> just fine, and within a few ms, their server sends an ACK packet.
>
> Here's the funny part... 120 seconds later, my Exim server sends a "421
> my.servername.net: SMTP command timeout - closing connection" packet, and
> their server sends the ACK for that also, then my server sends the "FIN,
> ACK", and their server sends the ACK, and the connection is closed.
>
> Any ideas what is going on?
>
> - Scott

Set your timeout to 4 minutes instead of two minutes and see what, if
anything, changes.

My 'SWAG' is that, '250 OK' having been conveyed, a 'QUIT' will
transpire *before* your timeout arrives, all-hands will stand-down
normally, and the time-on-teat won't be enough longer to matter.

IOW you won't actually reach the 4 minutes, but at least will not have
confused the issue with an extraneous time-out message.

Bill
--
韓家標



scott at qth

Feb 20, 2012, 9:54 PM

Post #7 of 22 (4195 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

> Set your timeout to 4 minutes instead of two minutes and see what, if
> anything, changes.
>

I changed the timeout to 240 seconds, with no change... it just hangs for
240 seconds after I send the 250 OK, and then I disconnect due to timeout.

> Make sure there are no Cisco PIX or ASA devices with "smtp fixup" or
> "inspect smtp" switched on between you and the remote site.


I'm told they do not have a PIX or ASA, but they do have a series 7200
router... it is capable of "smtp fixup", so I am asking them to ask their
router folks if it is enabled.

As another data point... I remembered having this problem with another ISP
a few months back. I just telnet'd to port 25 on their server and guess
what popped up in their initial greeting?:


> ESMTP EdgeWave mag4000


Yep... so here we have two of these "EdgeWave" mail servers that I can't
get mail from (but I can send them mail fine).

Anyone out there interested in looking at the packet capture .pcap file I
have?

I will try to look through my logs for "SMTP command timeout" and try to
sift out the obvious spam zombie PCs and look for real mail servers, then
try to see if there are more EdgeWave issues out there. Maybe it's
something?

- Scott



scott at qth

Feb 20, 2012, 10:00 PM

Post #8 of 22 (4193 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

FYI, I'm seeing a number of timeouts from a mail provider called "
redcondor.net" and sure enough, when I telnet to port 25:

220 smtp450.redcondor.net ESMTP EdgeWave mag4000e


LOL... http://redcondor.com/ -- owned "by EdgeWave".

I guess the question is... is it something EdgeWave is doing, or is it
Exim?

We sure don't get along, that is for sure.

- Scott



scott at qth

Feb 20, 2012, 11:05 PM

Post #9 of 22 (4205 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

I opened a ticket with EdgeWave, to see if they were aware of any
particular problems talking to Exim. Their response was this:

> Per RFC 2821, If your mail server is issuing a 250 response at the end of
> the smtp session then that means that the message was successfully
> delivered. (as far as the sender is concerned).
>
> I am not aware of any general issues with sending mail to exim mail
> servers.


I do agree they have a point... when Exim sends the 250 OK... shouldn't
Exim then deposit the message into the local mailbox or whatever needs to
happen at this point? Doesn't it seem wrong for Exim to send a 250 OK, but
not have actually accepted the message?

- Scott



hs at schlittermann

Feb 21, 2012, 12:42 AM

Post #10 of 22 (4212 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

Hi,

Scott Neader <scott [at] qth> (Di 21 Feb 2012 08:05:48 CET):
> I opened a ticket with EdgeWave, to see if they were aware of any
> particular problems talking to Exim. Their response was this:
>
> Per RFC 2821, If your mail server is issuing a 250 response at the end of
> > the smtp session then that means that the message was successfully
> > delivered. (as far as the sender is concerned).
> >
> > I am not aware of any general issues with sending mail to exim mail
> > servers.
>
>
> I do agree they have a point... when Exim sends the 250 OK... shouldn't
> Exim then deposit the message into the local mailbox or whatever needs to
> happen at this point? Doesn't it seem wrong for Exim to send a 250 OK, but
> not have actually accepted the message?

If I understand correctly, roughly the following happens:

Remote Server                    Your Server

      [ TCP SYN/SYN-ACK/ACK ]
                             <-- 220 …


DATA                         -->
                             <-- 354 Enter Message…
…                            -->
.                            -->
                             <-- 250 OK id=1Rzl72-00089K-6J

      [several minutes]
                             <-- 421 … SMTP command timeout

      [ TCP FIN/FIN-ACK/ACK ]


I just tried this with one of our servers, using netcat as client
and just didn't send the QUIT. It did exactly what I expected,
Exim delivers the mail as soon as the 250 OK was sent. Then, minutes
later there is a notice in the mainlog about the command timeout.

From my POV your problem is not related to this unhappy end of the
connection. (But I'd say it's impolite behaviour by the other side…)
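
The netcat experiment described above can be approximated with Python's standard smtplib, for anyone who wants to repeat it against a test server. This is a hedged sketch only: `send_without_quit` and the `client.invalid` HELO name are invented for illustration, and it should be pointed at a server and recipient you control. It sends one message, deliberately never sends QUIT, and reports whatever the server says next.

```python
import smtplib


def send_without_quit(host, port, sender, rcpt, wait_seconds):
    """Deliver one message, then stay silent and report the server's next move.

    Returns (code, text). code is the numeric reply (421 is expected from Exim
    once smtp_receive_timeout expires); it is None if nothing arrived within
    wait_seconds or the server dropped the connection without a reply.
    """
    msg = (
        f"From: {sender}\r\nTo: {rcpt}\r\n"
        "Subject: no-QUIT test\r\n\r\ntest body\r\n"
    )
    # local_hostname is a placeholder; it also avoids a DNS lookup for the HELO name.
    smtp = smtplib.SMTP(host, port, local_hostname="client.invalid", timeout=30)
    try:
        # Raises an smtplib exception if the message is not accepted.
        smtp.sendmail(sender, [rcpt], msg)
        # From here on we behave like the silent client: no QUIT, just wait.
        smtp.sock.settimeout(wait_seconds)
        try:
            code, text = smtp.getreply()
            return code, text.decode(errors="replace")
        except smtplib.SMTPServerDisconnected as exc:
            # smtplib reports both a read timeout and a bare close this way.
            return None, str(exc)
    finally:
        smtp.close()
```

With `wait_seconds` set a little above the server's `smtp_receive_timeout`, a 421 reply here alongside a normal delivery line in the mainlog would match the behaviour described above: the message is delivered, and the timeout is only about the missing QUIT.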

--
Heiko :: dresden : linux : SCHLITTERMANN.de
GPG Key 48D0359B : 3061 CFBF 2D88 F034 E8D2 7E92 EE4E AC98 48D0 359B
Attachments: signature.asc (0.19 KB)


graeme at graemef

Feb 21, 2012, 1:51 AM

Post #11 of 22 (4204 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

On Tue, 2012-02-21 at 09:42 +0100, Heiko Schlittermann wrote:
> From my POV your problem is not related to this unhappy end of the
> connection. (But I'd say it's impolite behaviour by the other side…)

We saw this very recently after a site firewall upgrade. Almost
identical behaviour except the final "\n.\n" was never received so the
message was deferred. We logged a 421 command timeout.

The Cisco device was on our premises, not the remote, and removing the
erroneous "inspect SMTP" made it all go away.

Graeme




dwmw2 at infradead

Feb 22, 2012, 2:42 AM

Post #12 of 22 (4180 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

On Fri, 2012-02-17 at 16:08 -0600, Scott Neader wrote:
> I was able to get the remote mail server admin to send me a packet capture
> in .pcap format (if anyone wants to see it, I'd be glad to share, nothing
> confidential in the cap).
>
> What I see is that our Exim server sends the "250 OK id=xxxxxxx" message
> just fine, and within a few ms, their server sends an ACK packet.

Hm, if the message they see is really 'OK id=xxxxxxx' with a real
Exim-like queue ID (why the hell do you feel the need to obfuscate a
local queue ID, anyway?) then it's unlikely to have been generated
anywhere but your server.

If you search your logs for that specific ID, what do you see?

If you do a capture on *your* end, does it match what they see at their
end?

--
dwmw2
Attachments: smime.p7s (5.68 KB)


scott at qth

Feb 22, 2012, 7:56 AM

Post #13 of 22 (4167 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

Hi David. First, I wasn't obfuscating the ID, I was just saying "we send
our 250, they ack". Didn't think the actual log message and ID would be
important enough to paste into the email...

Anyway... you are right... these messages ARE getting delivered. I was
looking at log messages based on IP, and only seeing the connection, data
and delivery messages, but I did not look at how Exim dealt with the
message... when looking by ID as you have suggested, it shows the messages
are being delivered into the local mailbox.
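
Looking a message up by ID rather than by IP can be sketched in a few lines of Python. This is an illustration under assumptions, not a replacement for Exim's own exigrep utility: the names `history` and `was_delivered` are hypothetical, and the pattern assumes the default timestamp format and the 6-6-2 character message IDs seen in the log snippet earlier in the thread.

```python
import re

# "YYYY-MM-DD HH:MM:SS <message-id> <rest of line>"
LOG_LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} "
    r"(?P<id>[0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}) "
    r"(?P<rest>.*)$"
)


def history(lines, msg_id):
    """Return the log events (text after the ID) recorded for one message."""
    events = []
    for line in lines:
        match = LOG_LINE_RE.match(line.rstrip("\n"))
        if match and match.group("id") == msg_id:
            events.append(match.group("rest"))
    return events


def was_delivered(lines, msg_id):
    """True if any delivery line ("=>" or "->") exists for the message."""
    return any(e.startswith(("=> ", "-> ")) for e in history(lines, msg_id))
```

Connection-level lines such as "SMTP command timeout on connection from ..." carry no message ID, which is why a search by IP shows the timeout but a search by ID shows the arrival ("<="), the delivery ("=>") and "Completed".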

The mystery still stands as to why I am seeing all these SMTP command
timeouts from just these "EdgeWave" mail servers. If the EdgeWave server
has received our "250 OK" message, and their packet capture shows they have
received it, and they have sent an ACK, then why don't they DISCONNECT?

I have started a ticket with EdgeWave, to see if they have any interest in
figuring this out.

Regarding a packet capture on my side, I have to admit, I have never done
it on command-line Linux before (done many on Windoze via
Ethereal/WireShark), so I will have to research that.

Thanks for the input -- much appreciated!!

- Scott



dwmw2 at infradead

Feb 22, 2012, 8:27 AM

Post #14 of 22 (4173 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

On Wed, 2012-02-22 at 09:56 -0600, Scott Neader wrote:
> The mystery still stands as to why I am seeing all these SMTP command
> timeouts from just these "EdgeWave" mail servers. If the EdgeWave server
> has received our "250 OK" message, and their packet capture shows they have
> received it, and they have sent an ACK, then why don't they DISCONNECT?

I don't know why they don't send a QUIT and then disconnect. Perhaps
they keep the connection open in case they want to use it to deliver
another mail in the future? Stranger things have happened...

--
dwmw2
Attachments: smime.p7s (5.68 KB)


scott at qth

Feb 22, 2012, 8:36 AM

Post #15 of 22 (4176 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

Are you willing to look at the cap file from their side, to see if they are
doing things right? I'd like to be able to tell them "look, RFC XXX says
after we send the 250 OK, you should send a QUIT, but your cap shows you are
not" (or whatever) -- but I'm just not knowledgeable enough.

- Scott



mike.kennedy at dillards

Feb 22, 2012, 8:45 AM

Post #16 of 22 (4161 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

I had a similar experience with a JVM that maintained an SMTP connection
pool: the connections were being held open until it went to use them and
found they had timed out. In my case, the people administering the JVM were
cooperative and set the pool to expire connections rather than keep them
around until they timed out on the server side. I don't know EdgeWave from
Adam, though, and my mail servers don't communicate directly with any
similar DLP product, so this is little better than a WAG in your case.
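
The client-side fix described above (expire pooled connections before the server times them out) might look roughly like this in Python. It is a sketch of the general idea only, not what that JVM pool or EdgeWave actually does: the class name `ExpiringSMTP`, the `connect` callable and the 60-second default are all assumptions for illustration.

```python
import smtplib
import time


class ExpiringSMTP:
    """Hold one SMTP connection, but QUIT it once it has been idle too long."""

    def __init__(self, connect, max_idle=60.0, clock=time.monotonic):
        self._connect = connect    # callable returning a connected smtplib.SMTP
        self._max_idle = max_idle  # keep well below the server's receive timeout
        self._clock = clock
        self._conn = None
        self._last_used = 0.0

    def expire_if_idle(self):
        """Call periodically; sends QUIT if the connection has sat unused."""
        if self._conn is None:
            return
        if self._clock() - self._last_used > self._max_idle:
            try:
                self._conn.quit()
            except smtplib.SMTPException:
                pass  # the server may already have dropped the connection
            self._conn = None

    def sendmail(self, sender, rcpts, msg):
        """Send over the cached connection, reconnecting if it was expired."""
        self.expire_if_idle()
        if self._conn is None:
            self._conn = self._connect()
        result = self._conn.sendmail(sender, rcpts, msg)
        self._last_used = self._clock()
        return result
```

The design point is simply that the client, not the server, ends the session: a polite QUIT after a short idle period means the receiving Exim never has to log a command timeout for that connection.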



pdp at exim

Feb 22, 2012, 11:51 AM

Post #17 of 22 (4167 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

On 2012-02-22 at 09:56 -0600, Scott Neader wrote:
> Regarding a packet capture on my side, I have to admit, I have never done
> it on command-line Linux before (done many on Windoze via
> Ethereal/WireShark), so I will have to research that.

tcpdump -w foo-$(date +%s).cap -s 1500 -i ethDEVICE port 25

ethDEVICE on Linux systems used to typically be eth0.

The format of the capture dump is portable and you can then open it in
WireShark on another OS. Hit Ctrl-C when you want to stop the dumping.
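
If no Wireshark is at hand, the long silence before the 421 can also be located from the capture file's timestamps alone. The sketch below is an illustration under assumptions: `packet_times` and `largest_gap` are made-up names, and it only understands the classic pcap layout with microsecond timestamps (as far as I know, what tcpdump -w writes by default), not pcapng.

```python
import struct


def packet_times(data):
    """Yield (timestamp_seconds, captured_length) for each pcap record."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian machine
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        yield ts_sec + ts_usec / 1_000_000, incl_len
        offset += 16 + incl_len  # 16-byte record header plus captured bytes


def largest_gap(data):
    """Return (gap_seconds, time_of_packet_before_gap), or None if < 2 packets."""
    times = [t for t, _ in packet_times(data)]
    gaps = [(later - earlier, earlier) for earlier, later in zip(times, times[1:])]
    return max(gaps) if gaps else None
```

Reading the file with `open(path, "rb").read()` and passing the bytes to `largest_gap` gives the longest pause in the capture. For a capture holding a single session like the one in this thread, a gap close to the configured smtp_receive_timeout right after the 250 would confirm that the remote side simply went quiet.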



scott at qth

Feb 22, 2012, 12:19 PM

Post #18 of 22 (4168 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

Very useful, Phil -- thanks very much !!!

- Scott



dwmw2 at infradead

Feb 22, 2012, 11:49 PM

Post #19 of 22 (4157 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

On Wed, 2012-02-22 at 10:36 -0600, Scott Neader wrote:
> Are you willing to look at the cap file from their side, to see if they are
> doing things right? I'd like to be able to tell them... look, RFC XXX says
> after we send the 250 OK, you should send a QUIT but your cap shows you are
> not..." (or whatever) -- but I'm just not knowledgeable enough.

By all means, send it my way. Note that the only "problem" this causes
is an extra line in your log and a small amount of memory used while
Exim is waiting to die, right?

--
dwmw2
Attachments: smime.p7s (5.68 KB)


scott at qth

Feb 23, 2012, 9:24 AM

Post #20 of 22 (4150 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

Thanks, David, I'll send it to you direct.

My concern on the timeouts is:

1) I have seen in the past that all of my Exim sockets can be consumed by
misbehaving mail servers (or spam zombies) and thus we defer mail. I'm
open to discussion on this, if I'm doing something wrong, or
misunderstanding.

2) The far-end customer (using EdgeWave) is reporting SOME fatal errors.
Most messages are getting through, but I only found the problem after
being contacted by their ISP, who asked why we aren't accepting some of
their mail.

3) We have rate limits set up for misbehaving mail servers, and these
timeouts are counted toward the rate limit. I will need to research to
find out how to stop counting timeouts toward rate limits, if I am to start
ignoring these timeouts as non-issues.

4) It seems most servers with this timeout problem are either EdgeWave mail
servers, or spam zombie home computers. I'm hesitant to ignore these
timeouts, but if the Exim community feels that I should, then I will.

Thanks!!

- Scott



tlyons at ivenue

Feb 23, 2012, 11:27 AM

Post #21 of 22 (4160 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

By any chance do you have a firewall (Cisco ASA, for example) on which
you block all or most ICMP?

A few years ago, I experienced issues with a few particular remote
sites and their erratic mail delivery to us. We had blocked most
ICMP types at the firewall for PCI compliance. We relaxed the rule
and blocked just a few specific ICMP types (the time query ones) and
all of a sudden those issues went away. It must have been breaking
path MTU discovery.
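
For anyone unfamiliar with why blocked ICMP can stall a mail session, here is a rough Python sketch of the arithmetic. It assumes plain IPv4 and TCP headers of 20 bytes each with no options, and a hypothetical narrow link with an MTU of 1400 somewhere on the path; none of these numbers come from the sites in this thread. Path MTU discovery relies on ICMP "fragmentation needed" messages (type 3, code 4) getting back to the sender, so if those are filtered the sender never learns that it has to shrink its segments.

```python
IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20  # bytes, TCP header without options (assumption)


def max_segment_payload(mtu):
    """Largest TCP payload that fits in one packet for a given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


def passes_link(payload_len, link_mtu):
    """True if a segment carrying payload_len bytes fits through the link unfragmented."""
    return payload_len + IP_HEADER + TCP_HEADER <= link_mtu


# The sender sits on ordinary Ethernet and sizes its segments for MTU 1500.
sender_mss = max_segment_payload(1500)

# Hypothetical narrower hop in the middle of the path (a tunnel, for example).
narrow_link_mtu = 1400

# A short SMTP reply is tiny, so it gets through regardless.
short_reply = len(b"250 OK\r\n")

print(passes_link(short_reply, narrow_link_mtu))  # short commands and replies fit
print(passes_link(sender_mss, narrow_link_mtu))   # full-size data segments do not
```

With the ICMP replies filtered, the oversized segments are silently dropped and retransmitted at the same size, which from the other end looks like a peer that goes quiet until a timeout fires.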

...Todd




--
SOPA: Any attempt to [use legal means to] reverse technological
advances is doomed. --Leo Leporte



scott at qth

Feb 23, 2012, 8:51 PM

Post #22 of 22 (4148 views)
Permalink
Re: SMTP command timeout on connection - how to troubleshoot [In reply to]

On Thu, Feb 23, 2012 at 1:27 PM, Todd Lyons <tlyons [at] ivenue> wrote:

> By any chance do you have a firewall (Cisco ASA, for example) on which
> you block all or most ICMP?
>

My Exim server does not. However, the far end EdgeWave server
(66.43.215.27) does have a Cisco 7201 in front of it, and the server is not
ping-able.

>
> A few years ago, I experienced issues with a few particular remote
> sites and their erratic mail delivery to us. We had blocked most
> ICMP types at the firewall for PCI compliance. We relaxed the rule
> and blocked just a few specific ICMP types (the time query ones) and
> all of a sudden those issues went away. It must have been breaking
> path MTU discovery.
>

Thanks for that... that is the second suggestion that it could be the
customer's firewall/router causing these problems. I am relaying to them.

- Scott


