
Mailing List Archive: exim: dev

Exim segv


jonathan at cs

Apr 20, 2005, 3:54 AM

Exim segv

We're having trouble with a mail message that is causing exim to SEGV.
We're using Exim-4.43 with exiscan and the perl module. We do get a log
entry for the incoming email giving the sender, and the user claims that the
mail is delivered (she gets lots of copies!). We're running clamav
anti-virus but there's no hint in either the exim logs or clamav logs of a
problem. The remote site just sees the connection vanish (it's an identical
exim binary).

Does anyone recognise this sequence of system calls and can take a guess at
where the problem might be? I don't mind spending some time debugging this
but I could do with a clue as to where to begin. It looks like it packs up
right after logging the sender.


32540 _llseek(4, 0, [0], SEEK_CUR) = 0
32540 time(NULL) = 1113991397
32540 write(4, "2005-04-20 11:03:17 Received fro"..., 204) = 204
32540 close(4) = 0
32540 munmap(0xb7f52000, 4096) = 0
32540 close(0) = 0
32540 munmap(0xb7f53000, 4096) = 0
32540 rt_sigaction(SIGTERM, {SIG_DFL}, {SIG_IGN}, 8) = 0
32540 rt_sigaction(SIGINT, {SIG_DFL}, {SIG_IGN}, 8) = 0
32540 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
2241 <... select resumed> ) = ? ERESTARTNOHAND (To be restarted)
2241 --- SIGCHLD (Child exited) @ 0 (0) ---
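
As a cross-check that the trace and the log line belong to the same event, the value returned by time(NULL) can be converted to wall-clock time and compared with the timestamp in the write() call. This is only a sketch: it assumes the server was on UK time, which was BST (UTC+1) in April 2005.

```python
from datetime import datetime, timedelta, timezone

# Value returned by time(NULL) in the strace output above.
EPOCH_FROM_STRACE = 1113991397

# Assumption: the host logged in UK local time, i.e. BST (UTC+1) on this date.
BST = timezone(timedelta(hours=1), "BST")

utc_time = datetime.fromtimestamp(EPOCH_FROM_STRACE, tz=timezone.utc)
local_time = utc_time.astimezone(BST)

print(utc_time.strftime("%Y-%m-%d %H:%M:%S"))    # 2005-04-20 10:03:17
print(local_time.strftime("%Y-%m-%d %H:%M:%S"))  # 2005-04-20 11:03:17
```

Under that assumption the local time matches the "2005-04-20 11:03:17" prefix of the log write, so the SEGV follows directly after the "Received from" line was logged.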


--
______ jonathan [at] cs Jonathan Knight,
/ Department of Computer Science
/ _ __ Telephone: +44 1782 583437 University of Keele, Keele,
(_/ (_) / / Fax : +44 1782 713082 Staffordshire. ST5 5BG. U.K.


m.hubbard at imperial

Jun 28, 2005, 3:38 AM

RE: Exim segv [In reply to]

I'm also seeing this problem, and I'm not having much success finding
the cause.

It appears to be message-content specific, as the same message from the
same server causes this behaviour consistently. But I'm unable to
reproduce it with identical message bodies under test conditions.

I've got a back trace with gdb:

#0 0x002c505d in _int_free () from /lib/tls/libc.so.6
#1 0x002c4018 in free () from /lib/tls/libc.so.6
#2 0x080a10bc in store_reset_3 (ptr=0x9467f68, filename=0x80ca621 "daemon.c", linenumber=526) at store.c:373
#3 0x08050364 in handle_smtp_call (listen_sockets=0x945c8e8, listen_socket_count=1, accept_socket=1, accepted=0x388180) at daemon.c:526
#4 0x0805170d in daemon_go () at daemon.c:1709
#5 0x08061ffc in main (argc=5, cargv=0xbfffbb94) at exim.c:3871



Getting gdb attached to a child receiver process on a live server is a
pain, but I came up with the following method:

Under the data acl:
warn condition = ${if eq {$h_message-id:}{\N<036e01c57bad$f76f91d0$762937be [at] VHKLA>\N}{1}{0}}
     condition = ${if ! exists{/usr/local/exim/newbugged}{1}{0}}
     set acl_m9 = ${run {/bin/touch /usr/local/exim/newbugged}{1}{1}}
     set acl_m9 = ${run {/bin/cp -r /usr/local/exim/spool/scan/$message_id /usr/local/exim/tmp/}{1}{1}}
     set acl_m9 = ${run {/usr/bin/screen -dmS eximgdb /usr/bin/sudo /usr/bin/gdb /usr/local/exim/bin/exim $pid}{1}{1}}


In visudo:
User_Alias EXIMUSERS = exim
Cmnd_Alias EXIMGDB = /usr/bin/gdb /usr/local/exim/bin/exim [0-9]*
EXIMUSERS ALL=(root) NOPASSWD: EXIMGDB
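
One caveat on the [0-9]* argument pattern: sudoers wildcards are shell-style globs, not regular expressions, so the pattern means "a digit followed by anything" rather than "digits only". The sketch below uses Python's fnmatch as a stand-in for that matching. It is an approximation, not sudo's own code, and the argument strings are made up for illustration.

```python
from fnmatch import fnmatchcase

# Glob used for the final gdb argument in the Cmnd_Alias above.
PATTERN = "[0-9]*"

# Hypothetical argument strings; only the first is the intended use.
candidates = [
    "32540",                   # a plain pid
    "32540 -x /tmp/commands",  # a pid followed by extra gdb options
    "abc",                     # not a pid at all
]

# Python's fnmatch approximates glob matching; sudo's matcher may differ in detail.
results = {arg: fnmatchcase(arg, PATTERN) for arg in candidates}

for arg, matched in results.items():
    print(f"{arg!r}: {matched}")
```

If sudo matches the argument list the same way, the rule allows more than a bare pid, so it is probably best kept as a temporary debugging aid and removed afterwards.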

The test for the file, followed by the touch, makes sure the rule fires
only once, for the message whose Message-ID header matches. It should
leave a screen session with gdb ready to go under the exim user. There's
a race to get gdb attached before the child exits, but it succeeds most
of the time.
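
The once-only guard in the ACL is a check followed by a separate touch, so two receiving processes handling the same message at the same moment could both pass the check. A minimal sketch of the same idea with an atomic create is shown here. The function name fire_once is made up for illustration and is not part of Exim.

```python
import os
import tempfile


def fire_once(marker_path):
    """Return True only for the first caller that manages to create marker_path."""
    try:
        # O_CREAT | O_EXCL makes "check and create" a single atomic step.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "newbugged")
        print(fire_once(marker))  # True: marker created, go ahead and attach gdb
        print(fire_once(marker))  # False: marker already exists, do nothing
```

For a one-off debugging hook the simpler exists-then-touch test in the ACL is likely good enough. The atomic variant only matters if several copies of the trigger message can arrive in parallel.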

To open the screen session, the current user (exim) needs write
permission to your tty. If you logged in as yourself, you own your tty,
so the simplest fix is to chmod a+rw it.


The trace above is from exim-4.44 with exiscan. The problem has
persisted under exim-4.51, but I've not got a bt with debug info yet.


A packet capture of one of these transmissions shows the receiving
server sending a FIN after the EOM. No message acknowledgement is given.
The message is safely in the spool, though. Because the receiving process
SEGVd, the first delivery is carried out by the next queue runner.


I'd be grateful for any help or suggestions in getting to the bottom of
this. It seems to be occurring more frequently of late.

Cheers,
Matt.




jonathan at cs

Jun 28, 2005, 6:08 AM

Re: Exim segv [In reply to]

On Tue, Jun 28, 2005 at 11:38:26AM +0100, Hubbard, Matt R W wrote:
> I'm also seeing this problem, and I'm not having much success finding
> the cause.

Me too.

> It appears to be message-content specific, as the same message from the
> same server causes this behaviour consistently. But I'm unable to
> reproduce it with identical message bodies under test conditions.

Me too.

I removed the malware call in an ACL and the message shot through with no
problem. Having lost the message from the queue I couldn't re-create the
problem again. Very irritating.

I saw it in exim-4.43 with exiscan and in 4.50. 2 million messages later
I've not seen it happen again, so it must be something very specific and
slightly out of the ordinary that causes the problem. But it's not the
content: I still have that, and I can chuck it through the mail system
'til I'm blue in the face without triggering the SEGV again.

