
akirk at sourcefire
May 19, 2009, 10:26 AM
Post #7 of 7
Re: Intermittent tcpserver failure / stuck queue?
On Tue, May 19, 2009 at 11:21 AM, Markus Stumpf <lists-qmail [at] maexotic> wrote:

> On Tue, May 19, 2009 at 10:04:35AM -0400, Alex Kirk wrote:
> > It works in that I get a nice "220 www.schnarff.com ESMTP" like I'd expect
> > and I had to go kill -9 it from a different shell. It did generate
> > /tmp/qmail.log - which I can send if desired, for now it seems like it'd
> > just clog things up - but a look through it has nothing that jumps out at me
> > as a problem.
>
> I should have mentioned that qmail-smtpd is then in an SMTP dialog,
> reading from STDIN. You can type
> HELO example.com
> MAIL FROM: <joe [at] example>
> RCPT TO: <somelocaluser>
> DATA
> blabla
> .
> quit
> If the dialogue hangs somewhere it would be interesting to see the
> last couple of lines of the truss log.
> Look at the log backwards from the end. If you are using some spam/virus
> filters the exec()s should show up, which would help to see which
> program causes the problem (see also below).

Actually, the last thing in the log is just it waiting to read:

28969: select(2,0x0,{1},0x0,{1200.000000}) = 1 (0x1)
28969: write(1,"220 www.schnarff.com ESMTP\r\n",28) = 28 (0x1c)
28969: select(1,{0},0x0,0x0,{1200.000000})

I mean, maybe I'm doing it wrong, but when I tried typing in "ehlo schnarff.com" and hitting Enter, it didn't do a damn thing. It's like it didn't get the input at all. I'm guessing this isn't the underlying Qmail issue, so much as I've fouled something up in the test here.

> > Looking at the truss man page, I decided to attach to the PID of one of my
> > bin/qmail-queue processes, and it did...nothing. It just hung out, like the
> > zombie process it appears to be.
>
> If the process already hangs in a syscall and does nothing truss cannot
> report anything.
> > @400000004a12b7cc1126e074 tcpserver: status: 20/20
> > @400000004a12b7cc1128a97c tcpserver: pid 20194 from 216.146.33.13
> > @400000004a12b7cc1325f07c tcpserver: ok 20194
> > www.schnarff.com:65.102.233.117:25
> > mxout-013-bos.mailhop.org:216.146.33.13::63308
> > @400000004a12b7cc21626cc4 CHKUSER accepted sender: from
> > <Chandra [at] flyingwebsites::> remote <mhfr-03-ewr.dyndns.com:
> > mxout-013-bos.mailhop.org:216.146.33.13> rcpt <> : sender accepted
> > @400000004a12b7cc21a18cec CHKUSER accepted rcpt: from
> > <Chandra [at] flyingwebsites::> remote <mhfr-03-ewr.dyndns.com:
> > mxout-013-bos.mailhop.org:216.146.33.13> rcpt <alex [at] schnarff> : found
> > existing recipient
>
> Ok, looks like you are using a patched version of netqmail with CHKUSER
> patch. Any other patches?

If memory serves, I think I went with the qmail-toaster-0.8.3.patch, which provides smtp-auth, tls, oversize DNS, qregex, netqmail-maildir++, chkuser, and SPF.

> Are you using simscan/amavis/spamassassin or other spam/virus checking?

Spamassassin, called from .qmail files. Haven't touched it since this behavior began, and "ps aux" shows no evidence it's causing issues - not a ton of processes, the ones that are there aren't hogging the CPU, etc.

> I don't know about the CHKUSER patch, but that *seems* to work.
> If the messages don't go to the queue and you have spam/virus checking
> this might be the source of the problem.
>
> > Do you think there's something blocking actual mail delivery, and that's
> > what's causing the problem?
>
> I think the problem is already at the queueing, otherwise the smtpds
> would finish.

Time for me to look more closely at the queue, then, I guess. I'll see what I can find and report back if I can solve the problem there, so that it's in the archives...

Alex Kirk
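
For anyone reading the tcpserver/CHKUSER log lines quoted above: the leading "@4000..." fields are TAI64N labels, which are hard to line up against wall-clock times by eye. Below is a minimal sketch of decoding one in Python. The function name decode_tai64n is made up for illustration, and the fixed 10-second TAI/UTC offset is an assumption (it ignores any leap-second table), so treat the result as approximate to within a few tens of seconds.

```python
from datetime import datetime, timezone

TAI64_BASE = 2 ** 62          # TAI64 labels for times from 1970 onward start at 2^62
ASSUMED_TAI_UTC_OFFSET = 10   # assumption: fixed 10 s offset, no leap-second table


def decode_tai64n(label):
    """Convert '@' + 24 hex digits (TAI64N) to (UTC datetime, nanoseconds).

    Hypothetical helper for illustration; not part of qmail or daemontools.
    """
    hex_digits = label.lstrip("@")
    if len(hex_digits) != 24:
        raise ValueError("expected 24 hex digits after '@'")
    # First 16 hex digits: seconds label; last 8 hex digits: nanoseconds.
    seconds = int(hex_digits[:16], 16) - TAI64_BASE - ASSUMED_TAI_UTC_OFFSET
    nanos = int(hex_digits[16:], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc), nanos


if __name__ == "__main__":
    # First timestamp from the quoted log ("tcpserver: status: 20/20").
    when, nanos = decode_tai64n("@400000004a12b7cc1126e074")
    print(when.isoformat(), nanos)
```

Only the first whitespace-separated field of each log line should be passed in; the rest of the line is left untouched.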