
Mailing List Archive: SpamAssassin: users

SIGCHLD query

 

 



martin at gregorie

Oct 2, 2009, 4:42 PM

Post #1 of 9 (744 views)
SIGCHLD query

What causes a spamd 3.2.5 child process to be terminated by receiving a
SIGCHLD signal?

I've looked at the spamc and spamd manpages but there's no mention of
SIGCHLD there. I can't remember seeing it discussed on this mailing list
either.

My last month's logs show 7 of them and I can't work out what caused
them to be sent. However, Jose Luis Marin Perez' system is seeing a lot
of them - on the order of 10% of messages scanned are getting hit by
them, though his seem to be connected with very long running scans.

So, what do these signals mean and what should I do to my SA
configuration to get rid of them?


Martin


per at computer

Oct 6, 2009, 8:21 AM

Post #2 of 9 (683 views)
Re: SIGCHLD query [In reply to]

Martin Gregorie wrote:

> What causes a spamd 3.2.5 child process to be terminated by receiving
> a SIGCHLD signal?
>

A parent process receives a SIGCHLD when a child process terminates.
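
As a rough illustration of that point (not spamd's own code, which is Perl), here is a minimal Python sketch in which a parent installs a SIGCHLD handler, forks a child that exits at once, and reaps it from the handler. The names `run_demo` and `on_sigchld` are made up for this example, and it assumes a Unix-like system.

```python
import os
import signal
import time

def run_demo(timeout=5.0):
    """Fork one child and return (child_pid, pids reaped by the SIGCHLD handler)."""
    reaped = []

    def on_sigchld(signum, frame):
        # Collect every finished child; WNOHANG keeps the handler from blocking.
        while True:
            try:
                pid, _status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                return  # no children left at all
            if pid == 0:
                return  # children exist, but none has finished yet
            reaped.append(pid)

    signal.signal(signal.SIGCHLD, on_sigchld)
    child = os.fork()
    if child == 0:
        os._exit(0)  # the child just ends; the signal is delivered to the parent
    deadline = time.monotonic() + timeout
    while child not in reaped and time.monotonic() < deadline:
        time.sleep(0.05)
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)  # restore the default disposition
    return child, reaped
```

The child never receives SIGCHLD here; the signal is the parent's notification that the child has ended, which fits the wording of the "handled cleanup of child pid ... due to SIGCHLD" log line.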

> My last month's logs show 7 of them and I can't work out what caused
> them to be sent. However, Jose Luis Marin Perez' system is seeing a
> lot of them - on the order of 10% of messages scanned are getting hit
> by them, though his seem to be connected with very long running scans.

A timeout in the child perhaps?


/Per Jessen, Zürich


martin at gregorie

Oct 6, 2009, 8:26 AM

Post #3 of 9 (681 views)
Re: SIGCHLD query [In reply to]

On Tue, 2009-10-06 at 16:46 +0200, Per Jessen wrote:
> Martin Gregorie wrote:
>
> > What causes a spamd 3.2.5 child process to be terminated by receiving
> > a SIGCHLD signal?
> >
>
> A timeout in the child perhaps?
>
I thought that might be the reason. It certainly seems to apply when a
child runs longer than the time set by --timeout-child but there are a
few cases where a SIGCHLD is sent when the child has only run for a
second or two. It's a pity the log message doesn't include the reason why
the SIGCHLD was sent.


Martin


per at computer

Oct 6, 2009, 2:21 PM

Post #4 of 9 (668 views)
Re: SIGCHLD query [In reply to]

Martin Gregorie wrote:

> On Tue, 2009-10-06 at 16:46 +0200, Per Jessen wrote:
>> Martin Gregorie wrote:
>>
>> > What causes a spamd 3.2.5 child process to be terminated by
>> > receiving a SIGCHLD signal?
>> >
>>
>> A timeout in the child perhaps?
>>
> I thought that might be the reason. It certainly seems to apply when
> a child runs longer than the time set by --timeout-child but there are
> a few cases where a SIGCHLD is sent when the child has only run for a
> second or two. It's a pity the log message doesn't include the reason
> why the SIGCHLD was sent.

Martin, generally speaking, the parent can only report the signal and
that the child has gone away. The child would have to report on why.
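
To make that limit concrete, the sketch below shows what a parent can read from the wait status of a finished child: an exit code, or the number of the signal that killed it, and nothing about the reason behind either. This is a generic Python illustration on a Unix-like system, not spamd code; `describe_status` and `run_child` are hypothetical helper names.

```python
import os
import signal

def describe_status(status):
    """Summarise what a parent can learn from a raw wait status."""
    if os.WIFEXITED(status):
        return f"exited with code {os.WEXITSTATUS(status)}"
    if os.WIFSIGNALED(status):
        return f"killed by signal {signal.Signals(os.WTERMSIG(status)).name}"
    return "stopped or continued"

def run_child(action):
    """Fork a child that runs action(), then return the parent's view of its end."""
    pid = os.fork()
    if pid == 0:
        action()
        os._exit(0)  # reached only if action() returns normally
    _, status = os.waitpid(pid, 0)
    return describe_status(status)

def exit_with_three():
    os._exit(3)

def kill_self():
    os.kill(os.getpid(), signal.SIGKILL)
```

Calling `run_child(exit_with_three)` and `run_child(kill_self)` gives the parent two different summaries, but in neither case does it learn why the child ended that way. That detail would have to be logged by the child itself before it exits.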


/Per Jessen, Zürich


per at computer

Oct 6, 2009, 11:54 PM

Post #5 of 9 (675 views)
Re: SIGCHLD query [In reply to]

Martin Gregorie wrote:
> On Tue, 2009-10-06 at 23:16 +0200, Per Jessen wrote:
>> Martin, generally speaking, the parent can only report the signal and
>> that the child has gone away. The child would have to report on why.
>>
> OK, rephrase that to "a pity the child doesn't say why it's generating a
> SIGCHLD signal".
>

Yeah - maybe there is some indication in the log? I think there is a
switch that determines how many emails a child will process before
needing restart. (just looked it up: --max-conn-per-child)
I just checked my logs, during the last 9 hours I have 6016 of these:

spamd[11362]: spamd: handled cleanup of child pid 14010 due to SIGCHLD

Is that the one you mean?

There are also arguments for controlling minimum/maximum number of spare
child processes - if your load varies, and you have a significant
difference between min and max, I could see that leading to more child
processes stopping and starting.


/Per


martin at gregorie

Oct 7, 2009, 4:03 AM

Post #6 of 9 (679 views)
Re: SIGCHLD query [In reply to]

> Yeah - maybe there is some indication in the log? I think there is a
> switch that determines how many emails a child will process before
> needing restart. (just looked it up: --max-conn-per-child)
> I just checked my logs, during the last 9 hours I have 6016 of these:
>
> spamd[11362]: spamd: handled cleanup of child pid 14010 due to SIGCHLD
>
> Is that the one you mean?
>
That's the only log message I've seen. Sometimes you can associate it
with a scan that exceeded --timeout-child seconds and sometimes, much
more rarely, it happens after a scan taking two or three seconds. Tuning
would be easier if there was some indication about why a scan had
terminated - maybe it could be added to the statistics list in the
'results' log line.

> There are also arguments for controlling minimum/maximum number of spare
> child processes - if your load varies, and you have a significant
> difference between min and max, I could see that leading to more child
> processes stopping and starting.
>
Does the parent or the child determine whether the child stays alive
after completing a scan or whether it should terminate?


Martin


per at computer

Oct 7, 2009, 4:41 AM

Post #7 of 9 (666 views)
Re: SIGCHLD query [In reply to]

Martin Gregorie wrote:
>> Yeah - maybe there is some indication in the log? I think there is a
>> switch that determines how many emails a child will process before
>> needing restart. (just looked it up: --max-conn-per-child)
>> I just checked my logs, during the last 9 hours I have 6016 of these:
>>
>> spamd[11362]: spamd: handled cleanup of child pid 14010 due to SIGCHLD
>>
>> Is that the one you mean?
>>
> That's the only log message I've seen. Sometimes you can associate it
> with a scan that exceeded --timeout-child seconds and sometimes, much
> more rarely, it happens after a scan taking two or three seconds.

I don't know if that is happening on my systems too, I haven't checked.
I wonder if the latter could be caused by the maintenance of spare
child processes?

>> There are also arguments for controlling minimum/maximum number of spare
>> child processes - if your load varies, and you have a significant
>> difference between min and max, I could see that leading to more child
>> processes stopping and starting.
>>
> Does the parent or the child determine whether the child stays alive
> after completing a scan or whether it should terminate?

It's the child that determines that "Uh, I've done X scans, all done".
It's just a for-loop:

for( i=0; i<maxscansperchild; i++ ) {
    wait for work
    do work
}

If it's about pruning idle child processes, the parent is no doubt doing it.
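
The loop above can be sketched as a small runnable Python model. It assumes a Unix-like system and uses a pipe with one byte per "job" to stand in for incoming connections; `MAX_CONN_PER_CHILD`, `child_loop` and `run_pool` are names invented for this example and say nothing about how spamd is actually written.

```python
import os

MAX_CONN_PER_CHILD = 3  # hypothetical value, echoing the idea of --max-conn-per-child

def child_loop(read_fd, max_conn):
    """Handle up to max_conn one-byte jobs from the pipe, then exit voluntarily."""
    handled = 0
    while handled < max_conn:
        job = os.read(read_fd, 1)  # wait for work
        if not job:
            break  # the parent closed the pipe, so no more work is coming
        handled += 1  # do work (a real child would scan a message here)
    os._exit(handled)  # report the job count through the exit code

def run_pool(total_jobs, max_conn=MAX_CONN_PER_CHILD):
    """Feed total_jobs to successive children; return the job count of each child."""
    per_child = []
    remaining = total_jobs
    while remaining > 0:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            os.close(write_fd)
            child_loop(read_fd, max_conn)  # never returns
        os.close(read_fd)
        batch = min(remaining, max_conn)
        os.write(write_fd, b"x" * batch)
        os.close(write_fd)
        _, status = os.waitpid(pid, 0)  # the child's exit is what raises SIGCHLD
        per_child.append(os.WEXITSTATUS(status))
        remaining -= batch
    return per_child
```

In this model every child that reaches its limit exits on its own, so a busy server would see a steady stream of child exits that have nothing to do with timeouts.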


/Per


per at computer

Oct 7, 2009, 6:21 AM

Post #8 of 9 (666 views)
Re: SIGCHLD query [In reply to]

Per Jessen wrote:

> Martin Gregorie wrote:
>>> Yeah - maybe there is some indication in the log? I think there is
>>> a switch that determines how many emails a child will process before
>>> needing restart. (just looked it up: --max-conn-per-child)
>>> I just checked my logs, during the last 9 hours I have 6016 of
>>> these:
>>>
>>> spamd[11362]: spamd: handled cleanup of child pid 14010 due to
>>> SIGCHLD
>>>
>>> Is that the one you mean?
>>>
>> That's the only log message I've seen. Sometimes you can associate it
>> with a scan that exceeded --timeout-child seconds and sometimes, much
>> more rarely, it happens after a scan taking two or three seconds.
>
> I don't know if that is happening on my systems too, I haven't
> checked.

Okay, I ran a check on my logs since midnight - yes, I also see a lot of
child processes running for less than 10secs, in fact slightly more
than 50%. Interesting issue.


/Per Jessen, Zürich


martin at gregorie

Oct 7, 2009, 7:13 AM

Post #9 of 9 (672 views)
Re: SIGCHLD query [In reply to]

On Wed, 2009-10-07 at 14:31 +0200, Per Jessen wrote:
> Okay, I ran a check on my logs since midnight - yes, I also see a lot of
> child processes running for less than 10secs, in fact slightly more
> than 50%. Interesting issue.
>
Here are the results of a scan across all my mail logs:

Processing file /var/log/maillog*
  3544 Messages found
  3538 Results (99.8%)
     6 SIGCHLDs caught (0.2%)

                       min     avg      max
Message size:          353    7340   496682
Scan time (secs):      0.5     2.3     34.5

I've checked all the SIGCHLD log lines. The previous scans by those
children were all in the range 1.- to 3.1 seconds. I'm using the default
child population and the default --timeout-child of 300 secs.
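
A tally like the one above could be produced along these lines. This is only a sketch, not the script that generated those figures. The cleanup pattern is taken from the log line quoted earlier in the thread. The `spamd: result:` marker for completed scans, and the choice to count messages as results plus SIGCHLD cleanups, are assumptions. `percent` and `summarise` are hypothetical names.

```python
import re

# Pattern taken from the cleanup log line quoted in this thread.
CLEANUP_RE = re.compile(r"handled cleanup of child pid (\d+) due to SIGCHLD")
# Assumption: completed scans are logged on lines containing this marker.
RESULT_MARKER = "spamd: result:"

def percent(count, total):
    """Share of total as a percentage, rounded to one decimal place."""
    return round(100.0 * count / total, 1) if total else 0.0

def summarise(lines):
    """Count result lines and SIGCHLD cleanups in an iterable of log lines."""
    results = 0
    sigchld_pids = []
    for line in lines:
        match = CLEANUP_RE.search(line)
        if match:
            sigchld_pids.append(int(match.group(1)))
        elif RESULT_MARKER in line:
            results += 1
    total = results + len(sigchld_pids)  # assumption: one of the two per message
    return {
        "messages": total,
        "results": results,
        "results_pct": percent(results, total),
        "sigchld": len(sigchld_pids),
        "sigchld_pct": percent(len(sigchld_pids), total),
        "sigchld_pids": sigchld_pids,
    }
```

With the counts reported above, `percent(3538, 3544)` and `percent(6, 3544)` round to the same 99.8% and 0.2%.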


Martin
