Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Apache: Dev

[mod_fcgid patch] reap children without the zombie scan

 

 

Apache dev RSS feed   Index | Next | Previous | View Threaded


trawick at gmail

Oct 1, 2009, 5:01 PM

Post #1 of 3 (364 views)
Permalink
[mod_fcgid patch] reap children without the zombie scan

(just fixing subject)

On Thu, Oct 1, 2009 at 7:55 PM, Ricardo Cantu <ricardo [at] smartcsc> wrote:

> On Tuesday 29 September 2009 4:20:49 pm you wrote:
> > On Tue, Sep 29, 2009 at 4:59 PM, Ricardo Cantu <ricardo [at] smartcsc>
> wrote:
> > > On Tuesday 29 September 2009 2:31:21 pm Jeff Trawick wrote:
> > > > ZombieScanInterval (leave alone until processes can be reaped
> > >
> > > differently)
> > > Working on a patch for this one. Don't want to duplicate work, so let
> me
> > > know
> > > if anybody else is working on this.
> >
> > not me
> >
> > I hope that, for Unix, processes can be reaped as with the MPMs: instead
> of
> > asking if a specific pid has exited (for each pid in the list), ask if
> any
> > pid has exited and if so find it in the list and handle.
> >
>
> Well, here it is. My patch to reap the children when they exit rather than
> check the list for zombies. Before I take out the old logic for the zombie
> scan I would like to hear some input on the code.
>
> basically,
>
> apr_proc_other_child_register() - to register a callback when child exits.
>
> sigaction(SIGCHLD, &sa, NULL) - to listen for children dying.
>
> apr_proc_other_child_refresh_all(APR_OC_REASON_RESTART) - called when
> SIGCHLD
> received so callback will be called on the correct registered child.
>
> fcgid_child_maint - The callback. Cleans up the various lists and prints
> out
> log info.
>
> And that's it.
>
>
Attachments: patch.diff (17.5 KB)


trawick at gmail

Oct 3, 2009, 11:44 AM

Post #2 of 3 (314 views)
Permalink
Re: [mod_fcgid patch] reap children without the zombie scan [In reply to]

On Thu, Oct 1, 2009 at 8:01 PM, Jeff Trawick <trawick [at] gmail> wrote:

> (just fixing subject)
>
> On Thu, Oct 1, 2009 at 7:55 PM, Ricardo Cantu <ricardo [at] smartcsc>wrote:
>
>> On Tuesday 29 September 2009 4:20:49 pm you wrote:
>> > On Tue, Sep 29, 2009 at 4:59 PM, Ricardo Cantu <ricardo [at] smartcsc>
>> wrote:
>> > > On Tuesday 29 September 2009 2:31:21 pm Jeff Trawick wrote:
>> > > > ZombieScanInterval (leave alone until processes can be reaped
>> > >
>> > > differently)
>> > > Working on a patch for this one. Don't want to duplicate work, so let
>> me
>> > > know
>> > > if anybody else is working on this.
>> >
>> > not me
>> >
>> > I hope that, for Unix, processes can be reaped as with the MPMs: instead
>> of
>> > asking if a specific pid has exited (for each pid in the list), ask if
>> any
>> > pid has exited and if so find it in the list and handle.
>> >
>>
>> Well, here it is. My patch to reap the children when they exit rather than
>> check the list for zombies. Before I take out the old logic for the zombie
>> scan I would like to hear some input on the code.
>>
>> basically,
>>
>> apr_proc_other_child_register() - to register a callback when child
>> exits.
>>
>> sigaction(SIGCHLD, &sa, NULL) - to listen for children dying.
>>
>> apr_proc_other_child_refresh_all(APR_OC_REASON_RESTART) - called when
>> SIGCHLD
>> received so callback will be called on the correct registered child.
>>
>> fcgid_child_maint - The callback. Cleans up the various lists and prints
>> out
>> log info.
>>
>> And that's it.
>>
>>
>
Is this right?

before:

when process mgr has awakened (every second) and ZombieScanInterval has
elapsed:

lock table
foreach table entry (process spawned by process mgr) {
call waitpid on pid and see if it has exited
if exited, move to free list
}
unlock table

and this was done whether or not a child exited

now:

when awakened by SIGCHILD, in signal handler:

foreach registered other child (process spawned by process mgr) {
call waitpid on pid and see if it has exited
if exited {
lock table
search table for child and move to free list
unlock table
}
}

If so: I don't see that using the other child API has bought us much. At
least we don't lock the table unless a process has actually exited.

If the process mgr calls waitpid with a wildcard to see if any pid has
exited, then at least a list doesn't have to be searched unless there is
work to do. The prefork MPM does that, then calls
apr_proc_other_child_alert() to see if it is for an other-child. (The alert
function then runs the list of other children to find the pid.)

SIGCHLD can be used to unblock from waiting for (command, 1 second) sooner*,
but calling interesting code from the SIGCHLD handler should be avoided if
at all possible. (Windows wouldn't have that flow of control either.)

*I dunno if the mpm_common function currently used will return on EINTR

Thoughts?

(BTW, it is easier on reviewers if one patch tries to solve exactly one
problem.)


ricardo at smartcsc

Oct 4, 2009, 4:55 PM

Post #3 of 3 (308 views)
Permalink
Re: [mod_fcgid patch] reap children without the zombie scan [In reply to]

On Saturday 03 October 2009 12:44:02 pm Jeff Trawick wrote:
> On Thu, Oct 1, 2009 at 8:01 PM, Jeff Trawick <trawick [at] gmail> wrote:
> > (just fixing subject)
> >
> > On Thu, Oct 1, 2009 at 7:55 PM, Ricardo Cantu <ricardo [at] smartcsc>wrote:
> >> On Tuesday 29 September 2009 4:20:49 pm you wrote:
> >> > On Tue, Sep 29, 2009 at 4:59 PM, Ricardo Cantu <ricardo [at] smartcsc>
> >>
> >> wrote:
> >> > > On Tuesday 29 September 2009 2:31:21 pm Jeff Trawick wrote:
> >> > > > ZombieScanInterval (leave alone until processes can be reaped
> >> > >
> >> > > differently)
> >> > > Working on a patch for this one. Don't want to duplicate work, so
> >> > > let
> >>
> >> me
> >>
> >> > > know
> >> > > if anybody else is working on this.
> >> >
> >> > not me
> >> >
> >> > I hope that, for Unix, processes can be reaped as with the MPMs:
> >> > instead
> >>
> >> of
> >>
> >> > asking if a specific pid has exited (for each pid in the list), ask if
> >>
> >> any
> >>
> >> > pid has exited and if so find it in the list and handle.
> >>
> >> Well, here it is. My patch to reap the children when they exit rather
> >> than check the list for zombies. Before I take out the old logic for the
> >> zombie scan I would like to hear some input on the code.
> >>
> >> basically,
> >>
> >> apr_proc_other_child_register() - to register a callback when child
> >> exits.
> >>
> >> sigaction(SIGCHLD, &sa, NULL) - to listen for children dying.
> >>
> >> apr_proc_other_child_refresh_all(APR_OC_REASON_RESTART) - called when
> >> SIGCHLD
> >> received so callback will be called on the correct registered child.
> >>
> >> fcgid_child_maint - The callback. Cleans up the various lists and prints
> >> out
> >> log info.
> >>
> >> And that's it.
>
> Is this right?
>
> before:
>
> when process mgr has awakened (every second) and ZombieScanInterval has
> elapsed:
>
> lock table
> foreach table entry (process spawned by process mgr) {
> call waitpid on pid and see if it has exited
> if exited, move to free list
> }
> unlock table
>
> and this was done whether or not a child exited
>
> now:
>
> when awakened by SIGCHILD, in signal handler:
If one process exits every minute and the proc array is 40 processes long it
would go from ~60 x 40 scans to 1 x40. Seems like an improvement.

>
> foreach registered other child (process spawned by process mgr) {
> call waitpid on pid and see if it has exited
> if exited {
I guess I could do a wait_all_procs here instead.

> lock table

> search table for child and move to free list
This can never be eliminated due to the proc's being in a simple circular
list. We could change it to a double linked and not have to search.

> unlock table
> }
> }
>
> If so: I don't see that using the other child API has bought us much. At
> least we don't lock the table unless a process has actually exited.
>
> If the process mgr calls waitpid with a wildcard to see if any pid has
> exited, then at least a list doesn't have to be searched unless there is
> work to do.
true with this patch as well.

> The prefork MPM does that, then calls
> apr_proc_other_child_alert() to see if it is for an other-child. (The
> alert function then runs the list of other children to find the pid.)
>
> SIGCHLD can be used to unblock from waiting for (command, 1 second)
> sooner*, but calling interesting code from the SIGCHLD handler should be
> avoided if at all possible. (Windows wouldn't have that flow of control
> either.)
concede on that point.

>
> *I dunno if the mpm_common function currently used will return on EINTR
it does.

>
> Thoughts?
Okay, instead of using SIGCHLD to unblock the pm, do you propose making it a
nowait type read and do the wait_all_procs every cycle or perhaps a reaping
thread to handle the children and leave the pm alone? From what I can tell
prefork.c does not block (ap_wait_or_timeout uses ap_wait_all_procs(ret,
status, APR_NOWAIT, p); which does not block) unlike pm_main that blocks at
procmgr_peek_cmd waiting for a request.

recap of directions:
1. make pm_main non-blocking and do wait_all_procs/apr_proc_other_child_alert
per cycle.

or

2. Create a thread to do wait_all_procs/apr_proc_other_child_alert and leave
pm_main alone.

or

3. Change my patch to use wait_all_procs/apr_proc_other_child_alert on
SIGCHLD.

or

4. unblock pm_main some how on SIGCHLD and make wait_all_procs part of a
normal cycle of a request.

>
> (BTW, it is easier on reviewers if one patch tries to solve exactly one
> problem.)
got it

Apache dev RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.