
ricardo at smartcsc
Oct 4, 2009, 4:55 PM
Post #3 of 3
(308 views)
Permalink
|
|
Re: [mod_fcgid patch] reap children without the zombie scan
[In reply to]
|
|
On Saturday 03 October 2009 12:44:02 pm Jeff Trawick wrote: > On Thu, Oct 1, 2009 at 8:01 PM, Jeff Trawick <trawick [at] gmail> wrote: > > (just fixing subject) > > > > On Thu, Oct 1, 2009 at 7:55 PM, Ricardo Cantu <ricardo [at] smartcsc>wrote: > >> On Tuesday 29 September 2009 4:20:49 pm you wrote: > >> > On Tue, Sep 29, 2009 at 4:59 PM, Ricardo Cantu <ricardo [at] smartcsc> > >> > >> wrote: > >> > > On Tuesday 29 September 2009 2:31:21 pm Jeff Trawick wrote: > >> > > > ZombieScanInterval (leave alone until processes can be reaped > >> > > > >> > > differently) > >> > > Working on a patch for this one. Don't want to duplicate work, so > >> > > let > >> > >> me > >> > >> > > know > >> > > if anybody else is working on this. > >> > > >> > not me > >> > > >> > I hope that, for Unix, processes can be reaped as with the MPMs: > >> > instead > >> > >> of > >> > >> > asking if a specific pid has exited (for each pid in the list), ask if > >> > >> any > >> > >> > pid has exited and if so find it in the list and handle. > >> > >> Well, here it is. My patch to reap the children when they exit rather > >> than check the list for zombies. Before I take out the old logic for the > >> zombie scan I would like to hear some input on the code. > >> > >> basically, > >> > >> apr_proc_other_child_register() - to register a callback when child > >> exits. > >> > >> sigaction(SIGCHLD, &sa, NULL) - to listen for children dying. > >> > >> apr_proc_other_child_refresh_all(APR_OC_REASON_RESTART) - called when > >> SIGCHLD > >> received so callback will be called on the correct registered child. > >> > >> fcgid_child_maint - The callback. Cleans up the various lists and prints > >> out > >> log info. > >> > >> And that's it. > > Is this right? > > before: > > when process mgr has awakened (every second) and ZombieScanInterval has > elapsed: > > lock table > foreach table entry (process spawned by process mgr) { > call waitpid on pid and see if it has exited > if exited, move to free list > } > unlock table > > and this was done whether or not a child exited > > now: > > when awakened by SIGCHILD, in signal handler: If one process exits every minute and the proc array is 40 processes long it would go from ~60 x 40 scans to 1 x40. Seems like an improvement. > > foreach registered other child (process spawned by process mgr) { > call waitpid on pid and see if it has exited > if exited { I guess I could do a wait_all_procs here instead. > lock table > search table for child and move to free list This can never be eliminated due to the proc's being in a simple circular list. We could change it to a double linked and not have to search. > unlock table > } > } > > If so: I don't see that using the other child API has bought us much. At > least we don't lock the table unless a process has actually exited. > > If the process mgr calls waitpid with a wildcard to see if any pid has > exited, then at least a list doesn't have to be searched unless there is > work to do. true with this patch as well. > The prefork MPM does that, then calls > apr_proc_other_child_alert() to see if it is for an other-child. (The > alert function then runs the list of other children to find the pid.) > > SIGCHLD can be used to unblock from waiting for (command, 1 second) > sooner*, but calling interesting code from the SIGCHLD handler should be > avoided if at all possible. (Windows wouldn't have that flow of control > either.) concede on that point. > > *I dunno if the mpm_common function currently used will return on EINTR it does. > > Thoughts? Okay, instead of using SIGCHLD to unblock the pm, do you propose making it a nowait type read and do the wait_all_procs every cycle or perhaps a reaping thread to handle the children and leave the pm alone? From what I can tell prefork.c does not block (ap_wait_or_timeout uses ap_wait_all_procs(ret, status, APR_NOWAIT, p); which does not block) unlike pm_main that blocks at procmgr_peek_cmd waiting for a request. recap of directions: 1. make pm_main non-blocking and do wait_all_procs/apr_proc_other_child_alert per cycle. or 2. Create a thread to do wait_all_procs/apr_proc_other_child_alert and leave pm_main alone. or 3. Change my patch to use wait_all_procs/apr_proc_other_child_alert on SIGCHLD. or 4. unblock pm_main some how on SIGCHLD and make wait_all_procs part of a normal cycle of a request. > > (BTW, it is easier on reviewers if one patch tries to solve exactly one > problem.) got it
|