
Mailing List Archive: ModPerl: ModPerl

Apache Children Stuck on futex

 

 



sean.thorne at gmail

Jun 23, 2009, 8:52 PM

Post #1 of 11 (3710 views)
Apache Children Stuck on futex

-------------8<---------- Start Bug Report ------------8<----------
1. Problem Description:

I've got some Apache children that are getting stuck on a futex
call. This started happening on an Apache 2.2.6 worker w/ mod_perl
2.0.4 install, so I upgraded to Apache 2.2.11 worker w/ mod_perl 2.0.4
and it still continues. I have modules for proxy and php installed as
well, but this problem only presents when using mod_perl and
MaxRequestsPerChild. If I remove mod_perl the Apache children close
as expected. It's easily replicated using an abusive ab test and
turning down MaxRequestsPerChild. I know I could turn off
MaxRequestsPerChild, but I have that on to deal with PHP's poor thread
handling and memory leaks. I could switch to prefork, but the servers
I have don't have enough RAM to handle the load I need them to. Any
help would be appreciated.

This child appears to be waiting for PID 3451, but that PID no
longer exists.

[sthorne [at] 81082-spar ~]$ sudo strace -p 3271
Process 3271 attached - interrupt to quit
futex(0x1b5bbe8, FUTEX_WAIT, 3451, NULL

2. Used Components and their Configuration:

*** mod_perl version 2.000004

*** using /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi/Apache2/BuildConfig.pm

*** Makefile.PL options:
MP_APR_CONFIG => /my_setup/apps/apache/bin/apr-1-config
MP_APR_LIB => aprext
MP_APXS => /my_setup/apps/apache/bin/apxs
MP_COMPAT_1X => 1
MP_GENERATE_XS => 1
MP_LIBNAME => mod_perl
MP_USE_DSO => 1


*** The httpd binary was not found


*** (apr|apu)-config linking info

(apr|apu)-config scripts were not found



*** /usr/bin/perl -V
Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
Platform:
osname=linux, osvers=2.6.9-67.0.7.elsmp, archname=i386-linux-thread-multi
uname='linux hs20-bc1-5.build.redhat.com 2.6.9-67.0.7.elsmp #1 smp wed feb 27 04:47:23 est 2008 i686 i686 i386 gnulinux '
config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root [at] localhos -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-9)',
gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='gcc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version='2.3.4'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl):
Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS
USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
Built under linux
Compiled at Jun 5 2008 07:33:47
%ENV:
PERL_LWP_USE_HTTP_10="1"
@INC:
/usr/lib/perl5/5.8.5/i386-linux-thread-multi
/usr/lib/perl5/5.8.5
/usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.5
/usr/lib/perl5/site_perl/5.8.4
/usr/lib/perl5/site_perl/5.8.3
/usr/lib/perl5/site_perl/5.8.2
/usr/lib/perl5/site_perl/5.8.1
/usr/lib/perl5/site_perl/5.8.0
/usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.5
/usr/lib/perl5/vendor_perl/5.8.4
/usr/lib/perl5/vendor_perl/5.8.3
/usr/lib/perl5/vendor_perl/5.8.2
/usr/lib/perl5/vendor_perl/5.8.1
/usr/lib/perl5/vendor_perl/5.8.0
/usr/lib/perl5/vendor_perl
.

*** Packages of interest status:

Apache2 : -
Apache2::Request : -
CGI : 3.05
ExtUtils::MakeMaker: 6.17
LWP : 5.808
mod_perl : -
mod_perl2 : 2.000004


3. This is the core dump trace: (if you get a core dump):

Nothing has cored

This report was generated by /usr/bin/mp2bug on Wed Jun 24 03:36:25
2009 GMT.

-------------8<---------- End Bug Report --------------8<----------

Note: Complete the rest of the details and post this bug report to
modperl <at> perl.apache.org. To subscribe to the list send an empty
email to modperl-subscribe [at] perl


max at maxbarry

Oct 25, 2011, 8:56 PM

Post #2 of 11 (3168 views)
Re: Apache Children Stuck on futex

Hello,

I'm trying to solve a long-running problem whereby my Apache mod_perl
processes get stuck in a "FUTEX_WAIT" state instead of exiting.

I believe this is the same issue as reported here:
http://www.gossamer-threads.com/lists/modperl/modperl/99879

The problem occurs fairly frequently following a burst of traffic, when
Apache spawns new processes, then attempts to cull them afterward. It
also occurred, before I disabled this, when Apache tried to cull a
process upon reaching MaxRequestsPerChild.

Usually, from the child's point of view, this looks like this:

$ strace -p 21764
Process 21764 attached - interrupt to quit
read(5, "!", 1) = 1
tgkill(21764, 21791, SIGHUP) = 0
tgkill(21764, 21791, SIG_0) = 0
select(0, NULL, NULL, NULL, {0, 500000}) = ? ERESTARTNOHAND (To be restarted)
--- SIGTERM (Terminated) @ 0 (0) ---
rt_sigreturn(0xf) = -1 EINTR (Interrupted system call)
munmap(0x7f9905750000, 8392704) = 0
munmap(0x7f98f8736000, 8392704) = 0
...
madvise(0x7f98e4021000, 73728, MADV_DONTNEED) = 0
exit_group(0) = ?
Process 21764 detached

However, every five or so attempts, it instead goes like this:

$ strace -p 24133
Process 24133 attached - interrupt to quit
read(5, "!", 1) = 1
tgkill(24133, 24164, SIGHUP) = 0
tgkill(24133, 24164, SIG_0) = 0
--- SIGTERM (Terminated) @ 0 (0) ---
rt_sigreturn(0xf) = 0
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
tgkill(24133, 24140, SIGUSR1) = 0
futex(0x7f9904f4e9d0, FUTEX_WAIT, 24140, NULL

... and goes no further.

Sometimes, after a few minutes of doing nothing, the process will
suddenly free itself, spit out a bunch of "munmap" calls, and exit. But
more often it hangs indefinitely.

Given time, these hung children accumulate until they occupy all
available RAM, which sends the box into swap and eventually crashes it.
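
A rough way to spot such children without attaching strace to each one is to look at the kernel wait channel that ps reports per thread. This is only a sketch under assumptions that are not from the thread: a procps-style ps, children named apache2 (as on Ubuntu), and a hung child whose main thread (the row where PID equals LWP) sits in a wait channel beginning with "futex". The function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read "PID LWP WCHAN" rows on stdin and print the PID of
# every process whose MAIN thread (PID == LWP) is sleeping in a futex call.
# Idle worker threads commonly sleep in futex too, so only the main thread
# is considered here.
futex_waiters() {
    awk '$1 == $2 && $3 ~ /^futex/ { print $1 }'
}

# Usage (assumes procps ps and a process name of apache2):
#   ps --no-headers -L -C apache2 -o pid,lwp,wchan:32 | futex_waiters
```

A child that shows up here only once may simply be shutting down normally; one that keeps showing up across several runs is a candidate for the hang described above.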

This problem has occurred on various flavors of Apache & Ubuntu over the
last two years. I'm currently seeing it regularly on the two boxes I
manage, which are:

- Apache/2.2.17 (Ubuntu) mod_perl/2.0.4 Perl/v5.10.1 on Ubuntu 11.04
(2.6.38-11-generic #50-Ubuntu SMP x86_64).

- Apache/2.2.14 (Ubuntu) mod_perl/2.0.4 Perl/v5.10.1 on Ubuntu 10.04
(2.6.32-30-server #59-Ubuntu SMP x86_64).

The problem does not occur on Apache running without mod_perl.

I have tried to debug this problem for a long time, but don't know how
to advance any further.

Thanks in advance for any advice!

Max.


max at maxbarry

Oct 26, 2011, 3:06 PM

Post #3 of 11 (3131 views)
Re: Apache Children Stuck on futex

On 26/10/11 14:56, Max Barry wrote:
> I'm trying to solve a long-running problem whereby my Apache mod_perl
> processes get stuck in a "FUTEX_WAIT" state instead of exiting.
>
> I believe this is the same issue as reported here:
> http://www.gossamer-threads.com/lists/modperl/modperl/99879
>
> The problem occurs fairly frequently following a burst of traffic, when
> Apache spawns new processes, then attempts to cull them afterward. It
> also occurred, before I disabled this, when Apache tried to cull a
> process upon reaching MaxRequestsPerChild.

Further to this, I've found that the problem occurs even on a fresh
Apache/mod_perl install; i.e. after completely removing Apache &
mod_perl, including /etc/apache2/, and doing only this:

1. sudo apt-get install apache2-mpm-worker libapache2-mod-perl2 apache2-doc

2. Edit the 'default' site thusly:

--- /etc/apache2/sites-available/default.original 2011-10-27 08:44:11.383803928 +1100
+++ /etc/apache2/sites-available/default 2011-10-27 08:44:51.795391116 +1100
@@ -19,6 +19,9 @@
Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
Order allow,deny
Allow from all
+
+ SetHandler perl-script
+ PerlResponseHandler ModPerl::RegistryBB
</Directory>

ErrorLog ${APACHE_LOG_DIR}/error.log

3. Add a low 'MaxRequestsPerChild' directive (not strictly necessary,
but makes the problem much more visible):

--- /etc/apache2/httpd.conf.original 2011-10-27 08:52:48.898361041 +1100
+++ /etc/apache2/httpd.conf 2011-10-27 08:58:52.384091605 +1100
@@ -0,0 +1 @@
+MaxRequestsPerChild 10

4. sudo /etc/init.d/apache2 restart

5. ab -n 450 -c 175 http://localhost/cgi-bin/

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
apr_poll: The timeout specified has expired (70007)
Total of 426 requests completed

(The number "426" above varies: sometimes it's as low as 400, sometimes
all requests complete. Usually, though, it's around 430.)
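
To compare runs, the completed-request count can be pulled out of the ab output mechanically. The sketch below is an illustration rather than part of the original report; the function name is hypothetical, and it relies only on the "Total of N requests completed" line shown above and on the "Complete requests:" line that ab prints when a run finishes normally.

```shell
#!/bin/sh
# Hypothetical helper: read ab output on stdin and print how many requests
# completed. Handles both the timeout case ("Total of N requests completed")
# and a normal run ("Complete requests: N"). Exits non-zero if neither
# line is present.
completed_requests() {
    awk '/^Total of [0-9]+ requests completed/ { print $3; found = 1 }
         /^Complete requests:/                 { print $3; found = 1 }
         END { exit !found }'
}

# Usage, with the same ab invocation as in step 5:
#   ab -n 450 -c 175 http://localhost/cgi-bin/ 2>&1 | completed_requests
```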

I've also established that:

* The problem occurs regardless of whether ModPerl::Registry,
ModPerl::RegistryBB, or ModPerl::PerlRun is used.

* The problem occurs even when all Apache modules are disabled except
for alias, authz_host, and mod_perl.

* The problem occurs even when no script is being run: i.e. Apache is
asked for "/cgi-bin/nonexistentscript.cgi", or just "/cgi-bin/".

This is pretty perplexing.

Max.


fred at redhotpenguin

Oct 26, 2011, 7:16 PM

Post #4 of 11 (3118 views)
Re: Apache Children Stuck on futex

Have you tried this with mod_perl 2.0.5, or 2.0.6-dev? May have been resolved already.




max at maxbarry

Oct 27, 2011, 3:21 PM

Post #5 of 11 (3106 views)
Re: Apache Children Stuck on futex

On 27/10/11 13:16, Fred Moyer wrote:
> Have you tried this with mod_perl 2.0.5, or 2.0.6-dev? May have been resolved already.

Doesn't look like it: I upgraded a system to Ubuntu 11.10, which is:

Apache/2.2.20 (Ubuntu) mod_perl/2.0.5 Perl/v5.12

... and the same problem occurs.

I've found a third person experiencing this:
http://ubuntuforums.org/showthread.php?t=1607697

Max.


torsten.foertsch at gmx

Oct 29, 2011, 4:43 AM

Post #6 of 11 (3101 views)
Re: Apache Children Stuck on futex

On Wednesday, 26 October 2011 05:56:49 Max Barry wrote:
> $ strace -p 24133
> Process 24133 attached - interrupt to quit
> read(5, "!", 1) = 1
> tgkill(24133, 24164, SIGHUP) = 0
> tgkill(24133, 24164, SIG_0) = 0
> --- SIGTERM (Terminated) @ 0 (0) ---
> rt_sigreturn(0xf) = 0
> select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
> tgkill(24133, 24140, SIGUSR1) = 0
> futex(0x7f9904f4e9d0, FUTEX_WAIT, 24140, NULL

It would be interesting to see which futex it is blocked on. One way to
check that is perhaps to allow core dumps in the apache config and then
to send a core dump signal like SEGV, BUS or similar when the process
hangs. Use the dump file then to get a stack trace.
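
As a rough sketch of that procedure (not something posted in the thread), the commands involved might look like the following. The function only prints the commands rather than running them. Assumptions: the function name is hypothetical, the binary path /usr/sbin/apache2 is the Ubuntu location, Apache's CoreDumpDirectory directive points at the directory holding the core file, and the core file name is a placeholder, since the real name depends on the kernel's core_pattern setting.

```shell
#!/bin/sh
# Sketch only: print (not run) the commands for getting a stack trace out of
# a hung child. Assumed, not from the thread: the function name, the Ubuntu
# binary path /usr/sbin/apache2, and the location of the core file.
core_trace_steps() {
    pid=$1
    core_file=$2
    # Core dumps must be allowed in the environment that starts Apache.
    echo "ulimit -c unlimited"
    # Force the hung child to dump core.
    echo "kill -SEGV $pid"
    # Print a backtrace for every thread found in the core file.
    echo "gdb -batch -ex 'thread apply all bt' /usr/sbin/apache2 $core_file"
}

# Example: the hung child from the strace above, with a placeholder core path.
core_trace_steps 24133 /tmp/cores/core
```

Without debugging symbols for Apache and mod_perl installed, the backtrace is likely to show addresses rather than function names.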

Torsten Förtsch

--
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net


max at maxbarry

Nov 13, 2011, 7:42 PM

Post #7 of 11 (3058 views)
Re: Apache Children Stuck on futex

On 29/10/11 22:43, Torsten Förtsch wrote:
> On Wednesday, 26 October 2011 05:56:49 Max Barry wrote:
>> $ strace -p 24133
>> Process 24133 attached - interrupt to quit
>> read(5, "!", 1) = 1
>> tgkill(24133, 24164, SIGHUP) = 0
>> tgkill(24133, 24164, SIG_0) = 0
>> --- SIGTERM (Terminated) @ 0 (0) ---
>> rt_sigreturn(0xf) = 0
>> select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
>> tgkill(24133, 24140, SIGUSR1) = 0
>> futex(0x7f9904f4e9d0, FUTEX_WAIT, 24140, NULL
>
> It would be interesting to see which futex it is blocked on. One way to
> check that is perhaps to allow core dumps in the apache config and then
> to send a core dump signal like SEGV, BUS or similar when the process
> hangs. Use the dump file then to get a stack trace.
>
> Torsten Förtsch

Thank you very much for the reply! Here is the result:

http://pastebin.com/YDbmq84w

This shows me:
* running the Apache benchmarking utility to generate lots of requests
* identifying a process hung in 'futex_wait' (11447)
* killing it with SEGV
* obtaining a stack trace

Max.


torsten.foertsch at gmx

Nov 14, 2011, 6:26 AM

Post #8 of 11 (3077 views)
Re: Apache Children Stuck on futex

On Monday, 14 November 2011 04:42:16 Max Barry wrote:
> Here is the result:
>
> http://pastebin.com/YDbmq84w
>
> This shows me:
> * running the Apache benchmarking utility to generate lots of requests
> * identifying a process hung in 'futex_wait' (11447)
> * killing it with SEGV
> * obtaining a stack trace

Thanks Max. It really seems to be a modperl problem. I think there is
either something fishy with modperl_tipool_putback_base() or someone
writes to a location that it doesn't own.

Many of your threads block in modperl_tipool_pop() waiting for an
interpreter to become available:

/* block until an item becomes available */
modperl_tipool_wait(tipool);

In src/modules/perl/modperl_tipool.c in function
modperl_tipool_putback_base() you find these lines:

if (!listp) {
    /* XXX: Attempt to putback something that was never there */
    modperl_tipool_unlock(tipool);
    return;
}

I think the code should not return here but call abort() and dump core,
because if it enters the if-block it is trying to push back an interpreter
that was not taken from the pool. But why would someone call
modperl_tipool_putback_base() if not to release an interpreter? Hence the
interpreter is lost. The rest of the function seems quite
reasonable. So, I think modperl_tipool_putback_base() is sometimes called
with a wrong data pointer and thus leaks interpreters.

Can you install the symbol tables for your modperl and perhaps check the
values of *tipool in the core? I think it is

tipool->size == tipool->in_use == tipool->cfg->max

That would explain the behavior.

BTW, there are IMHO many points about the tipool implementation that can
be improved. Why do we use these lists? Wouldn't it be better to
allocate an array of tipool->cfg->max pointers? Or perhaps an apr_hash_t
in pconf?

Torsten Förtsch



max at maxbarry

Nov 14, 2011, 3:36 PM

Post #9 of 11 (3048 views)
Re: Apache Children Stuck on futex

Hi Torsten,

I'm afraid that installing debugging symbols is beyond me, but I have
confirmed that the problem is reproducible in a clean Ubuntu Server install.

Here is me going from a brand new Ubuntu Server install to futex_wait
hang in a few easy steps:

http://pastebin.com/ahDtAeAS

To reproduce:

1. Download an ISO of Ubuntu Server 11.10 64-bit. (I got it from a local
mirror:
http://mirror.aarnet.edu.au/pub/ubuntu/releases/11.10/ubuntu-11.10-server-amd64.iso).

2. Install as a virtual machine. (I installed inside VirtualBox
4.1.4-r74291, accepting all defaults and installing no additional packages.)

3. Install mod_perl2, configure the 'default' site to use it, and lower
MaxRequestsPerChild.

4. Smash the server with requests.

I hope this is sufficient to let you find the problem. Please let me
know if I can help further.

Max.


salusa at nationstates

Feb 13, 2012, 8:00 AM

Post #10 of 11 (2768 views)
Re: Apache Children Stuck on futex


(Sorry about the grave-dig, but this is still an issue.)

I'm still coming up to speed on the inner workings of mod_perl (I've
never played in it before), but Max asked me to take a look at the futex
problem, so I thought I'd try to pick up where it was left off and
hopefully get this fixed.

My system:
Linux modperl 2.6.38-8-server #42-Ubuntu SMP Mon Apr 11 03:49:04 UTC
2011 x86_64 x86_64 x86_64 GNU/Linux
Apache/2.2.20 (built from Ubuntu source packages with debug symbols)
mod_perl 2.0.5 (built from Ubuntu source packages with debug symbols)

Torsten Förtsch wrote:

> Can you install the symbol tables for your modperl and perhaps check the
> values of *tipool in the core? I think it is
>
> tipool->size == tipool->in_use == tipool->cfg->max

tipool->size, tipool->in_use, and tipool->cfg->max are 3, 0, and 5
respectively for all threads blocked on modperl_tipool_wait(tipool).

> BTW, there are IMHO many points about the tipool implementation that
> can be improved. Why do we use these lists? Wouldn't it be better to
> allocate an array of tipool->cfg->max pointers? Or perhaps an
> apr_hash_t in pconf?

As I don't understand the inner workings yet, I don't know, but I hope to
figure it out.

Greg
P.S. Since it's been a while, here is the archived thread:
http://www.gossamer-threads.com/lists/modperl/modperl/103558#103558


max at maxbarry

Mar 5, 2012, 5:47 PM

Post #11 of 11 (2654 views)
Re: Apache Children Stuck on futex

I'm told there are people watching this issue, so the good news is my
colleague Greg Rubin seems to have tracked down the source of the
problem! There is a patch & description here:

http://www.gossamer-threads.com/lists/modperl/dev/104026

Max.
