Mailing List Archive: Apache: Users

debugging strategies for httpd that kills the box

 

 


evilchili at gmail

Nov 15, 2007, 9:20 AM

Post #1 of 5
debugging strategies for httpd that kills the box

Hello List,

I have a FreeBSD 5.5 (amd64) server with the following:

Apache/2.0.61 (FreeBSD)
mod_ssl/2.0.53
OpenSSL/0.9.7e-p1
PHP/4.4.4 with Suhosin-Patch
mod_apreq2-20051231/2.6.0
mod_perl/2.0.3
Perl/v5.8.6

Background: The server was recently updated to FreeBSD 5.5 from 5.3,
which had run for a couple of years without interruption. I'm not
especially convinced the issue I'm having is related to this upgrade,
but it's the most significant change in some time, so it's worth
mentioning. Apache is configured with KeepAlive on, as this box
provides a number of other services, but I noticed last week that the
majority of the requests (specifically those being handled by
HTML::Mason via mod_perl) were not sending a Content-Length header,
and thus the KeepAlive settings were largely useless. I added this
header to the mod_perl output, and began tuning apache to maximize
performance. I got, for us, quite acceptable results using the
following:

<IfModule prefork.c>
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 200
MaxRequestsPerChild 0
</IfModule>

The problem: After a few days the server became unstable. Twice the
server was out of swap space and had to be rebooted. While attempting
to solve this problem, I came to understand that the webserver seemed
responsible; for an as-yet unknown reason, the box will suddenly lock up,
and if I do not kill httpd within 5 or 10 seconds the server will
become completely unresponsive and need to be rebooted. If left
alone, swap space will be consumed and the kernel will begin killing
off httpd processes, but not fast enough. By altering the apache
config to the following, I am able to avoid the hang:

<IfModule prefork.c>
StartServers 5
MinSpareServers 5
MaxSpareServers 5
MaxClients 40
MaxRequestsPerChild 0
</IfModule>

But naturally this provides less than optimal performance. Other
things I tried, without impact:

-- setting KeepAlive off
-- disabling all vhosts
-- disabling mod_perl2
-- disabling mod_php
-- running httpd -X
-- tweaking/disabling resource limits via Apache2::Resource
-- recompiling httpd
-- recompiling mod_perl2
-- recompiling & reinstalling the freebsd userland

The server has 1 GB of RAM and 4 GB of swap (I recently doubled it in an
attempt to give me more time to diagnose the problem when it occurs --
didn't help, as the lock-up occurs long before swap is consumed), and
in the moments before the hang it shows no particular level of activity
-- swap usage is around 100 MB, load is very low (1.5 - 2.0 on a
dual-proc system). There doesn't seem to be any correlation to
specific requests or request types, whether static, cgi, mod_perl or
php. There are no errors in the server logs prior to the lock-up
(although once, and only once, I saw "httpd in free(): error: page is
already free").
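
As a rough sanity check on the figures above, here is a minimal sketch of
the usual prefork sizing arithmetic: MaxClients should not exceed the RAM
left over for Apache divided by the resident size of one child. The
per-child size and the memory reserved for other services are assumptions
made up for illustration, not measurements from this server; real figures
would have to come from top or ps on the box itself.

```python
def max_clients_estimate(total_ram_mb, reserved_mb, per_child_mb):
    """Upper bound on prefork children that fit in RAM without swapping."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    available = max(total_ram_mb - reserved_mb, 0)
    return available // per_child_mb


TOTAL_RAM_MB = 1024   # from the post: 1 GB of RAM
RESERVED_MB = 256     # assumption: kernel, mail, chat and other services
PER_CHILD_MB = 30     # assumption: one httpd child with mod_perl and mod_php
                      # (ignores pages shared between children, so it is pessimistic)

limit = max_clients_estimate(TOTAL_RAM_MB, RESERVED_MB, PER_CHILD_MB)
worst_case_mb = 200 * PER_CHILD_MB  # memory if all 200 children are running

print("children that fit in RAM:", limit)
print("worst case at MaxClients 200 (MB):", worst_case_mb)
```

Under these assumed numbers, MaxClients 200 could ask for several times
more memory than the machine has, while MaxClients 40 stays much closer to
what fits. That would be consistent with the lower setting avoiding the
hang, but it does not prove that memory is the cause.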

I'm hoping someone here can make some suggestions on how to
troubleshoot the problem further, because I'm kind of at a loss.

I have not (yet) posted this to the mod_perl list or freebsd-users,
though I would be happy to do so if it's thought the problem might lie
within those realms.

Thanks!
Greg

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe [at] httpd
" from the digest: users-digest-unsubscribe [at] httpd
For additional commands, e-mail: users-help [at] httpd


christian.folini at post

Nov 15, 2007, 11:19 PM

Post #2 of 5 (232 views)
Re: debugging strategies for httpd that kills the box [In reply to]

Hey Greg,

On Thu, Nov 15, 2007 at 12:20:44PM -0500, greg boyington wrote:
> Background: The server was recently updated to FreeBSD 5.5 from 5.3,
> which had run for a couple of years without interruption. I'm not
> especially convinced the issue I'm having is related to this upgrade,
> ...
> <IfModule prefork.c>
> StartServers 5
> MinSpareServers 5
> MaxSpareServers 10
> MaxClients 200
> MaxRequestsPerChild 0
> </IfModule>


Problems like this are hard to track down. Here are a few thoughts on
what you could do next. Maybe they help, but I cannot guarantee it.
- Play around with MaxRequestsPerChild. A typical value
other than 0 is 10000. I believe that is also the default,
but it is not what your setup uses.
- Did you compile apache yourself? Give it a try.
- Have you tried to get the lock-up in a lab setup with
apache bench or some other load generating tool?
I would guess this would help.
- You disabled mod_perl2 and mod_php. This sounds as if
you would not really need them. (Or you sacrificed
functionality :)
- Disable all modules which you do not really need. Definitely
all of them. If you can boil it down to thread-safe modules,
then you might give it a try with another mpm.
http://httpd.apache.org/docs/2.0/mpm.html

Good luck - and keep us informed on how it goes.

Christian



greg at regex

Nov 16, 2007, 7:06 AM

Post #3 of 5 (228 views)
Re: debugging strategies for httpd that kills the box [In reply to]

On Nov 16, 2007 2:19 AM, Christian Folini <christian.folini [at] post> wrote:
> Hey Greg,

Hi Christian, thanks for your response.

> - Play around with MaxRequestsPerChild. A typical value

I will do -- I experimented with low values (100-1000) in addition to
0, but nothing seemed to correlate. In my testing, the hangs would
often occur long before this threshold was ever reached.

> - Did you compile apache yourself? Give it a try.

I do. Most everything is installed from ports; I recompiled by hand
when testing.

> - Have you tried to get the lock-up in a lab setup with
> apache bench or some other load generating tool?

Yup -- ab was used to test each random tweak I made to try and isolate
the problem.

> - You disabled mod_perl2 and mod_php. This sounds as if
> you would not really need them. (Or you sacrificed
> functionality :)

They are both critical to our operation, but since the box was dead in
the water anyway I took the opportunity to turn everything off and
test each one in turn. I only mentioned them in particular because I
figured they were the most likely suspects. :)

> then you might give it a try with another mpm.

That crossed my mind as well, but I haven't had any experience with
mod_perl2 and HTML::Mason under another MPM, so I would have to do some
research to see what I was getting into there. In the meantime, The
Powers That Be have decided to build a replacement box so I can pull
this one off the wire and do more testing. Hard to debug a problem
that kills the production machine responsible for web, storefront,
email and support chat.

> Good luck - and keep us informed on how it goes.

Thanks, I will.

-G



christian.folini at post

Nov 16, 2007, 7:39 AM

Post #4 of 5 (225 views)
Re: debugging strategies for httpd that kills the box [In reply to]

On Fri, Nov 16, 2007 at 10:06:50AM -0500, Greg Boyington wrote:
> > - Play around with MaxRequestsPerChild. A typical value
>
> I will do -- I experimented with low values (100-1000) in addition to
> 0, but nothing seemed to correlate. In my testing, the hangs would
> often occur long before this threshold was ever reached.

The directive means to let a child process die and fork
it anew after n individual requests for this particular
child process.
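
To make the effect of the directive concrete, here is a small sketch of how
recycling bounds the growth of a leaking child. It assumes a steady leak
per request; the base size and leak rate are hypothetical numbers chosen
only for illustration.

```python
def peak_child_size_mb(base_mb, leak_kb_per_request, max_requests_per_child):
    """Size of one child just before it is recycled, or None if it never is."""
    if max_requests_per_child == 0:
        # 0 means the child is never recycled, so a steady leak has no upper bound.
        return None
    return base_mb + leak_kb_per_request * max_requests_per_child / 1024


# Hypothetical: a 30 MB child that leaks 4 KB on every request.
for limit in (0, 100, 1000, 10000):
    print(limit, peak_child_size_mb(30, 4, limit))
```

This model only describes gradual growth. Hangs that occur long before the
threshold is reached, as reported earlier in the thread, are not something
it would explain.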

You can try and keep an eye on these children with mod_status,
write it all into a logfile and reconstruct when the lockup
occurs. Not sure it's of any use though. ;(
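
One way to do that logging is sketched below: poll the machine-readable
mod_status page (the "?auto" form) every few seconds and append each
snapshot to a file. The URL, log file name and polling interval are
assumptions, and mod_status has to be enabled and mapped to /server-status
for this to work at all. The parser makes no assumption about which fields
the page contains.

```python
import os
import time
import urllib.request

# Assumption: mod_status is enabled and mapped to /server-status on this host.
STATUS_URL = "http://localhost/server-status?auto"


def parse_status(text):
    """Turn the 'Key: value' lines of the ?auto output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that are not 'Key: value'
            fields[key.strip()] = value.strip()
    return fields


def log_status(url=STATUS_URL, path="status.log", interval=5):
    """Append one timestamped snapshot every `interval` seconds until interrupted."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        try:
            with urllib.request.urlopen(url, timeout=3) as resp:
                entry = repr(parse_status(resp.read().decode("ascii", "replace")))
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            entry = f"ERROR {exc}"
        with open(path, "a") as log:
            log.write(f"{stamp} {entry}\n")
            log.flush()
            os.fsync(log.fileno())  # push to disk so the last lines survive a hard lockup
        time.sleep(interval)


# To start logging, call log_status() and stop it with Ctrl-C.
```

After a reboot, the last few lines of the log show what the scoreboard
looked like in the seconds before the lockup, including whether the status
page itself had already stopped answering.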

You mentioned httpd -X. Maybe you could try to follow
the process(es) during the lockup with strace/truss.

Besides, I forgot to mention that there is mod_diagnostics
by Nick Kew. It is mentioned here from time to time.
See http://apache.webthing.com/mod_diagnostics/
Nick will be able to give you specific hints on how to use
it.

mod_log_forensic can also help to nail down the bad request. But it
sounds like your problem is not about a specific request.

Otherwise, I am slowly running out of ideas, and you have
already done your homework. It's really hard to recommend
anything here. At least for me, but that does not say much.

regs,

Christian






greg at regex

Nov 16, 2007, 8:50 AM

Post #5 of 5 (221 views)
Re: debugging strategies for httpd that kills the box [In reply to]

On Nov 16, 2007 10:39 AM, Christian Folini <christian.folini [at] post> wrote:
> On Fri, Nov 16, 2007 at 10:06:50AM -0500, Greg Boyington wrote:
> > > - Play around with MaxRequestsPerChild. A typical value
> >
> > I will do -- I experimented with low values (100-1000) in addition to
> > 0, but nothing seemed to correlate. In my testing, the hangs would
> > often occur long before this threshold was ever reached.
>
> The directive means to let a child process die and fork
> it anew after n individual requests for this particular
> child process.

Indeed; I monitored the request counts via Apache2::VMonitor;
sometimes processes would reach the MaxRequestsPerChild and die/fork
before the lockup, sometimes not.

I will experiment with mod_status and mod_diagnostics once the box is
off the wire; thanks for the suggestions. I'll post my progress, if
there is any. :)

-G

