Mailing List Archive: Varnish: Misc

Varnish constantly restarting

 

 


alex at bengler

Jul 2, 2010, 4:53 AM

Post #1 of 14
Varnish constantly restarting

Varnish (2.0.6 on Ubuntu Hardy, also tested 2.1.2 from source; Linux
x64) restarts many times per day with the following messages being
logged:

varnishd[9010]: child (9011) Started
varnishd[9010]: Child (9011) said Closed fds: 3 4 5 9 10 12 13
varnishd[9010]: Child (9011) said Child starts
varnishd[9010]: Child (9011) said Ready
varnishd[9010]: Child (9011) not responding to ping, killing it.
varnishd[9010]: Child (9011) not responding to ping, killing it.
varnishd[9010]: Child (9011) died signal=3
varnishd[9010]: Child cleanup complete
varnishd[9010]: child (9315) Started
varnishd[9010]: Child (9315) said Closed fds: 3 4 5 10 11 13 14
varnishd[9010]: Child (9315) said Child starts
varnishd[9010]: Child (9315) said Ready

Tried modifying defaults and switching from file-based cache to
malloc-based; it seems to be hanging less often now, but it still
happens a couple of times a day. Here is my current command line:

varnishd -P /var/run/varnishd.pid \
-a :6081 \
-f /etc/varnish/default.vcl \
-T 127.0.0.1:6082 \
-t 120 \
-w 4,30,120 \
-s malloc,1G \
-p thread_pool_max=300 \
-p ping_interval=5 \
-p cli_timeout=120s \
-u varnish \
-g varnish

At the point when the Varnish child stops responding to ping, CPU
usage blows through the roof. I started a screen running on the box
that logs varnishstat every 10s; this is the last output just before
it dies:

uptime 71589 . Child uptime
client_conn 4751731 66.38 Client connections accepted
client_drop 0 0.00 Connection dropped, no sess
client_req 4747499 66.32 Client requests received
cache_hit 4647642 64.92 Cache hits
cache_hitpass 775 0.01 Cache hits for pass
cache_miss 86706 1.21 Cache misses
backend_conn 87236 1.22 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 247 0.00 Backend conn. failures
backend_reuse 0 0.00 Backend conn. reuses
backend_toolate 0 0.00 Backend conn. was closed
backend_recycle 0 0.00 Backend conn. recycles
backend_unused 0 0.00 Backend conn. unused
fetch_head 2 0.00 Fetch head
fetch_length 87222 1.22 Fetch with Length
fetch_chunked 0 0.00 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 6 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
n_srcaddr 0 . N struct srcaddr
n_srcaddr_act 0 . N active struct srcaddr
n_sess_mem 129 . N struct sess_mem
n_sess 1899 . N struct sess
n_object 28283 . N struct object
n_objecthead 28288 . N struct objecthead
n_smf 0 . N struct smf
n_smf_frag 0 . N small free smf
n_smf_large 0 . N large free smf
n_vbe_conn 18446744073709551598 . N struct vbe_conn
n_bereq 30 . N struct bereq
n_wrk 9 . N worker threads
n_wrk_create 682 0.01 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 0 0.00 N worker threads limited
n_wrk_queue 119 0.00 N queued work requests
n_wrk_overflow 5138 0.07 N overflowed work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 1 . N backends
n_expired 39742 . N expired objects
n_lru_nuked 19046 . N LRU nuked objects
n_lru_saved 0 . N LRU saved objects
n_lru_moved 2610711 . N LRU moved objects
n_deathrow 0 . N objects on deathrow
losthdr 0 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 3055715 42.68 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 4751602 66.37 Total Sessions
s_req 4751602 66.37 Total Requests
s_pipe 0 0.00 Total pipe
s_pass 777 0.01 Total pass
s_fetch 87230 1.22 Total fetch
s_hdrbytes 2066945543 28872.39 Total header bytes
s_bodybytes 16985467831 237263.66 Total body bytes
sess_closed 4751602 66.37 Session Closed
sess_pipeline 0 0.00 Session Pipeline
sess_readahead 0 0.00 Session Read Ahead
sess_linger 0 0.00 Session Linger
sess_herd 0 0.00 Session herd
shm_records 215002154 3003.28 SHM records
shm_writes 19146311 267.45 SHM writes
shm_flushes 112 0.00 SHM flushes due to overflow
shm_cont 105277 1.47 SHM MTX contention
shm_cycles 92 0.00 SHM cycles through buffer
sm_nreq 0 0.00 allocator requests
sm_nobj 0 . outstanding allocations
sm_balloc 0 . bytes allocated
sm_bfree 0 . bytes free
sma_nreq 208247 2.91 SMA allocator requests
sma_nobj 55341 . SMA outstanding allocations
sma_nbytes 1073656763 . SMA outstanding bytes
sma_balloc 2560359776 . SMA bytes allocated
sma_bfree 1486703013 . SMA bytes free
sms_nreq 14533 0.20 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 6363722 . SMS bytes allocated
sms_bfree 6363722 . SMS bytes freed
backend_req 87236 1.22 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_purge 1 . N total active purges
n_purge_add 1 0.00 N new purges added
n_purge_retire 0 0.00 N old purges deleted
n_purge_obj_test 0 0.00 N objects tested
n_purge_re_test 0 0.00 N regexps tested against
n_purge_dups 0 0.00 N duplicate purges removed
hcb_nolock 0 0.00 HCB Lookups without lock
hcb_lock 0 0.00 HCB Lookups with lock
hcb_insert 0 0.00 HCB Inserts
esi_parse 0 0.00 Objects ESI parsed (unlock)
esi_errors 0 0.00 ESI parse errors (unlock)

The odd one out is n_vbe_conn, don't know if this is benign.
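
For what it's worth, that n_vbe_conn value looks like a small negative
number printed as an unsigned 64-bit integer, i.e. a counter that was
decremented below zero. A minimal Python sketch of the arithmetic follows;
as_signed64 is a hypothetical helper, and the idea that the counter is
stored as a 64-bit unsigned value is an assumption based on its magnitude:

```python
def as_signed64(value: int) -> int:
    """Return the two's-complement signed reading of a uint64 value."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    # Values with the top bit set represent negative numbers.
    return value - 2**64 if value >= 2**63 else value


# Value reported by varnishstat in the output above.
n_vbe_conn = 18446744073709551598

if __name__ == "__main__":
    print(as_signed64(n_vbe_conn))
```

This says nothing about whether the underflow is harmful, only that the
number is probably not a real count of connections.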

_______________________________________________
varnish-misc mailing list
varnish-misc [at] varnish-cache
http://lists.varnish-cache.org/mailman/listinfo/varnish-misc


fla_torres at yahoo

Jul 3, 2010, 10:01 AM

Post #2 of 14
Re: Varnish constantly restarting [In reply to]

On 2/7/2010 08:53, Alexander Staubo wrote:
> varnishd[9010]: Child (9011) said Ready
> varnishd[9010]: Child (9011) not responding to ping, killing it.
> varnishd[9010]: Child (9011) not responding to ping, killing it.
> varnishd[9010]: Child (9011) died signal=3


Hi Alexander,

Have you seen this thread
http://lists.varnish-cache.org/pipermail/varnish-misc/2010-June/004358.html
?

Also, you are seeing some failed connections to the backend (1). How
is your backend's health?

Regards,



alex at bengler

Jul 3, 2010, 10:59 AM

Post #3 of 14
Re: Varnish constantly restarting [In reply to]

On Sat, Jul 3, 2010 at 7:01 PM, Flavio Torres <fla_torres [at] yahoo> wrote:
> Hi Alexander,
>
> Have you seen this thread
> http://lists.varnish-cache.org/pipermail/varnish-misc/2010-June/004358.html
> ?

Yes. That thread seems to be about a different problem. In my case I
don't have any panic messages in the log. Curiously, though, the child
is dying with SIGQUIT. Should it not be producing a core dump
somewhere then?

> Also, you are having some failure connections to the backend (1). How
> is your backend health ?

The backend is HAProxy, which is pretty solid. Might be worth looking
into, but I'm not too concerned about a 0.002% failure rate in this
case.
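
The exact percentage depends on what you divide by. As a rough sketch
using the counters from the first post, here is the failure count against
backend connection attempts and against client requests. Treating
backend_conn as successes only (so attempts = backend_conn + backend_fail)
is an assumption based on the counter descriptions, and percent is just an
illustrative helper:

```python
# Counters copied from the varnishstat output in the first post.
backend_fail = 247
backend_conn = 87236
client_req = 4747499


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Failures as a share of all backend connection attempts (assumed to be
# successes plus failures).
fail_vs_backend = percent(backend_fail, backend_conn + backend_fail)

# Failures as a share of all client requests, most of which are cache hits
# and never touch the backend.
fail_vs_client = percent(backend_fail, client_req)

if __name__ == "__main__":
    print(f"{fail_vs_backend:.3f}% of backend connection attempts")
    print(f"{fail_vs_client:.4f}% of client requests")
```

Either way the rate is small, which is the point being made.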

perbu at varnish-software

Jul 3, 2010, 2:04 PM

Post #4 of 14
Re: Varnish constantly restarting [In reply to]

On Sat, Jul 3, 2010 at 7:59 PM, Alexander Staubo <alex [at] bengler> wrote:
> On Sat, Jul 3, 2010 at 7:01 PM, Flavio Torres <fla_torres [at] yahoo> wrote:
>> Hi Alexander,
>>
>> Have you seen this thread
>> http://lists.varnish-cache.org/pipermail/varnish-misc/2010-June/004358.html
>> ?
>
> Yes. That thread seems to be about a different problem.

I don't think so. Try it.

> In my case I don't have any panic messages in the log.

You're sure you're looking in the right log?

> Curiously, though, the child is dying with SIGQUIT.

Are you setting diag_bitmap? That would make the master send QUIT.

--
Per Buer, Varnish Software
Phone: +47 21 98 92 61 / Mobile: +47 958 39 117 / skype: per.buer

alex at bengler

Jul 3, 2010, 2:16 PM

Post #5 of 14
Re: Varnish constantly restarting [In reply to]

On Sat, Jul 3, 2010 at 11:04 PM, Per Buer <perbu [at] varnish-software> wrote:
> On Sat, Jul 3, 2010 at 7:59 PM, Alexander Staubo <alex [at] bengler> wrote:
>> On Sat, Jul 3, 2010 at 7:01 PM, Flavio Torres <fla_torres [at] yahoo> wrote:
>>> Hi Alexander,
>>>
>>> Have you seen this thread
>>> http://lists.varnish-cache.org/pipermail/varnish-misc/2010-June/004358.html
>>> ?
>>
>> Yes. That thread seems to be about a different problem.
>
> I don't think so. Try it.

I do think so:

(1) The child isn't segfaulting -- it's dying with signal 3 (SIGQUIT),
and only after the parent sends SIGKILL.

(2) I have already increased cli_timeout and decreased the maximum
number of threads, and it's not helping.

(3) The number of threads at the time that the child dies is not high.

After switching to malloc and increasing the cli_timeout, Varnish does
stay up for longer periods of time, but it's still dying at least once
a day.

>> In my case I don't have any panic messages in the log.
>
> You're sure you're looking in the right log?

Yes.

>> Curiously, though, the child is dying with SIGQUIT.
>
> Are you setting diag_bitmap? That would make the master send QUIT.

I'm not.

alex at bengler

Jul 3, 2010, 8:35 PM

Post #6 of 14
Re: Varnish constantly restarting [In reply to]

On Sat, Jul 3, 2010 at 11:04 PM, Per Buer <perbu [at] varnish-software> wrote:
> On Sat, Jul 3, 2010 at 7:59 PM, Alexander Staubo <alex [at] bengler> wrote:
>> Curiously, though, the child is dying with SIGQUIT.
>
> Are you setting diag_bitmap? That would make the master send QUIT.

Actually, it looks like Varnish *will* use SIGQUIT ordinarily, as
diag_bitmap seems to be 0 by default. From mgt_child.c (apparently
0x1000 means "do not core-dump child process"):

if (params->diag_bitmap & 0x1000)
    (void)kill(child_pid, SIGKILL);
else
    (void)kill(child_pid, SIGQUIT);

That's an odd choice of default behaviour. I would expect core-dumping
to be something you would explicitly turn on, especially given that
the name of the variable is diag_bitmap.
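
To see what that default means for the child's exit status, here is a
small Python sketch (not Varnish code; kill_child_and_report is a made-up
helper) that forks a sleeping child, sends it a signal, and reads back the
termination signal and the core-dump flag. Whether a core file is actually
written depends on the core size limit (ulimit -c) and the kernel's
core_pattern setting, so the second value will vary between systems:

```python
import os
import signal
import time


def kill_child_and_report(sig: int) -> tuple[int, bool]:
    """Fork a sleeping child, send it `sig`, return (term_signal, core_dumped)."""
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: make sure SIGQUIT has its default action (it may have been
        # inherited as ignored), tell the parent we are ready, then wait.
        try:
            signal.signal(signal.SIGQUIT, signal.SIG_DFL)
            os.write(ready_w, b"x")
            time.sleep(60)
        finally:
            os._exit(0)
    # Parent: wait until the child has set up its signal handling.
    os.close(ready_w)
    os.read(ready_r, 1)
    os.close(ready_r)
    os.kill(pid, sig)
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status), os.WCOREDUMP(status)


if __name__ == "__main__":
    print("SIGQUIT:", kill_child_and_report(signal.SIGQUIT))
    print("SIGKILL:", kill_child_and_report(signal.SIGKILL))
```

With SIGQUIT the reported termination signal is 3, which matches the
"died signal=3" line in the log; with SIGKILL it is 9 and no core dump is
possible.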

perbu at varnish-software

Jul 5, 2010, 12:08 AM

Post #7 of 14
Re: Varnish constantly restarting [In reply to]

On Sat, Jul 3, 2010 at 11:16 PM, Alexander Staubo <alex [at] bengler> wrote:
> On Sat, Jul 3, 2010 at 11:04 PM, Per Buer <perbu [at] varnish-software> wrote:
>> On Sat, Jul 3, 2010 at 7:59 PM, Alexander Staubo <alex [at] bengler> wrote:
>>> On Sat, Jul 3, 2010 at 7:01 PM, Flavio Torres <fla_torres [at] yahoo> wrote:
>>>> Hi Alexander,
>>>>
>>>> Have you seen this thread
>>>> http://lists.varnish-cache.org/pipermail/varnish-misc/2010-June/004358.html
>>>> ?
>>>
>>> Yes. That thread seems to be about a different problem.
>>
>> I don't think so. Try it.
>
> I do think so:

There is only one place where SIGQUIT is used and that is where you're
hitting it. I'd try to disable the whole check by setting the
cli_timeout ridiculously high. If you have proper monitoring the risk
isn't high. You might be having issues with your I/O scheduler (CFQ
can be a real disaster) or a bigger working set than your memory can
handle (swap on Linux doesn't work very well).
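
If you want to check which scheduler a disk is using, Linux exposes it in
sysfs, with the active one shown in square brackets. A minimal Python
sketch follows; the helper names are made up for illustration and the
device name "sda" is an assumption, so substitute your own:

```python
from pathlib import Path
from typing import Optional


def active_scheduler(text: str) -> Optional[str]:
    """Return the scheduler name enclosed in [brackets], or None if absent."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_scheduler(device: str = "sda") -> Optional[str]:
    """Read the active I/O scheduler for a block device from sysfs."""
    path = Path(f"/sys/block/{device}/queue/scheduler")
    return active_scheduler(path.read_text())


if __name__ == "__main__":
    # Sample content in the format the sysfs file uses.
    print(active_scheduler("noop anticipatory deadline [cfq]\n"))
```

Writing another name from that list into the same file as root switches
the scheduler for that device until the next reboot.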

--
Per Buer, Varnish Software
Phone: +47 21 98 92 61 / Mobile: +47 958 39 117 / skype: per.buer

alex at bengler

Jul 5, 2010, 4:46 AM

Post #8 of 14
Re: Varnish constantly restarting [In reply to]

On Mon, Jul 5, 2010 at 9:08 AM, Per Buer <perbu [at] varnish-software> wrote:
> There is only one place where SIGQUIT is used and that is where you're
> hitting it. I'd try to disable the whole check by setting the
> cli_timeout ridiculously high. If you have proper monitoring the risk
> isn't high.

That makes sense. I was slow in catching onto the fact that Varnish
was simply killing its child because it was slow in responding to the
pings; from my point of view it looked more like Varnish was hanging.

> You might be having issues with your io scheduler (cfq
> can be a real disaster) or a bigger working set than your memory can
> handle (swap on Linux doesn't work very well).

How do I go about determining the underlying cause? As far as I can
see, it's not caused by any increase in traffic.

I am tracking every varnishstat metric using Munin, and not seeing
anything out of the ordinary at the instances when Varnish kills
itself -- nor with any of the other system metrics tracked with Munin.
There should be plenty of memory available, and I'm not seeing any
swapping.
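
One thing the counters from my first post do show is that the malloc
store itself is essentially full, which is a separate question from free
system RAM. A quick Python check of the numbers, assuming the "1G" in
"-s malloc,1G" means 1024^3 bytes:

```python
GIB = 1024 ** 3

# Assumed interpretation of "-s malloc,1G" from the command line.
store_limit = 1 * GIB

# Counters copied from the varnishstat output in the first post.
sma_nbytes = 1073656763   # SMA outstanding bytes
sma_balloc = 2560359776   # SMA bytes allocated
sma_bfree = 1486703013    # SMA bytes free
n_lru_nuked = 19046       # objects evicted to make room

# Allocated minus freed should equal what is currently outstanding.
outstanding = sma_balloc - sma_bfree

# How much of the configured store is in use.
fill_percent = 100.0 * sma_nbytes / store_limit

if __name__ == "__main__":
    print(outstanding == sma_nbytes)
    print(f"{fill_percent:.2f}% of the store in use, {n_lru_nuked} objects nuked")
```

A full store with LRU evictions is normal for a busy cache, so on its own
this does not explain the hangs.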

Linux 2.6.24 + CFQ reports weird CPU usage [1] that screws up our
graphs, but that's hardly an indication of anything. However, we
should probably try to switch away from CFQ.

[1] http://grab.by/grabs/af573a30e5be0774edcfffc77294f4d8.png

fla_torres at yahoo

Jul 5, 2010, 8:41 PM

Post #9 of 14
Re: Varnish constantly restarting [In reply to]

On 3/7/2010 14:59, Alexander Staubo wrote:
> Yes. That thread seems to be about a different problem. In my case I
> don't have any panic messages in the log. Curiously, though, the child
> is dying with SIGQUIT.

Alexander,

We had the same problem with version 2.0.6 running on a VMBox; it
was fixed by upgrading to 2.1.2. Are you still on the 2.1 version?

I saw your Munin graph. Does the 200% CPU usage reflect your load? What
is your resource? Did you tune your settings?

Regards,



checker at d6

Jul 5, 2010, 10:10 PM

Post #10 of 14
Re: Varnish constantly restarting [In reply to]

> We had the same problem with the 2.0.6 version

Ugh, this is the version that's installed on my CentOS box that I was
about to start using...are there known problems with this version?

Chris




alex at bengler

Jul 6, 2010, 5:05 AM

Post #11 of 14
Re: Varnish constantly restarting [In reply to]

On Tue, Jul 6, 2010 at 5:41 AM, Flavio Torres <fla_torres [at] yahoo> wrote:
> We had the same problem with the 2.0.6 version running on a VMBox, it
> was fixed upgrading to the 2.1.2. Are you still with the 2.1 version ?

This is 2.0.6 as we are on Ubuntu Hardy, but I ran 2.1.2 from source
for a while and saw the exact same behaviour then.

> I saw your Munin graph, 200% of CPU Usage means your load? What is your
> resource ? Did you tune your settings ?

I included the Munin graph to show how weird it was (look at the idle
percentages), and I think the weirdness is related to the scheduler.
What do you mean by "resource"?

fla_torres at yahoo

Jul 6, 2010, 7:52 AM

Post #12 of 14
Re: Varnish constantly restarting [In reply to]

On 07/06/2010 09:05 AM, Alexander Staubo wrote:
> On Tue, Jul 6, 2010 at 5:41 AM, Flavio Torres
> <fla_torres [at] yahoo> wrote:
>> We had the same problem with the 2.0.6 version running on a
>> VMBox, it was fixed upgrading to the 2.1.2. Are you still with
>> the 2.1 version ?
>
> This is 2.0.6 as we are on Ubuntu Hardy, but I ran 2.1.2 from
> source for a while and saw the exact same behaviour then.
>

It would be good to stay on 2.1.2 and tune it (Varnish settings and the
OS via sysctl) [1].

> What do you mean by "resource"?

CPU, memory, disk. Is your Varnish server a DomU?

[1] http://varnish-cache.org/wiki/Performance,
http://lists.varnish-cache.org/pipermail/varnish-misc/2008-April/001763.html

Regards,


perbu at varnish-software

Jul 7, 2010, 12:22 AM

Post #13 of 14
Re: Varnish constantly restarting [In reply to]

On Tue, Jul 6, 2010 at 7:10 AM, Chris Hecker <checker [at] d6> wrote:
>
>> We had the same problem with the 2.0.6 version
>
> Ugh, this is the version that's installed on my CentOS box that I was about
> to start using...are there known problems with this version?

Not really. 2.0.6 is a very good release still.

--
Per Buer, Varnish Software
Phone: +47 21 98 92 61 / Mobile: +47 958 39 117 / skype: per.buer

alex at bengler

Jul 13, 2010, 8:18 AM

Post #14 of 14
Re: Varnish constantly restarting [In reply to]

On Tue, Jul 6, 2010 at 4:52 PM, Flavio Torres <fla_torres [at] yahoo> wrote:
> CPU, Memory, disk.

2 x quad-core Intel X5570, 8MB L3 cache, 24GB RAM, SCSI disk.

Ubuntu Hardy running Linux 2.6.24-24, 64-bit, CFQ scheduler.

I don't think this is a tuning issue. There are no signs that Varnish
is being overloaded when it hangs; every metric we are tracking is
normal at the time of each incident.

> Is your varnish server a DomU ?

No virtualization.
