
Mailing List Archive: Varnish: Bugs

#1083: Persistent Varnish crashes since using bans and lurker

 

 



varnish-bugs at varnish-cache

Jan 10, 2012, 2:40 AM

#1083: Persistent Varnish crashes since using bans and lurker

#1083: Persistent Varnish crashes since using bans and lurker
-------------------------+--------------------------------------------------
Reporter: rmohrbacher | Type: defect
Status: new | Priority: high
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: major
Keywords: |
-------------------------+--------------------------------------------------
We run a farm of three persistent Varnish instances (-s
persistent,/cms/varnish_cache/persistent/varnish_storage.bin,204800M).

These instances have been running for 3 months without any crashes (not yet
in production, but stressed with several stress tests).

A few days ago we started using bans and the ban lurker (lurker-friendly bans
via: ban("obj.http.x-url ~ " + req.url);).
We have about 250 bans/hour.
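
A side note on the ban expression above: `~` is a regular-expression match, so using req.url as the pattern can ban more objects than the one URL that was requested. The sketch below illustrates this with Python's `re` module standing in for Varnish's regex matching (an assumption; the engines differ in details), and with made-up URLs and hypothetical helper names.

```python
import re
from typing import List


def banned_urls(ban_url: str, cached_urls: List[str]) -> List[str]:
    """Cached URLs hit when ban_url is used as an unanchored regex."""
    pattern = re.compile(ban_url)
    return [u for u in cached_urls if pattern.search(u)]


def banned_urls_exact(ban_url: str, cached_urls: List[str]) -> List[str]:
    """Cached URLs hit when ban_url is escaped and anchored at both ends."""
    pattern = re.compile("^" + re.escape(ban_url) + "$")
    return [u for u in cached_urls if pattern.search(u)]


# Hypothetical cache contents, for illustration only.
cached = ["/news/1", "/news/10", "/archive/news/1", "/news/2"]

print(banned_urls("/news/1", cached))
# ['/news/1', '/news/10', '/archive/news/1']
print(banned_urls_exact("/news/1", cached))
# ['/news/1']
```

If only the exact URL should be invalidated, the `==` operator of the ban language may be the closer fit, though that is worth checking against the documentation for the Varnish version in use.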

Now we have the big problem that the Varnish instances crash after a few hours.
Curious: all three instances crash at the same moment, even though they run on
three different servers!

The following excerpt from syslog suggests that there is a problem with an
invalid ban:


Jan 9 19:40:32 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
died signal=6
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
Panic message: Missing errorhandling code in smp_append_sign(),
storage_persistent_subr.c line 128:#012 Condition((smp_chk_sign(ctx)) ==
0) not true.thread = (cache-worker)#012ident =
Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,epoll#012Backtrace:#012
0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44a346:
/usr/sbin/varnishd(smp_append_sign+0x126) [0x44a346]#012 0x447b6d:
/usr/sbin/varnishd(SMP_NewBan+0x3d) [0x447b6d]#012 0x4125c7:
/usr/sbin/varnishd(BAN_Insert+0x1a7) [0x4125c7]#012 0x433bd5:
/usr/sbin/varnishd(VRT_ban_string+0xc5) [0x433bd5]#012 0x7f91f39fa4be:
./vcl.PNU3fGhs.so(+0x24be) [0x7f91f39fa4be]#012 0x433863:
/usr/sbin/varnishd(VCL_recv_method+0x43) [0x433863]#012 0x417c22:
/usr/sbin/varnishd(CNT_Session+0xb62) [0x417c22]#012 0x42efb8:
/usr/sbin/varnishd() [0x42efb8]#012 0x42e19b: /usr/sbin/varnishd()
[0x42e19b]#012sp = 0x7f91ed4ab008 {#012 fd = 15, id = 15, xid =
683670119,#012 client = 172.27.70.103 36115,#012 step = STP_RECV,#012
handling = deliver,#012 restarts = 0, esi_level = 0#012 flags = #012
bodystatus = 4#012 ws = 0x7f91ed4ab080 { #012 id = "sess",#012
{s,f,r,e} = {0x7f91ed4abc90,+56,(nil),+65536},#012 },#012 http[req] =
{#012 ws = 0x7f91ed4ab080[sess]#012 "PURGE",#012
"105867846",#012 "HTTP/1.0",#012 },#012 worker = 0x7f91ef1faa80
{#012 ws = 0x7f91ef1facc0 { #012 id = "wrk",#012 {s,f,r,e} =
{0x7f91ef1e8a30,+32,(nil),+65536},#012 },#012 },#012 vcl = {#012
srcname = {#012 "input",#012 "Default",#012 },#012
},#012},#012
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: child (6907)
Started
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Pushing vcls
failed:#012CLI communication error (hdr)
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
died signal=6
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
Panic message: Assert error in smp_open(), storage_persistent.c line
320:#012 Condition((smp_valid_silo(sc)) == 0) not true.#012thread =
(cache-main)#012ident =
Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,no_waiter#012Backtrace:#012
0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44756a:
/usr/sbin/varnishd() [0x44756a]#012 0x444d57:
/usr/sbin/varnishd(STV_open+0x27) [0x444d57]#012 0x42b525:
/usr/sbin/varnishd(child_main+0xc5) [0x42b525]#012 0x43d5ec:
/usr/sbin/varnishd() [0x43d5ec]#012 0x43de7c: /usr/sbin/varnishd()
[0x43de7c]#012 0x7f92015684c7: /usr/lib64/varnish/libvarnish.so(+0x94c7)
[0x7f92015684c7]#012 0x7f9201568b58:
/usr/lib64/varnish/libvarnish.so(vev_schedule+0x88) [0x7f9201568b58]#012
0x43d7c2: /usr/sbin/varnishd(MGT_Run+0x132) [0x43d7c2]#012 0x44cacb:
/usr/sbin/varnishd(main+0xd1b) [0x44cacb]#012
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said Child starts
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said CHK(0x7f91ffd26120 BAN 1 0x7f34723f4000 BAN 1) = 4
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Oct 29, 2012, 4:17 AM

Re: #1083: Persistent Varnish crashes since using bans and lurker

#1083: Persistent Varnish crashes since using bans and lurker
-------------------------+---------------------
Reporter: rmohrbacher | Owner: martin
Type: defect | Status: new
Priority: high | Milestone:
Component: varnishd | Version: 3.0.2
Severity: major | Resolution:
Keywords: |
-------------------------+---------------------
Changes (by phk):

* owner: => martin


--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:1>


varnish-bugs at varnish-cache

Oct 29, 2012, 4:18 AM

Re: #1083: Persistent Varnish crashes since using bans and lurker

Description changed by tfheen:

New description:

We run a farm of three persistent Varnish instances (-s
persistent,/cms/varnish_cache/persistent/varnish_storage.bin,204800M).

These instances have been running for 3 months without any crashes (not yet
in production, but stressed with several stress tests).

A few days ago we started using bans and the ban lurker (lurker-friendly bans
via: ban("obj.http.x-url ~ " + req.url);).
We have about 250 bans/hour.

Now we have the big problem that the Varnish instances crash after a few hours.
Curious: all three instances crash at the same moment, even though they run on
three different servers!

The following excerpt from syslog suggests that there is a problem with an
invalid ban:

{{{
Jan 9 19:40:32 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
died signal=6
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
Panic message: Missing errorhandling code in smp_append_sign(),
storage_persistent_subr.c line 128:#012 Condition((smp_chk_sign(ctx)) ==
0) not true.thread = (cache-worker)#012ident =
Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,epoll#012Backtrace:#012
0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44a346:
/usr/sbin/varnishd(smp_append_sign+0x126) [0x44a346]#012 0x447b6d:
/usr/sbin/varnishd(SMP_NewBan+0x3d) [0x447b6d]#012 0x4125c7:
/usr/sbin/varnishd(BAN_Insert+0x1a7) [0x4125c7]#012 0x433bd5:
/usr/sbin/varnishd(VRT_ban_string+0xc5) [0x433bd5]#012 0x7f91f39fa4be:
./vcl.PNU3fGhs.so(+0x24be) [0x7f91f39fa4be]#012 0x433863:
/usr/sbin/varnishd(VCL_recv_method+0x43) [0x433863]#012 0x417c22:
/usr/sbin/varnishd(CNT_Session+0xb62) [0x417c22]#012 0x42efb8:
/usr/sbin/varnishd() [0x42efb8]#012 0x42e19b: /usr/sbin/varnishd()
[0x42e19b]#012sp = 0x7f91ed4ab008 {#012 fd = 15, id = 15, xid =
683670119,#012 client = 172.27.70.103 36115,#012 step = STP_RECV,#012
handling = deliver,#012 restarts = 0, esi_level = 0#012 flags = #012
bodystatus = 4#012 ws = 0x7f91ed4ab080 { #012 id = "sess",#012
{s,f,r,e} = {0x7f91ed4abc90,+56,(nil),+65536},#012 },#012 http[req] =
{#012 ws = 0x7f91ed4ab080[sess]#012 "PURGE",#012
"105867846",#012 "HTTP/1.0",#012 },#012 worker = 0x7f91ef1faa80
{#012 ws = 0x7f91ef1facc0 { #012 id = "wrk",#012 {s,f,r,e} =
{0x7f91ef1e8a30,+32,(nil),+65536},#012 },#012 },#012 vcl = {#012
srcname = {#012 "input",#012 "Default",#012 },#012
},#012},#012
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: child (6907)
Started
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Pushing vcls
failed:#012CLI communication error (hdr)
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
died signal=6
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
Panic message: Assert error in smp_open(), storage_persistent.c line
320:#012 Condition((smp_valid_silo(sc)) == 0) not true.#012thread =
(cache-main)#012ident =
Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,no_waiter#012Backtrace:#012
0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44756a:
/usr/sbin/varnishd() [0x44756a]#012 0x444d57:
/usr/sbin/varnishd(STV_open+0x27) [0x444d57]#012 0x42b525:
/usr/sbin/varnishd(child_main+0xc5) [0x42b525]#012 0x43d5ec:
/usr/sbin/varnishd() [0x43d5ec]#012 0x43de7c: /usr/sbin/varnishd()
[0x43de7c]#012 0x7f92015684c7: /usr/lib64/varnish/libvarnish.so(+0x94c7)
[0x7f92015684c7]#012 0x7f9201568b58:
/usr/lib64/varnish/libvarnish.so(vev_schedule+0x88) [0x7f9201568b58]#012
0x43d7c2: /usr/sbin/varnishd(MGT_Run+0x132) [0x43d7c2]#012 0x44cacb:
/usr/sbin/varnishd(main+0xd1b) [0x44cacb]#012
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said Child starts
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said CHK(0x7f91ffd26120 BAN 1 0x7f34723f4000 BAN 1) = 4
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
}}}
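
The panic text above is hard to read because the syslog daemon has escaped each newline as `#012` (octal 012 is the newline character) and the mail archive has re-wrapped the lines. A minimal sketch for making such an excerpt readable again, assuming the excerpt is available as a list of wrapped lines; the helper names are hypothetical and not part of any Varnish tool:

```python
import re

# '#' followed by exactly three octal digits, e.g. '#012'.
_ESCAPE = re.compile(r"#([0-7]{3})")


def expand_syslog_escapes(text: str) -> str:
    """Turn '#ooo' octal escapes for control characters back into characters."""
    def _sub(match):
        code = int(match.group(1), 8)
        # Only undo escapes for control characters (< 32); a literal
        # '#105' in the text is left untouched.
        return chr(code) if code < 32 else match.group(0)
    return _ESCAPE.sub(_sub, text)


def unwrap_and_expand(lines):
    """Rejoin archive-wrapped lines with a space, then expand the escapes."""
    joined = " ".join(line.strip() for line in lines)
    return expand_syslog_escapes(joined)


# Three wrapped lines copied from the second panic message above.
sample = [
    "Panic message: Assert error in smp_open(), storage_persistent.c line",
    "320:#012 Condition((smp_valid_silo(sc)) == 0) not true.#012thread =",
    "(cache-main)#012ident =",
]
print(unwrap_and_expand(sample))
```

Rejoining with a single space is a guess about where the archive wrapped the text, so a token that was split in the middle may come out with a stray space.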

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:2>


varnish-bugs at varnish-cache

Apr 14, 2013, 3:30 AM

Re: #1083: Persistent Varnish crashes since using bans and lurker

Comment (by numard):

I can confirm this happened on 3.0.2-1~1lucid1 (once every ~8 hours). I
upgraded to 3.0.3-1~precise, and it still happens, but so far it seems
to happen less often (every ~18 hours).

We have 2 servers with a usage pattern similar to @rmohrbacher's:
- file storage
- no issues for a long time
- we started pushing a lot more bans, and the issues started to happen.

Varnish (3.0.3-1~precise package from http://repo.varnish-cache.org/ubuntu/,
Ubuntu Precise 12.04 LTS) is acting as a cache for S3
objects. It runs as:
{{{
/usr/sbin/varnishd -P /var/run/varnishd.pid -a :80 -p thread_pool_min 200
-p thread_pool_max 4000 -p thread_pool_add_delay 2 -p http_req_hdr_len
10240 -p http_req_size 65536 -p first_byte_timeout 300 -T localhost:6082
-f /etc/varnish/default.vcl -S /etc/varnish/secret -s
persistent,/mnt/varnish_store,360G
}}}
Running on AWS, m1.medium, with no apparent constraints on memory, CPU,
or I/O.

When the child process dies, panic.show shows:


{{{
varnish> panic.show
200
Last panic at: Sun, 14 Apr 2013 09:57:34 GMT
Missing errorhandling code in smp_append_sign(), storage_persistent_subr.c
line 128:
Condition((smp_chk_sign(ctx)) == 0) not true.thread = (cache-worker)
ident =
Linux,3.2.0-40-virtual,x86_64,-spersistent,-smalloc,-hcritbit,epoll
Backtrace:
0x4310e5: /usr/sbin/varnishd() [0x4310e5]
0x4514d8: /usr/sbin/varnishd(smp_append_sign+0x128) [0x4514d8]
0x44f1da: /usr/sbin/varnishd(SMP_NewBan+0x3a) [0x44f1da]
0x4158d2: /usr/sbin/varnishd(BAN_Insert+0x1a2) [0x4158d2]
0x439fa8: /usr/sbin/varnishd(VRT_ban_string+0xb8) [0x439fa8]
0x7f6391ef60c7: ./vcl.LQXRTnfB.so(+0x20c7) [0x7f6391ef60c7]
0x437f48: /usr/sbin/varnishd(VCL_recv_method+0x48) [0x437f48]
0x41946b: /usr/sbin/varnishd(CNT_Session+0xf2b) [0x41946b]
0x432ee5: /usr/sbin/varnishd() [0x432ee5]
0x7fbd9bb5de9a: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)
[0x7fbd9bb5de9a]
sp = 0x7f62c8cda008 {
fd = 12, id = 12, xid = 1800342971,
client = 10.32.37.110 49187,
step = STP_RECV,
handling = deliver,
restarts = 0, esi_level = 0
flags =
bodystatus = 4
ws = 0x7f62c8cda080 {
id = "sess",
{s,f,r,e} = {0x7f62c8cdac78,+168,(nil),+65536},
},
http[req] = {
ws = 0x7f62c8cda080[sess]
"BAN",
"/xxxxs3bucketxxxx/path1/key2/key3",
"HTTP/1.1",
"Accept: */*",
"host: s3.amazonaws.com",
},
worker = 0x7f632d629ac0 {
ws = 0x7f632d629cf8 {
id = "wrk",
{s,f,r,e} = {0x7f632d617a50,+56,(nil),+65536},
},
},
vcl = {
srcname = {
"input",
"Default",
},
},
},

}}}

-----

Both servers receive each ban request (they are behind load balancers
that pick the Varnish server non-deterministically), but the URLs shown
in the panic dumps are different (though of the same 'type'; if it
matters I can show examples).

I'm willing to test a patch on production ASAP if one exists...

Cheers,
Beto

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:3>


varnish-bugs at varnish-cache

Apr 14, 2013, 4:09 AM

Re: #1083: Persistent Varnish crashes since using bans and lurker

Comment (by numard):

When using varnish 3.0.2, on each crash, I had to delete the persistent
file storage - if not deleted, the child process would fail pretty much
right away on start.

3.0.3 has only failed me once so far; restarting without deleting the
persistent file storage worked this time...

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:4>


varnish-bugs at varnish-cache

Apr 14, 2013, 10:38 PM

Re: #1083: Persistent Varnish crashes since using bans and lurker

Comment (by numard):

Another crash. panic.show gives less information this time, but it points to
a similar piece of code.

{{{
varnish> panic.show
200
Last panic at: Mon, 15 Apr 2013 05:29:32 GMT
Missing errorhandling code in smp_append_sign(), storage_persistent_subr.c
line 128:
Condition((smp_chk_sign(ctx)) == 0) not true.errno = 22 (Invalid
argument)
thread = (cache-main)
ident =
Linux,3.2.0-40-virtual,x86_64,-spersistent,-smalloc,-hcritbit,no_waiter
Backtrace:
0x4310e5: /usr/sbin/varnishd() [0x4310e5]
0x4514d8: /usr/sbin/varnishd(smp_append_sign+0x128) [0x4514d8]
0x44f1da: /usr/sbin/varnishd(SMP_NewBan+0x3a) [0x44f1da]
0x416446: /usr/sbin/varnishd(BAN_Compile+0x66) [0x416446]
0x42fe0a: /usr/sbin/varnishd(child_main+0xca) [0x42fe0a]
0x443767: /usr/sbin/varnishd() [0x443767]
0x443caf: /usr/sbin/varnishd() [0x443caf]
0x7f7ceb8bac82: /usr/lib/varnish/libvarnish.so(+0x9c82) [0x7f7ceb8bac82]
0x7f7ceb8bb348: /usr/lib/varnish/libvarnish.so(vev_schedule+0x98)
[0x7f7ceb8bb348]
0x443f87: /usr/sbin/varnishd(MGT_Run+0x137) [0x443f87]



varnish>
}}}

Within minutes of the first server dying, the 2nd server crashed as well,
which suggests we are hitting a limit (or a leak...).

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:5>


varnish-bugs at varnish-cache

Apr 16, 2013, 6:28 AM

Re: #1083: Persistent Varnish crashes since using bans and lurker

Comment (by martin):

This is because in stock Varnish persistence, the persisted ban space is a
fixed size that cannot be reclaimed. When the space is exhausted, the silo
becomes unusable. So in stock Varnish persistence, it is not advisable to
rely on bans for cache invalidation.
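
To make the failure mode concrete, here is a toy model of an append-only ban area of fixed size; it is a sketch, not the Varnish implementation. The record layout (4 + 8 + 4 bytes of overhead plus the ban text) is an assumption loosely based on the `4 + sizeof t + 4 + len < left` assertion quoted elsewhere in this thread, with `sizeof t` assumed to be 8. The 1 MiB capacity is only a guess taken from the distance between the two BAN area addresses in the CHK lines of the original report (0x7f34724f4000 - 0x7f34723f4000 = 0x100000). The class and function names are hypothetical.

```python
class BanSpaceFull(Exception):
    """Raised when a ban record no longer fits in the fixed-size area."""


class FixedBanLog:
    # Assumed per-record overhead: 4 + sizeof(t) + 4, with sizeof(t) == 8.
    OVERHEAD = 4 + 8 + 4

    def __init__(self, capacity: int):
        self.capacity = capacity  # total bytes available, never reclaimed
        self.used = 0             # bytes consumed so far
        self.records = 0          # number of bans appended

    def append(self, ban_spec: str) -> None:
        need = self.OVERHEAD + len(ban_spec.encode())
        left = self.capacity - self.used
        # Same shape as the quoted assertion: the record must fit strictly.
        if not need < left:
            raise BanSpaceFull(f"need {need} bytes, only {left} left")
        self.used += need
        self.records += 1


def bans_until_full(capacity: int, ban_spec: str) -> int:
    """Count how many identical bans fit before the area is exhausted."""
    log = FixedBanLog(capacity)
    try:
        while True:
            log.append(ban_spec)
    except BanSpaceFull:
        return log.records


# Assumed numbers: 1 MiB area, a ban text of typical length, 250 bans/hour.
count = bans_until_full(1024 * 1024, "obj.http.x-url ~ /some/path/105867846")
print(count, "bans fit; at 250 bans/hour that is about", round(count / 250, 1), "hours")
```

The point of the model is only that space is consumed on every ban and never given back, so a steady ban rate exhausts the area after a predictable time regardless of how quickly the bans themselves become obsolete.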

This is planned to be fixed in Varnish release 4, which will contain fixes in
this area. Leaving the ticket open until the necessary bits have been fully
merged.

(The -plus branch contains a preview of fixes that correct this behavior.
While this is open source, there is no community-driven support available:
https://github.com/mbgrydeland/varnish-cache/tree/3.0.3-plus.)

Regards,
Martin Blix Grydeland

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:6>


varnish-bugs at varnish-cache

Apr 16, 2013, 8:05 AM

Re: #1083: Persistent Varnish crashes since using bans and lurker

Comment (by numard):

Understood, thanks for the (public) update :) .

If I understand this correctly, PURGE requests don't suffer from the fixed
'list size' problem, but they happen synchronously with the purge request?
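
For what it's worth, this is how I picture the difference between the two mechanisms. It is a toy model under my own assumptions, not Varnish code: a purge drops the object immediately and leaves no record behind, while a ban is appended to a list and only applied to older objects when they are next looked up. The class and method names are made up for illustration.

```python
import itertools
import re


class ToyCache:
    def __init__(self):
        self.objects = {}  # url -> (body, stored_at)
        self.bans = []     # list of (added_at, compiled regex)
        self._clock = itertools.count(1)  # logical time, one tick per event

    def store(self, url, body):
        self.objects[url] = (body, next(self._clock))

    def purge(self, url):
        """Synchronous: the object is gone when this returns, nothing is logged."""
        self.objects.pop(url, None)

    def ban(self, url_regex):
        """Deferred: only a list entry is added; objects stay until checked."""
        self.bans.append((next(self._clock), re.compile(url_regex)))

    def lookup(self, url):
        entry = self.objects.get(url)
        if entry is None:
            return None
        body, stored_at = entry
        for added_at, pattern in self.bans:
            # A ban only applies to objects stored before it was added.
            if added_at > stored_at and pattern.search(url):
                del self.objects[url]
                return None
        return body


cache = ToyCache()
cache.store("/a", "A")
cache.store("/b", "B")

cache.purge("/a")                    # removed right away, ban list untouched
print("/a" in cache.objects, len(cache.bans))

cache.ban("^/b$")                    # object still stored, ban list grows
print("/b" in cache.objects, len(cache.bans))
print(cache.lookup("/b"))            # the ban is applied on lookup
```

In this picture only the ban path adds entries that need to be kept somewhere, which is where a fixed-size persisted ban area would come into play.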

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:7>


varnish-bugs at varnish-cache

Apr 23, 2013, 8:11 AM

Re: #1083: Persistent Varnish crashes since using bans and lurker

Comment (by numard):

Hi Martin,
I've had this in production for a few days, with not great results I'm afraid.
I tried the 3.0.3-plus branch, as well as 'persistent'. Crashes still
happen, though the panic output is not exactly the same (as expected).

From the persistent branch, the panic is:


{{{
Last panic at: Mon, 22 Apr 2013 01:40:53 GMT
Assert error in smp_appendban(), storage/storage_persistent.c line 98:
Condition(4 + sizeof t + 4 + len < left) not true.
thread = (cache-main)
ident =
Linux,3.2.0-40-virtual,x86_64,-spersistent,-smalloc,-hcritbit,epoll
Backtrace:
0x4331d5: pan_ic+d5
0x455f05: smp_appendban+e5
0x45603e: smp_newban+4e
0x451ba9: STV_NewBan+29
0x4157d3: BAN_Compile+43
0x431e0a: child_main+10a
0x4491a2: start_child+962
0x4496df: mgt_sigchld+51f
0x7f2c671874d2: _end+7f2c66afbcea
0x7f2c67187b98: _end+7f2c66afc3b0
}}}

From 3.0.3-plus, almost identical to the previous one:

{{{
Assert error in smp_appendban(), storage_persistent.c line 97:
Condition(4 + sizeof t + 4 + len < left) not true.
thread = (cache-main)
ident =
Linux,3.2.0-40-virtual,x86_64,-spersistent,-smalloc,-hcritbit,no_waiter
Backtrace:
0x432b75: pan_ic+d5
0x451465: smp_appendban+e5
0x45159e: smp_newban+4e
0x44d969: STV_NewBan+29
0x416853: BAN_Compile+43
0x43187a: child_main+ca
0x445ed7: start_child+8c7
0x44641f: mgt_sigchld+51f
0x7fc825ea5312: _end+7fc82581ff62
0x7fc825ea59d8: _end+7fc825820628
}}}

I'm building on an up-to-date Precise 12.04 LTS, with nothing special:

{{{
./autogen.sh
./configure --exec-prefix=/usr
make
make install
ldconfig

}}}

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:8>


varnish-bugs at varnish-cache

Jul 15, 2013, 3:24 AM

Re: #1083: Persistent Varnish crashes since using bans and lurker

#1083: Persistent Varnish crashes since using bans and lurker
-------------------------+------------------------
Reporter: rmohrbacher | Owner: martin
Type: defect | Status: closed
Priority: high | Milestone:
Component: varnishd | Version: 3.0.2
Severity: major | Resolution: duplicate
Keywords: |
-------------------------+------------------------
Changes (by phk):

* status: new => closed
* resolution: => duplicate


Comment:

Linking this to our Catch-All persistent ticket #1053

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:9>