
varnish-bugs at varnish-cache
Aug 5, 2013, 3:02 AM
#1331: Varnish coredump every day
-------------------------+--------------------
 Reporter:  jinjian.1@…  |       Owner:
     Type:  defect       |      Status:  new
 Priority:  high         |   Milestone:
Component:  varnishd     |     Version:  3.0.3
 Severity:  critical     |  Resolution:
 Keywords:  coredump     |
-------------------------+--------------------
Description changed by tfheen:

Old description:

> we encountered varnish coredump issue everyday in this week. My version
> is 3.0.3
>
> From var/log/messages:
>
> Aug 2 07:50:26 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:50:36 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:50:47 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: {client} Connection closed (in data)
> Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: ipaddress :10.36.1.238 accept!
> Aug 2 07:50:57 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:51:02 ip-10-36-1-238 stud[10104]: {backend} Connection reset by peer
> Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
> Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) died signal=3 (core dumped)
> Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: child (20041) Started
> Aug 2 07:51:04 ip-10-36-1-238 varnishd[28776]: Child (20041) said Child starts
>
> from coredump:
>
> (gdb) bt
> #0 0x00007fdce4b41054 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1 0x00007fdce4b3c388 in _L_lock_854 () from /lib64/libpthread.so.0
> #2 0x00007fdce4b3c257 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3 0x0000000000434350 in vsl_get ()
> #4 0x0000000000434508 in VSLR ()
> #5 0x00000000004346d2 in VSL ()
> #6 0x00007fdce66d2d95 in cls_vlu2 (priv=0x7fdce3d42780, av=0x7fd96e85b500) at cli_serve.c:292
> #7 0x00007fdce66d347b in cls_vlu (priv=0x7fdce3d42780, p=0x2 <Address 0x2 out of bounds>) at cli_serve.c:339
> #8 0x00007fdce66d6e09 in LineUpProcess (l=0x7fdce3d1d730) at vlu.c:154
> #9 0x00007fdce66d3e7d in VCLS_Poll (cs=0x7fdce3d03290, timeout=<value optimized out>) at cli_serve.c:528
> #10 0x000000000041aa41 in CLI_Run ()
> #11 0x000000000042ea01 in child_main ()
> #12 0x000000000044155c in start_child ()
> #13 0x0000000000441ee8 in MGT_Run ()
> #14 0x000000000045037f in main ()
>
> Our system is down for almost 1 minute during the recover process.
>
> The issue is very similar with
> https://www.varnish-cache.org/trac/ticket/516 and
> https://www.varnish-cache.org/trac/ticket/1054. But i could not find any
> solution there. Do anybody could put some lights on it?

New description:

 we encountered varnish coredump issue everyday in this week. My version
 is 3.0.3

 From var/log/messages:

 {{{
 Aug 2 07:50:26 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
 Aug 2 07:50:36 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
 Aug 2 07:50:47 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
 Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: {client} Connection closed (in data)
 Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: ipaddress :10.36.1.238 accept!
 Aug 2 07:50:57 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
 Aug 2 07:51:02 ip-10-36-1-238 stud[10104]: {backend} Connection reset by peer
 Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
 Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
 Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) died signal=3 (core dumped)
 Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: child (20041) Started
 Aug 2 07:51:04 ip-10-36-1-238 varnishd[28776]: Child (20041) said Child starts
 }}}

 from coredump:

 {{{
 (gdb) bt
 #0 0x00007fdce4b41054 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1 0x00007fdce4b3c388 in _L_lock_854 () from /lib64/libpthread.so.0
 #2 0x00007fdce4b3c257 in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3 0x0000000000434350 in vsl_get ()
 #4 0x0000000000434508 in VSLR ()
 #5 0x00000000004346d2 in VSL ()
 #6 0x00007fdce66d2d95 in cls_vlu2 (priv=0x7fdce3d42780, av=0x7fd96e85b500) at cli_serve.c:292
 #7 0x00007fdce66d347b in cls_vlu (priv=0x7fdce3d42780, p=0x2 <Address 0x2 out of bounds>) at cli_serve.c:339
 #8 0x00007fdce66d6e09 in LineUpProcess (l=0x7fdce3d1d730) at vlu.c:154
 #9 0x00007fdce66d3e7d in VCLS_Poll (cs=0x7fdce3d03290, timeout=<value optimized out>) at cli_serve.c:528
 #10 0x000000000041aa41 in CLI_Run ()
 #11 0x000000000042ea01 in child_main ()
 #12 0x000000000044155c in start_child ()
 #13 0x0000000000441ee8 in MGT_Run ()
 #14 0x000000000045037f in main ()
 }}}

 Our system is down for almost 1 minute during the recover process.

 The issue is very similar with
 https://www.varnish-cache.org/trac/ticket/516 and
 https://www.varnish-cache.org/trac/ticket/1054. But i could not find any
 solution there. Do anybody could put some lights on it?
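
For anyone triaging a similar report, here is a minimal sketch (not part of the original ticket) that measures the stall from the syslog excerpt above: how long the manager kept warning before the child died, and how long the new child took to come up. The names `LOG`, `LINE_RE` and `summarize` are made up for illustration. The year 2013 is an assumption taken from the post date, since syslog timestamps carry no year, and month parsing assumes an English/C locale.

```python
import re
from datetime import datetime

# varnishd/stud lines copied from the ticket's /var/log/messages excerpt.
LOG = """\
Aug 2 07:50:26 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:50:36 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:50:47 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: {client} Connection closed (in data)
Aug 2 07:50:53 ip-10-36-1-238 stud[10104]: ipaddress :10.36.1.238 accept!
Aug 2 07:50:57 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:51:02 ip-10-36-1-238 stud[10104]: {backend} Connection reset by peer
Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) not responding to CLI, killing it.
Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: Child (28777) died signal=3 (core dumped)
Aug 2 07:51:02 ip-10-36-1-238 varnishd[28776]: child (20041) Started
Aug 2 07:51:04 ip-10-36-1-238 varnishd[28776]: Child (20041) said Child starts
"""

# Matches "Aug 2 07:50:26 <host> varnishd[<pid>]: <message>"; stud lines are skipped.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ varnishd\[\d+\]: (?P<msg>.*)$"
)


def summarize(log, year=2013):
    """Count CLI-timeout warnings and time the stall and the restart.

    `year` is an assumption: syslog timestamps do not include one.
    """
    warnings = []
    died = None
    restarted = None
    for line in log.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        msg = m["msg"]
        if "not responding to CLI" in msg:
            warnings.append(when)
        elif "died" in msg:
            died = when
        elif "said Child starts" in msg:
            restarted = when
    return {
        "warnings": len(warnings),
        # First logged warning until the manager reports the child dead.
        "stalled_seconds": (died - warnings[0]).total_seconds(),
        # Child death until the replacement child reports it has started.
        "restart_seconds": (restarted - died).total_seconds(),
    }


summary = summarize(LOG)
print(summary)
```

On this excerpt the sketch counts 6 warnings, 36 seconds between the first warning and the child's death, and 2 more seconds until the new child starts. The child may have been stuck before the first logged warning, so this is a lower bound on the outage rather than a contradiction of the "almost 1 minute" reported above.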
--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1331#comment:1>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs