Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Varnish: Bugs

#1054: Child not responding to CLI, killing it

 

 

Varnish bugs RSS feed   Index | Next | Previous | View Threaded


varnish-bugs at varnish-cache

Nov 8, 2011, 10:54 PM

Post #1 of 24 (1908 views)
Permalink
#1054: Child not responding to CLI, killing it

#1054: Child not responding to CLI, killing it
----------------------+-----------------------------------------------------
Reporter: scorillo | Type: defect
Status: new | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Keywords: |
----------------------+-----------------------------------------------------
Every few days, sometimes several times per day, our cache gets emptied
because varnishd restarts it's childs.

{{{
[root [at] varnis ~]# cat /var/log/messages
Nov 6 03:46:02 varnish kernel: imklog 4.6.2, log source = /proc/kmsg
started.
Nov 6 03:46:02 varnish rsyslogd: [origin software="rsyslogd"
swVersion="4.6.2" x-pid="1173" x-info="http://www.rsyslog.com"] (re)start
Nov 8 20:44:21 varnish varnishd[6719]: Child (31124) not responding to
CLI, killing it.
Nov 8 20:44:23 varnish abrt[16392]: file /usr/sbin/varnishd seems to be
deleted
Nov 8 20:44:23 varnish varnishd[6719]: Child (31124) not responding to
CLI, killing it.
Nov 8 20:44:23 varnish varnishd[6719]: Child (31124) died signal=3 (core
dumped)
Nov 8 20:44:23 varnish varnishd[6719]: child (16393) Started
Nov 8 20:44:24 varnish varnishd[6719]: Child (16393) said Child starts
Nov 8 20:45:28 varnish varnishd[6719]: Child (16393) not responding to
CLI, killing it.
Nov 8 20:45:29 varnish abrt[16861]: file /usr/sbin/varnishd seems to be
deleted
Nov 8 20:45:29 varnish varnishd[6719]: Child (16393) not responding to
CLI, killing it.
Nov 8 20:45:29 varnish varnishd[6719]: Child (16393) died signal=3 (core
dumped)
Nov 8 20:45:29 varnish varnishd[6719]: child (16862) Started
Nov 8 20:45:29 varnish varnishd[6719]: Child (16862) said Child starts
Nov 8 22:53:15 varnish varnishd[6719]: Child (16862) not responding to
CLI, killing it.
Nov 8 22:53:18 varnish abrt[28668]: file /usr/sbin/varnishd seems to be
deleted
Nov 8 22:53:18 varnish varnishd[6719]: Child (16862) not responding to
CLI, killing it.
Nov 8 22:53:18 varnish varnishd[6719]: Child (16862) died signal=3 (core
dumped)
Nov 8 22:53:18 varnish varnishd[6719]: child (28669) Started
Nov 8 22:53:18 varnish varnishd[6719]: Child (28669) said Child starts
Nov 9 01:22:21 varnish varnishd[6719]: Child (28669) not responding to
CLI, killing it.
Nov 9 01:22:25 varnish abrt[10394]: file /usr/sbin/varnishd seems to be
deleted
Nov 9 01:22:25 varnish varnishd[6719]: Child (28669) not responding to
CLI, killing it.
Nov 9 01:22:25 varnish varnishd[6719]: Child (28669) not responding to
CLI, killing it.
Nov 9 01:22:25 varnish varnishd[6719]: Child (28669) died signal=3 (core
dumped)
Nov 9 01:22:25 varnish varnishd[6719]: child (10395) Started
Nov 9 01:22:25 varnish varnishd[6719]: Child (10395) said Child starts
[root [at] varnis ~]# ls -al /usr/sbin/varnishd
-rwxr-xr-x 1 root root 489880 Oct 24 10:52 /usr/sbin/varnishd
[root [at] varnis ~]# rpm -qa | grep varnish
varnish-libs-3.0.2-1.el5.x86_64
varnish-release-3.0-1.noarch
varnish-3.0.2-1.el5.x86_64
[root [at] varnis ~]# uname -a
Linux varnish 2.6.32-71.29.1.el6.x86_64 #1 SMP Mon Jun 27 19:49:27 BST
2011 x86_64 x86_64 x86_64 GNU/Linux
[root [at] varnis ~]# cat /etc/redhat-release
CentOS Linux release 6.0 (Final)
}}}

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Nov 10, 2011, 2:25 AM

Post #2 of 24 (1857 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
----------------------+-----------------------------------------------------
Reporter: scorillo | Type: defect
Status: new | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Keywords: |
----------------------+-----------------------------------------------------

Comment(by kristian):

This is, as the log says, because the child is not responding to the
manager.

Please post the parameters and/or startup arguments for Varnish, and
describe the storage you use. Varnishstat -1 would also be useful, as
would any information on load (both request-wise and cpu/IO-wise) be.

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:1>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Nov 10, 2011, 4:32 AM

Post #3 of 24 (1863 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
----------------------+-----------------------------------------------------
Reporter: scorillo | Type: defect
Status: new | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Keywords: |
----------------------+-----------------------------------------------------

Comment(by scorillo):

This is the only varnish3 instance that we have and the only instance of
varnish running on Centos6.

We have 9 other instances of varnish2 (2.1.2) running on Centos5.4, using
the same parameters.
These varnish2 instances are working fine (uptime 36824261
1.00 Client uptime).

May the 'n_wrk_max' be a problem?
n_wrk_max 301076 108.22 N worker threads limited

On varnish2 instances, this counter is zero.


The requested information follows:

/usr/sbin/varnishd -P /var/run/varnish.pid -a :80 -f
/etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -w 200,4000,120 -u
varnish -g varnish -S /etc/varnish/secret -s malloc,2G -h classic,500009
-p lru_interval 20, -p listen_depth 4096, -p sess_workspace 262144



client_conn 3618 1.30 Client connections accepted
client_drop 0 0.00 Connection dropped, no
sess/wrk
client_req 3618 1.30 Client requests received
cache_hit 1774 0.64 Cache hits
cache_hitpass 0 0.00 Cache hits for pass
cache_miss 1844 0.66 Cache misses
backend_conn 1840 0.66 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 4 0.00 Backend conn. failures
backend_reuse 0 0.00 Backend conn. reuses
backend_toolate 0 0.00 Backend conn. was closed
backend_recycle 0 0.00 Backend conn. recycles
backend_retry 0 0.00 Backend conn. retry
fetch_head 0 0.00 Fetch head
fetch_length 1840 0.66 Fetch with Length
fetch_chunked 0 0.00 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 0 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
fetch_1xx 0 0.00 Fetch no body (1xx)
fetch_204 0 0.00 Fetch no body (204)
fetch_304 0 0.00 Fetch no body (304)
n_sess_mem 25 . N struct sess_mem
n_sess 0 . N struct sess
n_object 1823 . N struct object
n_vampireobject 0 . N unresurrected objects
n_objectcore 1838 . N struct objectcore
n_objecthead 1838 . N struct objecthead
n_waitinglist 22 . N struct waitinglist
n_vbc 0 . N struct vbc
n_wrk 60 . N worker threads
n_wrk_create 71 0.03 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 301076 108.22 N worker threads limited
n_wrk_lqueue 0 0.00 work request queue length
n_wrk_queued 0 0.00 N queued work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 2 . N backends
n_expired 17 . N expired objects
n_lru_nuked 0 . N LRU nuked objects
n_lru_moved 1571 . N LRU moved objects
losthdr 0 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 3152 1.13 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 3618 1.30 Total Sessions
s_req 3618 1.30 Total Requests
s_pipe 0 0.00 Total pipe
s_pass 0 0.00 Total pass
s_fetch 1840 0.66 Total fetch
s_hdrbytes 1271298 456.97 Total header bytes
s_bodybytes 17115442 6152.21 Total body bytes
sess_closed 3599 1.29 Session Closed
sess_pipeline 0 0.00 Session Pipeline
sess_readahead 0 0.00 Session Read Ahead
sess_linger 34 0.01 Session Linger
sess_herd 19 0.01 Session herd
shm_records 237082 85.22 SHM records
shm_writes 20696 7.44 SHM writes
shm_flushes 0 0.00 SHM flushes due to overflow
shm_cont 56 0.02 SHM MTX contention
shm_cycles 0 0.00 SHM cycles through buffer
sms_nreq 4 0.00 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 1672 . SMS bytes allocated
sms_bfree 1672 . SMS bytes freed
backend_req 1840 0.66 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_ban 1 . N total active bans
n_ban_add 1 0.00 N new bans added
n_ban_retire 0 0.00 N old bans deleted
n_ban_obj_test 0 0.00 N objects tested
n_ban_re_test 0 0.00 N regexps tested against
n_ban_dups 0 0.00 N duplicate bans removed
hcb_nolock 0 0.00 HCB Lookups without lock
hcb_lock 0 0.00 HCB Lookups with lock
hcb_insert 0 0.00 HCB Inserts
esi_errors 0 0.00 ESI parse errors (unlock)
esi_warnings 0 0.00 ESI parse warnings (unlock)
accept_fail 0 0.00 Accept failures
client_drop_late 0 0.00 Connection dropped late
uptime 2782 1.00 Client uptime
dir_dns_lookups 0 0.00 DNS director lookups
dir_dns_failed 0 0.00 DNS director failed lookups
dir_dns_hit 0 0.00 DNS director cached lookups
hit
dir_dns_cache_full 0 0.00 DNS director full dnscache
vmods 0 . Loaded VMODs
n_gzip 0 0.00 Gzip operations
n_gunzip 0 0.00 Gunzip operations
LCK.sms.creat 1 0.00 Created locks
LCK.sms.destroy 0 0.00 Destroyed locks
LCK.sms.locks 12 0.00 Lock Operations
LCK.sms.colls 0 0.00 Collisions
LCK.smp.creat 0 0.00 Created locks
LCK.smp.destroy 0 0.00 Destroyed locks
LCK.smp.locks 0 0.00 Lock Operations
LCK.smp.colls 0 0.00 Collisions
LCK.sma.creat 2 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 3722 1.34 Lock Operations
LCK.sma.colls 0 0.00 Collisions
LCK.smf.creat 0 0.00 Created locks
LCK.smf.destroy 0 0.00 Destroyed locks
LCK.smf.locks 0 0.00 Lock Operations
LCK.smf.colls 0 0.00 Collisions
LCK.hsl.creat 0 0.00 Created locks
LCK.hsl.destroy 0 0.00 Destroyed locks
LCK.hsl.locks 0 0.00 Lock Operations
LCK.hsl.colls 0 0.00 Collisions
LCK.hcb.creat 0 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 0 0.00 Lock Operations
LCK.hcb.colls 0 0.00 Collisions
LCK.hcl.creat 500009 179.73 Created locks
LCK.hcl.destroy 0 0.00 Destroyed locks
LCK.hcl.locks 5413 1.95 Lock Operations
LCK.hcl.colls 0 0.00 Collisions
LCK.vcl.creat 1 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 24 0.01 Lock Operations
LCK.vcl.colls 0 0.00 Collisions
LCK.stat.creat 1 0.00 Created locks
LCK.stat.destroy 0 0.00 Destroyed locks
LCK.stat.locks 25 0.01 Lock Operations
LCK.stat.colls 0 0.00 Collisions
LCK.sessmem.creat 1 0.00 Created locks
LCK.sessmem.destroy 0 0.00 Destroyed locks
LCK.sessmem.locks 3883 1.40 Lock Operations
LCK.sessmem.colls 0 0.00 Collisions
LCK.wstat.creat 1 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 8353 3.00 Lock Operations
LCK.wstat.colls 0 0.00 Collisions
LCK.herder.creat 1 0.00 Created locks
LCK.herder.destroy 0 0.00 Destroyed locks
LCK.herder.locks 1 0.00 Lock Operations
LCK.herder.colls 0 0.00 Collisions
LCK.wq.creat 12 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 40571 14.58 Lock Operations
LCK.wq.colls 0 0.00 Collisions
LCK.objhdr.creat 1859 0.67 Created locks
LCK.objhdr.destroy 21 0.01 Destroyed locks
LCK.objhdr.locks 10933 3.93 Lock Operations
LCK.objhdr.colls 0 0.00 Collisions
LCK.exp.creat 1 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 5135 1.85 Lock Operations
LCK.exp.colls 0 0.00 Collisions
LCK.lru.creat 2 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 2341 0.84 Lock Operations
LCK.lru.colls 0 0.00 Collisions
LCK.cli.creat 1 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 940 0.34 Lock Operations
LCK.cli.colls 0 0.00 Collisions
LCK.ban.creat 1 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 4640 1.67 Lock Operations
LCK.ban.colls 0 0.00 Collisions
LCK.vbp.creat 1 0.00 Created locks
LCK.vbp.destroy 0 0.00 Destroyed locks
LCK.vbp.locks 561 0.20 Lock Operations
LCK.vbp.colls 0 0.00 Collisions
LCK.vbe.creat 1 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 3688 1.33 Lock Operations
LCK.vbe.colls 0 0.00 Collisions
LCK.backend.creat 2 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 7376 2.65 Lock Operations
LCK.backend.colls 0 0.00 Collisions
SMA.s0.c_req 3650 1.31 Allocator requests
SMA.s0.c_fail 0 0.00 Allocator failures
SMA.s0.c_bytes 14001522 5032.90 Bytes allocated
SMA.s0.c_freed 136448 49.05 Bytes freed
SMA.s0.g_alloc 3646 . Allocations outstanding
SMA.s0.g_bytes 13865074 . Bytes outstanding
SMA.s0.g_space 2133618574 . Bytes available
SMA.Transient.c_req 34 0.01 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 77012 27.68 Bytes allocated
SMA.Transient.c_freed 77012 27.68 Bytes freed
SMA.Transient.g_alloc 0 . Allocations outstanding
SMA.Transient.g_bytes 0 . Bytes outstanding
SMA.Transient.g_space 0 . Bytes available
VBE.img_bal(172.27.101.150,,80).vcls 1 . VCL
references
VBE.img_bal(172.27.101.150,,80).happy18446744073707452415 .
Happy health probes
VBE.img_oks9(172.27.101.159,,80).vcls 1 . VCL
references
VBE.img_oks9(172.27.101.159,,80).happy 18446744073709551615 .
Happy health probes

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:2>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Nov 10, 2011, 4:36 AM

Post #4 of 24 (1860 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
----------------------+-----------------------------------------------------
Reporter: scorillo | Type: defect
Status: new | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Keywords: |
----------------------+-----------------------------------------------------

Comment(by scorillo):

Sorry about not inserting the information as code block.

{{{
client_conn 3618 1.30 Client connections accepted
client_drop 0 0.00 Connection dropped, no
sess/wrk
client_req 3618 1.30 Client requests received
cache_hit 1774 0.64 Cache hits
cache_hitpass 0 0.00 Cache hits for pass
cache_miss 1844 0.66 Cache misses
backend_conn 1840 0.66 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 4 0.00 Backend conn. failures
backend_reuse 0 0.00 Backend conn. reuses
backend_toolate 0 0.00 Backend conn. was closed
backend_recycle 0 0.00 Backend conn. recycles
backend_retry 0 0.00 Backend conn. retry
fetch_head 0 0.00 Fetch head
fetch_length 1840 0.66 Fetch with Length
fetch_chunked 0 0.00 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 0 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
fetch_1xx 0 0.00 Fetch no body (1xx)
fetch_204 0 0.00 Fetch no body (204)
fetch_304 0 0.00 Fetch no body (304)
n_sess_mem 25 . N struct sess_mem
n_sess 0 . N struct sess
n_object 1823 . N struct object
n_vampireobject 0 . N unresurrected objects
n_objectcore 1838 . N struct objectcore
n_objecthead 1838 . N struct objecthead
n_waitinglist 22 . N struct waitinglist
n_vbc 0 . N struct vbc
n_wrk 60 . N worker threads
n_wrk_create 71 0.03 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 301076 108.22 N worker threads limited
n_wrk_lqueue 0 0.00 work request queue length
n_wrk_queued 0 0.00 N queued work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 2 . N backends
n_expired 17 . N expired objects
n_lru_nuked 0 . N LRU nuked objects
n_lru_moved 1571 . N LRU moved objects
losthdr 0 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 3152 1.13 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 3618 1.30 Total Sessions
s_req 3618 1.30 Total Requests
s_pipe 0 0.00 Total pipe
s_pass 0 0.00 Total pass
s_fetch 1840 0.66 Total fetch
s_hdrbytes 1271298 456.97 Total header bytes
s_bodybytes 17115442 6152.21 Total body bytes
sess_closed 3599 1.29 Session Closed
sess_pipeline 0 0.00 Session Pipeline
sess_readahead 0 0.00 Session Read Ahead
sess_linger 34 0.01 Session Linger
sess_herd 19 0.01 Session herd
shm_records 237082 85.22 SHM records
shm_writes 20696 7.44 SHM writes
shm_flushes 0 0.00 SHM flushes due to overflow
shm_cont 56 0.02 SHM MTX contention
shm_cycles 0 0.00 SHM cycles through buffer
sms_nreq 4 0.00 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 1672 . SMS bytes allocated
sms_bfree 1672 . SMS bytes freed
backend_req 1840 0.66 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_ban 1 . N total active bans
n_ban_add 1 0.00 N new bans added
n_ban_retire 0 0.00 N old bans deleted
n_ban_obj_test 0 0.00 N objects tested
n_ban_re_test 0 0.00 N regexps tested against
n_ban_dups 0 0.00 N duplicate bans removed
hcb_nolock 0 0.00 HCB Lookups without lock
hcb_lock 0 0.00 HCB Lookups with lock
hcb_insert 0 0.00 HCB Inserts
esi_errors 0 0.00 ESI parse errors (unlock)
esi_warnings 0 0.00 ESI parse warnings (unlock)
accept_fail 0 0.00 Accept failures
client_drop_late 0 0.00 Connection dropped late
uptime 2782 1.00 Client uptime
dir_dns_lookups 0 0.00 DNS director lookups
dir_dns_failed 0 0.00 DNS director failed lookups
dir_dns_hit 0 0.00 DNS director cached lookups
hit
dir_dns_cache_full 0 0.00 DNS director full dnscache
vmods 0 . Loaded VMODs
n_gzip 0 0.00 Gzip operations
n_gunzip 0 0.00 Gunzip operations
LCK.sms.creat 1 0.00 Created locks
LCK.sms.destroy 0 0.00 Destroyed locks
LCK.sms.locks 12 0.00 Lock Operations
LCK.sms.colls 0 0.00 Collisions
LCK.smp.creat 0 0.00 Created locks
LCK.smp.destroy 0 0.00 Destroyed locks
LCK.smp.locks 0 0.00 Lock Operations
LCK.smp.colls 0 0.00 Collisions
LCK.sma.creat 2 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 3722 1.34 Lock Operations
LCK.sma.colls 0 0.00 Collisions
LCK.smf.creat 0 0.00 Created locks
LCK.smf.destroy 0 0.00 Destroyed locks
LCK.smf.locks 0 0.00 Lock Operations
LCK.smf.colls 0 0.00 Collisions
LCK.hsl.creat 0 0.00 Created locks
LCK.hsl.destroy 0 0.00 Destroyed locks
LCK.hsl.locks 0 0.00 Lock Operations
LCK.hsl.colls 0 0.00 Collisions
LCK.hcb.creat 0 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 0 0.00 Lock Operations
LCK.hcb.colls 0 0.00 Collisions
LCK.hcl.creat 500009 179.73 Created locks
LCK.hcl.destroy 0 0.00 Destroyed locks
LCK.hcl.locks 5413 1.95 Lock Operations
LCK.hcl.colls 0 0.00 Collisions
LCK.vcl.creat 1 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 24 0.01 Lock Operations
LCK.vcl.colls 0 0.00 Collisions
LCK.stat.creat 1 0.00 Created locks
LCK.stat.destroy 0 0.00 Destroyed locks
LCK.stat.locks 25 0.01 Lock Operations
LCK.stat.colls 0 0.00 Collisions
LCK.sessmem.creat 1 0.00 Created locks
LCK.sessmem.destroy 0 0.00 Destroyed locks
LCK.sessmem.locks 3883 1.40 Lock Operations
LCK.sessmem.colls 0 0.00 Collisions
LCK.wstat.creat 1 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 8353 3.00 Lock Operations
LCK.wstat.colls 0 0.00 Collisions
LCK.herder.creat 1 0.00 Created locks
LCK.herder.destroy 0 0.00 Destroyed locks
LCK.herder.locks 1 0.00 Lock Operations
LCK.herder.colls 0 0.00 Collisions
LCK.wq.creat 12 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 40571 14.58 Lock Operations
LCK.wq.colls 0 0.00 Collisions
LCK.objhdr.creat 1859 0.67 Created locks
LCK.objhdr.destroy 21 0.01 Destroyed locks
LCK.objhdr.locks 10933 3.93 Lock Operations
LCK.objhdr.colls 0 0.00 Collisions
LCK.exp.creat 1 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 5135 1.85 Lock Operations
LCK.exp.colls 0 0.00 Collisions
LCK.lru.creat 2 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 2341 0.84 Lock Operations
LCK.lru.colls 0 0.00 Collisions
LCK.cli.creat 1 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 940 0.34 Lock Operations
LCK.cli.colls 0 0.00 Collisions
LCK.ban.creat 1 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 4640 1.67 Lock Operations
LCK.ban.colls 0 0.00 Collisions
LCK.vbp.creat 1 0.00 Created locks
LCK.vbp.destroy 0 0.00 Destroyed locks
LCK.vbp.locks 561 0.20 Lock Operations
LCK.vbp.colls 0 0.00 Collisions
LCK.vbe.creat 1 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 3688 1.33 Lock Operations
LCK.vbe.colls 0 0.00 Collisions
LCK.backend.creat 2 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 7376 2.65 Lock Operations
LCK.backend.colls 0 0.00 Collisions
SMA.s0.c_req 3650 1.31 Allocator requests
SMA.s0.c_fail 0 0.00 Allocator failures
SMA.s0.c_bytes 14001522 5032.90 Bytes allocated
SMA.s0.c_freed 136448 49.05 Bytes freed
SMA.s0.g_alloc 3646 . Allocations outstanding
SMA.s0.g_bytes 13865074 . Bytes outstanding
SMA.s0.g_space 2133618574 . Bytes available
SMA.Transient.c_req 34 0.01 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 77012 27.68 Bytes allocated
SMA.Transient.c_freed 77012 27.68 Bytes freed
SMA.Transient.g_alloc 0 . Allocations outstanding
SMA.Transient.g_bytes 0 . Bytes outstanding
SMA.Transient.g_space 0 . Bytes available
VBE.img_bal(172.27.101.150,,80).vcls 1 . VCL
references
VBE.img_bal(172.27.101.150,,80).happy18446744073707452415 .
Happy health probes
VBE.img_oks9(172.27.101.159,,80).vcls 1 . VCL
references
VBE.img_oks9(172.27.101.159,,80).happy 18446744073709551615 .
Happy health probes

}}}

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:3>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Apr 30, 2012, 3:19 AM

Post #5 of 24 (1749 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
-------------------------+--------------------------------------------------
Reporter: scorillo | Type: defect
Status: closed | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Resolution: worksforme | Keywords:
-------------------------+--------------------------------------------------
Changes (by phk):

* status: new => closed
* resolution: => worksforme


Comment:

I am closing this ticket due to timeout.

We don't see any obvious indications of what is wrong and there is high
probability that it is caused by some kind of local resource overload, but
there is insufficient data to rule either way.

If this problem persist with up to date version of varnish, feel free to
reopen the ticket with more info.

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:4>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

May 14, 2012, 1:50 AM

Post #6 of 24 (1739 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
-----------------------+----------------------------------------------------
Reporter: scorillo | Type: defect
Status: reopened | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Resolution: | Keywords:
-----------------------+----------------------------------------------------
Changes (by etherael):

* status: closed => reopened
* resolution: worksforme =>


Comment:

I am seeing the exact same issue on a varnish cache here, dies every 3-4
days. Details follow;
{{{

Messagelog content;
incident 1;
May 9 15:13:57 elk2 varnishd[25859]: Child (2551) not responding to CLI,
killing it.
May 9 15:14:07 elk2 varnishd[25859]: Child (2551) not responding to CLI,
killing it.
May 9 15:14:17 elk2 varnishd[25859]: Child (2551) not responding to CLI,
killing it.
May 9 15:14:27 elk2 varnishd[25859]: Child (2551) not responding to CLI,
killing it.
May 9 15:14:37 elk2 varnishd[25859]: Child (2551) not responding to CLI,
killing it.
May 9 15:14:47 elk2 varnishd[25859]: Child (2551) not responding to CLI,
killing it.
May 9 15:14:52 elk2 varnishd[25859]: Child (2551) not responding to CLI,
killing it.
May 9 15:14:52 elk2 varnishd[25859]: Child (2551) not responding to CLI,
killing it.
May 9 15:14:52 elk2 varnishd[25859]: Child (2551) died signal=3 (core
dumped)
May 9 15:14:52 elk2 varnishd[25859]: child (15127) Started
May 9 15:14:52 elk2 varnishd[25859]: Child (15127) said Child starts
May 9 15:14:52 elk2 varnishd[25859]: Child (15127) said SMF.s0 mmap'ed
311385128960 bytes of 311385128960


incident 2;

May 14 00:53:33 elk2 varnishd[25859]: Child (15127) not responding to CLI,
killing it.
May 14 00:53:43 elk2 varnishd[25859]: Child (15127) not responding to CLI,
killing it.
May 14 00:53:53 elk2 varnishd[25859]: Child (15127) not responding to CLI,
killing it.
May 14 00:54:03 elk2 varnishd[25859]: Child (15127) not responding to CLI,
killing it.
May 14 00:54:13 elk2 varnishd[25859]: Child (15127) not responding to CLI,
killing it.
May 14 00:54:20 elk2 abrt[30696]: file /usr/sbin/varnishd seems to be
deleted
May 14 00:54:21 elk2 varnishd[25859]: Child (15127) not responding to CLI,
killing it.
May 14 00:54:21 elk2 varnishd[25859]: Child (15127) not responding to CLI,
killing it.
May 14 00:54:21 elk2 varnishd[25859]: Child (15127) died signal=3 (core
dumped)
May 14 00:54:21 elk2 varnishd[25859]: child (30697) Started
May 14 00:54:21 elk2 varnishd[25859]: Child (30697) said Child starts
May 14 00:54:21 elk2 varnishd[25859]: Child (30697) said SMF.s0 mmap'ed
311385128960 bytes of 311385128960

version info;


root [at] parro:~$ rpm -qa |grep varnish
varnish-libs-3.0.2-1.el5.x86_64
varnish-3.0.2-1.el5.x86_64
varnish-release-3.0-1.noarch
root [at] parro:~$ uname -a
Linux parrot 2.6.32-220.7.1.el6.x86_64 #1 SMP Wed Mar 7 00:52:02 GMT 2012
x86_64 x86_64 x86_64 GNU/Linux
root [at] parro:~$ cat /etc/redhat-release
CentOS release 6.2 (Final)

startup cmd;

varnish 30697 25859 10 00:54 ? 00:17:52 /usr/sbin/varnishd -P
/var/run/varnish.pid -a :80 -T localhost:6082 -f /etc/varnish/default.vcl
-u varnish
-g varnish -S /etc/varnish/secret -s
file,/var/lib/varnish/varnish_storage.bin,290G

varnishstat;

root [at] parro:~$ varnishstat -1
client_conn 253164 25.32 Client connections accepted
client_drop 0 0.00 Connection dropped, no
sess/wrk
client_req 338951 33.91 Client requests received
cache_hit 156382 15.64 Cache hits
cache_hitpass 0 0.00 Cache hits for pass
cache_miss 153794 15.38 Cache misses
backend_conn 69798 6.98 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 2 0.00 Backend conn. failures
backend_reuse 112847 11.29 Backend conn. reuses
backend_toolate 1116 0.11 Backend conn. was closed
backend_recycle 113971 11.40 Backend conn. recycles
backend_retry 38 0.00 Backend conn. retry
fetch_head 6 0.00 Fetch head
fetch_length 178831 17.89 Fetch with Length
fetch_chunked 3509 0.35 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 204 0.02 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
fetch_1xx 0 0.00 Fetch no body (1xx)
fetch_204 0 0.00 Fetch no body (204)
fetch_304 1 0.00 Fetch no body (304)
n_sess_mem 565 . N struct sess_mem
n_sess 145 . N struct sess
n_object 153780 . N struct object
n_vampireobject 0 . N unresurrected objects
n_objectcore 153835 . N struct objectcore
n_objecthead 153835 . N struct objecthead
n_waitinglist 1139 . N struct waitinglist
n_vbc 58 . N struct vbc
n_wrk 90 . N worker threads
n_wrk_create 360 0.04 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 0 0.00 N worker threads limited
n_wrk_lqueue 0 0.00 work request queue length
n_wrk_queued 4119 0.41 N queued work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 3 . N backends
n_expired 0 . N expired objects
n_lru_nuked 0 . N LRU nuked objects
n_lru_moved 142671 . N LRU moved objects
losthdr 0 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 329530 32.96 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 253107 25.32 Total Sessions
s_req 338951 33.91 Total Requests
s_pipe 0 0.00 Total pipe
s_pass 28775 2.88 Total pass
s_fetch 182551 18.26 Total fetch
s_hdrbytes 138408349 13844.99 Total header bytes
s_bodybytes 6200495767 620235.65 Total body bytes
sess_closed 106492 10.65 Session Closed
sess_pipeline 289 0.03 Session Pipeline
sess_readahead 72 0.01 Session Read Ahead
sess_linger 239202 23.93 Session Linger
sess_herd 246213 24.63 Session herd
shm_records 23151895 2315.88 SHM records
shm_writes 1718951 171.95 SHM writes
shm_flushes 0 0.00 SHM flushes due to overflow
shm_cont 6496 0.65 SHM MTX contention
shm_cycles 9 0.00 SHM cycles through buffer
sms_nreq 18 0.00 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 7524 . SMS bytes allocated
sms_bfree 7524 . SMS bytes freed
backend_req 182657 18.27 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_ban 1 . N total active bans
n_ban_add 1 0.00 N new bans added
n_ban_retire 0 0.00 N old bans deleted
n_ban_obj_test 0 0.00 N objects tested
n_ban_re_test 0 0.00 N regexps tested against
n_ban_dups 0 0.00 N duplicate bans removed
hcb_nolock 310164 31.03 HCB Lookups without lock
hcb_lock 153842 15.39 HCB Lookups with lock
hcb_insert 153842 15.39 HCB Inserts
esi_errors 0 0.00 ESI parse errors (unlock)
esi_warnings 0 0.00 ESI parse warnings (unlock)
accept_fail 0 0.00 Accept failures
client_drop_late 0 0.00 Connection dropped late
uptime 9997 1.00 Client uptime
dir_dns_lookups 0 0.00 DNS director lookups
dir_dns_failed 0 0.00 DNS director failed lookups
dir_dns_hit 0 0.00 DNS director cached lookups
hit
dir_dns_cache_full 0 0.00 DNS director full dnscache
vmods 0 . Loaded VMODs
n_gzip 0 0.00 Gzip operations
n_gunzip 235860 23.59 Gunzip operations
LCK.sms.creat 2 0.00 Created locks
LCK.sms.destroy 0 0.00 Destroyed locks
LCK.sms.locks 141948 14.20 Lock Operations
LCK.sms.colls 0 0.00 Collisions
LCK.smp.creat 0 0.00 Created locks
LCK.smp.destroy 0 0.00 Destroyed locks
LCK.smp.locks 0 0.00 Lock Operations
LCK.smp.colls 0 0.00 Collisions
LCK.sma.creat 2 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 4334608 433.59 Lock Operations
LCK.sma.colls 0 0.00 Collisions
LCK.smf.creat 2 0.00 Created locks
LCK.smf.destroy 0 0.00 Destroyed locks
LCK.smf.locks 20270014 2027.61 Lock Operations
LCK.smf.colls 0 0.00 Collisions
LCK.hsl.creat 0 0.00 Created locks
LCK.hsl.destroy 0 0.00 Destroyed locks
LCK.hsl.locks 0 0.00 Lock Operations
LCK.hsl.colls 0 0.00 Collisions
LCK.hcb.creat 2 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 6813196 681.52 Lock Operations
LCK.hcb.colls 0 0.00 Collisions
LCK.hcl.creat 0 0.00 Created locks
LCK.hcl.destroy 0 0.00 Destroyed locks
LCK.hcl.locks 0 0.00 Lock Operations
LCK.hcl.colls 0 0.00 Collisions
LCK.vcl.creat 2 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 27017 2.70 Lock Operations
LCK.vcl.colls 0 0.00 Collisions
LCK.stat.creat 2 0.00 Created locks
LCK.stat.destroy 0 0.00 Destroyed locks
LCK.stat.locks 3353 0.34 Lock Operations
LCK.stat.colls 0 0.00 Collisions
LCK.sessmem.creat 2 0.00 Created locks
LCK.sessmem.destroy 0 0.00 Destroyed locks
LCK.sessmem.locks 10434926 1043.81 Lock Operations
LCK.sessmem.colls 0 0.00 Collisions
LCK.wstat.creat 2 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 1062489 106.28 Lock Operations
LCK.wstat.colls 0 0.00 Collisions
LCK.herder.creat 2 0.00 Created locks
LCK.herder.destroy 0 0.00 Destroyed locks
LCK.herder.locks 28077 2.81 Lock Operations
LCK.herder.colls 0 0.00 Collisions
LCK.wq.creat 4 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 32085426 3209.51 Lock Operations
LCK.wq.colls 0 0.00 Collisions
LCK.objhdr.creat 6777786 677.98 Created locks
LCK.objhdr.destroy 38400 3.84 Destroyed locks
LCK.objhdr.locks 58265283 5828.28 Lock Operations
LCK.objhdr.colls 0 0.00 Collisions
LCK.exp.creat 2 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 7078276 708.04 Lock Operations
LCK.exp.colls 0 0.00 Collisions
LCK.lru.creat 4 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 6740732 674.28 Lock Operations
LCK.lru.colls 0 0.00 Collisions
LCK.cli.creat 2 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 112482 11.25 Lock Operations
LCK.cli.colls 0 0.00 Collisions
LCK.ban.creat 2 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 7078306 708.04 Lock Operations
LCK.ban.colls 0 0.00 Collisions
LCK.vbp.creat 2 0.00 Created locks
LCK.vbp.destroy 0 0.00 Destroyed locks
LCK.vbp.locks 0 0.00 Lock Operations
LCK.vbp.colls 0 0.00 Collisions
LCK.vbe.creat 2 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 7031988 703.41 Lock Operations
LCK.vbe.colls 0 0.00 Collisions
LCK.backend.creat 6 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 21975254 2198.18 Lock Operations
LCK.backend.colls 0 0.00 Collisions
SMF.s0.c_req 13531967 1353.60 Allocator requests
SMF.s0.c_fail 0 0.00 Allocator failures
SMF.s0.c_bytes 907854680064 90812711.82 Bytes allocated
SMF.s0.c_freed 795545841664 79578457.70 Bytes freed
SMF.s0.g_alloc 13484647 . Allocations outstanding
SMF.s0.g_bytes 112308838400 . Bytes outstanding
SMF.s0.g_space 510461419520 . Bytes available
SMF.s0.g_smf 14758490 . N struct smf
SMF.s0.g_smf_frag 1273251 . N small free smf
SMF.s0.g_smf_large 592 . N large free smf
SMA.Transient.c_req 57850 5.79 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 3809803264 381094.65 Bytes allocated
SMA.Transient.c_freed 3809803264 381094.65 Bytes freed
SMA.Transient.g_alloc 0 . Allocations outstanding
SMA.Transient.g_bytes 0 . Bytes outstanding
SMA.Transient.g_space 0 . Bytes available
VBE.default(10.52.5.194,,8080).vcls 2 . VCL
references
VBE.default(10.52.5.194,,8080).happy 0 . Happy health
probes
VBE.thumbs(127.0.0.1,,8080).vcls 2 . VCL
references
VBE.thumbs(127.0.0.1,,8080).happy 0 . Happy health
probes
VBE.nm_elk(50.23.148.202,,80).vcls 2 . VCL
references
VBE.nm_elk(50.23.148.202,,80).happy 0 . Happy health
probes


}}}

Other strangeness; the directory that contains the actual varnish storage
bin behaves erratically, the file storage.bin is a sparse file of the size
requested on the startup (290gb) but sparse file listing shows that it
never climbs above the 205gb mark in terms of actual space used.

When varnish died on the 9th and auto restarted, this figure (actual size
used in sparse file) was registered as 153gb despite the cache being
completely empty.

client_req average hovers around 30 but peaks at 70, nothing at all huge
compared to what I've seen previously. The only thing that is out of the
ordinary I suppose is the size of the storage file (very long tail of
infrequently accessed objects on the backends with very high load on those
backend servers that we want to ease off on, hence the immense cache size)

The average io load is about 200kbytes reads per second and negligible
writes, load average rarely exceeds 3 (16 core Intel Xeon E5620 @ 2.40Ghz,
12gb memory.

That's all I can think of in terms of detail, I note the version of the
cache is up to date with upstream despite being a binary distributed from
the vendor, could it help to compile from source and see if the issue
persists?

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:5>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

May 14, 2012, 3:10 AM

Post #7 of 24 (1734 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
-----------------------+----------------------------------------------------
Reporter: scorillo | Type: defect
Status: reopened | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Resolution: | Keywords:
-----------------------+----------------------------------------------------

Comment(by tfheen):

Hi,

thanks for the data.

I was wondering if you could run iostat -x 1 and see if it increases
massively when Varnish restarts?

Also, could you please check if there's a panic, by doing varnishadm
panic.show ? Can you also please run gdb on the core dump and give us the
backtrace?

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:6>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

May 14, 2012, 3:43 AM

Post #8 of 24 (1728 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
-----------------------+----------------------------------------------------
Reporter: scorillo | Type: defect
Status: reopened | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Resolution: | Keywords:
-----------------------+----------------------------------------------------

Comment(by etherael):

The iostat output is quite huge but I couldn't see much difference between
the read / write rates during or after the varnishd restart. I have it
saved in case you want to analyse it directly.

{{{
root [at] parro:~$ varnishadm panic.show
Child has not panicked or panic has been cleared
Command failed with error code 300

}}}

I was unable to find a varnishd.core anywhere, following the instructions
at https://www.varnish-cache.org/trac/wiki/DebuggingVarnish I note that
the varnishd bin is supposed to be not stripped but the one on this system
is listed as stripped.


{{{
root [at] parro:~$ file /usr/sbin/varnishd
/usr/sbin/varnishd: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped
}}}

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:7>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

May 16, 2012, 7:13 PM

Post #9 of 24 (1732 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
-----------------------+----------------------------------------------------
Reporter: scorillo | Type: defect
Status: reopened | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Resolution: | Keywords:
-----------------------+----------------------------------------------------

Comment(by bstillwell):

I'm also seeing this problem with varnish 3.0.2 on CentOS 6.2. There
isn't any panic output from varnishadm:

{{{
[root [at] den2tpv1 ~]# varnishadm panic.show
Child has not panicked or panic has been cleared
Command failed with error code 300
}}}

However /var/log/messages has some details:
{{{
May 16 19:18:45 den2tpv16 varnishd[32205]: Child (32206) not responding to
CLI, killing it.
May 16 19:18:55 den2tpv16 varnishd[32205]: Child (32206) not responding to
CLI, killing it.
May 16 19:19:05 den2tpv16 varnishd[32205]: Child (32206) not responding to
CLI, killing it.
May 16 19:19:15 den2tpv16 varnishd[32205]: Child (32206) not responding to
CLI, killing it.
May 16 19:19:25 den2tpv16 varnishd[32205]: Child (32206) not responding to
CLI, killing it.
May 16 19:19:35 den2tpv16 varnishd[32205]: Child (32206) not responding to
CLI, killing it.
May 16 19:19:37 den2tpv16 varnishd[32205]: Child (32206) not responding to
CLI, killing it.
May 16 19:19:38 den2tpv16 varnishd[32205]: Child (32206) died signal=3
(core dumped)
May 16 19:19:38 den2tpv16 varnishd[32205]: child (4447) Started
May 16 19:19:38 den2tpv16 varnishd[32205]: Child (4447) said Child starts
May 16 19:19:38 den2tpv16 varnishd[32205]: Child (4447) said SMF.s0
mmap'ed 85899345920 bytes of 85899345920
}}}

The machine has 32GiB of memory with an SSD for storage (120GB OCZ
Vertex2). The storage option I'm using is: -s
file,/data/varnish_storage.bin,80G,32K

Using -s malloc,23G on this same box is quite stable, but I'm hoping to
start using SSDs to improve our hitrates (we have a very large set size).

'dstat -tlcmgndr 60' output:


{{{

----system---- ---load-avg--- ----total-cpu-usage---- ------memory-
usage----- ---paging-- -net/total- -dsk/total- --io/total-
date/time | 1m 5m 15m |usr sys idl wai hiq siq| used buff cach
free| in out | recv send| read writ| read writ
16-05 19:08:13|0.10 11.6 15.1| 3 2 93 2 0 1|2719M 52.7M 28.0G
328M| 0 0 |3675k 13M|1878k 136k| 469 11.3
16-05 19:09:13|0.12 9.52 14.2| 3 2 93 2 0 1|2721M 52.9M 28.0G
320M| 0 0 |3425k 13M|1753k 133k| 438 11.1
16-05 19:10:13|0.16 7.82 13.3| 3 2 93 2 0 1|2719M 53.1M 28.0G
298M| 0 0 |3069k 13M|1529k 139k| 382 12.1
16-05 19:11:13|0.47 6.50 12.5| 3 2 93 2 0 1|2721M 53.3M 28.0G
308M| 0 0 |3763k 13M|1925k 135k| 481 12.3
16-05 19:12:13|0.35 5.36 11.7| 3 2 93 2 0 1|2724M 53.5M 28.0G
310M| 0 0 |3652k 12M|1852k 140k| 463 12.5
16-05 19:13:13|0.17 4.40 11.0| 3 2 92 2 0 1|2725M 53.6M 28.0G
304M| 0 0 |4319k 14M|2498k 140k| 625 12.6
16-05 19:14:13|0.31 3.68 10.3| 3 2 93 2 0 1|2723M 51.2M 28.0G
316M| 0 0 |3437k 12M|2006k 135k| 501 12.0
16-05 19:15:13|0.26 3.05 9.71| 3 2 93 2 0 1|2722M 51.4M 28.0G
316M| 0 0 |4249k 14M|2313k 135k| 578 12.2
16-05 19:16:13|0.14 2.51 9.10| 3 2 93 2 0 1|2723M 51.6M 28.0G
308M| 0 0 |3741k 12M|1774k 132k| 443 11.6
16-05 19:17:13|17.6 6.52 10.0| 3 3 53 41 0 1|2725M 51.7M 28.0G
297M| 0 0 |3780k 12M|1755k 18M| 439 193
16-05 19:18:13| 302 78.2 33.7| 2 2 26 68 0 1|2778M 51.8M 27.9G
321M| 0 0 |2401k 9803k| 681k 18M| 170 463
16-05 19:19:13| 561 194 76.3| 1 1 35 63 0 1|2765M 51.8M 27.9G
350M| 0 0 | 438k 1137k| 148k 14M|37.1 840
16-05 19:20:13| 373 222 94.2| 4 5 27 62 0 2|2011M 52.1M 28.0G
1015M| 0 0 | 15M 13M|1474k 15M| 366 577
16-05 19:21:13| 142 183 88.9| 3 3 52 41 0 1|2012M 52.2M 28.1G
930M| 0 0 | 10M 13M|1371k 15M| 343 360
16-05 19:22:13|52.7 150 83.4| 3 3 73 20 0 1|2012M 52.4M 28.2G
781M| 0 0 |8901k 12M|2472k 8992k| 618 266
16-05 19:23:13|20.4 123 78.3| 3 2 73 21 0 1|2012M 52.6M 28.3G
630M| 0 0 |7799k 12M|2502k 8782k| 626 270
16-05 19:24:13|8.86 101 73.5| 3 3 76 18 0 1|2022M 52.8M 28.5G
453M| 0 0 |7689k 18M|2791k 7731k| 698 245
16-05 19:25:13|3.74 82.7 69.0| 3 3 74 20 0 1|2027M 52.9M 28.6G
315M| 0 0 |7754k 14M|2786k 8175k| 697 255
16-05 19:26:13|2.70 68.0 64.8| 3 2 80 15 0 1|2029M 53.1M 28.6G
315M| 0 0 |6680k 13M|2162k 6918k| 540 240

}}}

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:8>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jun 12, 2012, 10:54 AM

Post #10 of 24 (1655 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
-----------------------+----------------------------------------------------
Reporter: scorillo | Type: defect
Status: reopened | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.2 | Severity: normal
Resolution: | Keywords:
-----------------------+----------------------------------------------------

Comment(by bstillwell):

The source of the problem seems to be the defragmenting done by
transparent hugepages. Once we disabled them with the following command
our problems went away:

{{{
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
}}}

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:9>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jun 18, 2012, 3:07 AM

Post #11 of 24 (1658 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
----------------------+-----------------------------------------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: new
Priority: normal | Milestone:
Component: varnishd | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
----------------------+-----------------------------------------------------
Changes (by lkarsten):

* owner: => lkarsten
* status: reopened => new


--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:10>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jul 31, 2012, 1:51 AM

Post #12 of 24 (1606 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: new
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------
Changes (by phk):

* component: varnishd => documentation


Comment:

This needs to go into docs somewhere.

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:11>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Nov 22, 2012, 6:11 AM

Post #13 of 24 (1413 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: new
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------

Comment (by lkarsten):

Commited to the documentation:

https://www.varnish-cache.org/docs/3.0/tutorial/platformnotes.html

https://www.varnish-cache.org/docs/trunk/installation/platformnotes.html

Closing.

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:12>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Nov 22, 2012, 6:12 AM

Post #14 of 24 (1407 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: closed
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution: fixed
Keywords: |
---------------------------+-----------------------
Changes (by lkarsten):

* status: new => closed
* resolution: => fixed


--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:13>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Dec 14, 2012, 5:12 AM

Post #15 of 24 (1372 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: reopened
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------
Changes (by scorillo):

* status: closed => reopened
* resolution: fixed =>


Comment:

Hello

It seems that the problem persists.

/var/log/messages :
{{{
Dec 13 18:18:31 black-varnish varnishd[7071]: child (7072) Started
Dec 13 18:18:31 black-varnish varnishd[7071]: Child (7072) said Child
starts
Dec 14 04:12:02 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:12:12 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:12:22 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:12:33 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:12:43 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:12:53 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:13:03 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:13:13 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:13:23 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:13:33 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:13:43 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:13:53 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:14:03 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:14:06 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:14:06 black-varnish varnishd[7071]: Child (7072) not responding
to CLI, killing it.
Dec 14 04:14:06 black-varnish varnishd[7071]: Child (7072) died signal=3
Dec 14 04:14:06 black-varnish varnishd[7071]: child (13707) Started
Dec 14 04:14:06 black-varnish varnishd[7071]: Child (13707) said Child
starts
Dec 14 14:46:59 black-varnish varnishd[7071]: CLI telnet 127.0.0.1 42647
127.0.0.1 6082 Rd auth
0bb6df8f04bdb4a62657f4b22c5e28dc545384ea88a589258f7a3d300681531b
Dec 14 14:46:59 black-varnish varnishd[7071]: CLI telnet 127.0.0.1 42647
127.0.0.1 6082 Wr 200 -----------------------------#012Varnish Cache CLI
1.0#012-----------------------------#012Linux,2.6.32-279.14.1.el6.x86_64,x86_64,-smalloc,-smalloc,-hcritbit#012#012Type
'help' for command list.#012Type 'quit' to close CLI session.
Dec 14 14:46:59 black-varnish varnishd[7071]: CLI telnet 127.0.0.1 42647
127.0.0.1 6082 Rd ping
Dec 14 14:46:59 black-varnish varnishd[7071]: CLI telnet 127.0.0.1 42647
127.0.0.1 6082 Wr 200 PONG 1355489219 1.0
Dec 14 14:47:00 black-varnish varnishd[7071]: CLI telnet 127.0.0.1 42647
127.0.0.1 6082 Rd panic.show
Dec 14 14:47:00 black-varnish varnishd[7071]: CLI telnet 127.0.0.1 42647
127.0.0.1 6082 Wr 300 Child has not panicked or panic has been cleared
}}}

other infos:
{{{
[root [at] black-varnis ~]# cat /etc/redhat-release
CentOS release 6.3 (Final)

[root [at] black-varnis ~]# uname -a
Linux black-varnish 2.6.32-279.14.1.el6.x86_64 #1 SMP Tue Nov 6 23:43:09
UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

[root [at] black-varnis ~]# cat
/sys/kernel/mm/redhat_transparent_hugepage/enabled
always [never]
}}}

I must mention that the system is running inside a vmware virtual machine
configured as in the attached image.

[[Image(black-varnish.jpg)]]

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:14>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Dec 17, 2012, 3:35 AM

Post #16 of 24 (1360 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: reopened
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------

Comment (by martin):

Hi,

We believe that what you are experiencing is due to resource exhaustion on
your virtual machine, and the monitoring thread in Varnish is being
starved.

The memory available to your Varnish instance seems on the low side, with
only 2GB configured on the VM. This should be increased. If that is not
possible, you could prevent Varnish scaling up to using too much memory by
setting a much lower max number of threads. Maybe as low as 100-150.

You could also try to disable the core dumps on the box to see if there
are any hidden panic message that gets eaten due to cores taking too long
time. Though I don't expect this to be the case.

Regards,
Martin Blix Grydeland

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:15>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Dec 17, 2012, 7:43 AM

Post #17 of 24 (1365 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: reopened
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------

Comment (by scorillo):

Hi,

I don't think it is a memory problem, but a problem somehow related to CPU
(iowait).

Take a look:

{{{
/var/log/messages-20121216.gz:Dec 14 04:12:02 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:12:12 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:12:22 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:12:33 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:12:43 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:12:53 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:13:03 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:13:13 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:13:23 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:13:33 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:13:43 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:13:53 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:14:03 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:14:06 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
/var/log/messages-20121216.gz:Dec 14 04:14:06 black-varnish
varnishd[7071]: Child (7072) not responding to CLI, killing it.
}}}

{{{
[root [at] black-varnis src]# sar -f /var/log/sa/sa14
Linux 2.6.32-279.14.1.el6.x86_64 (black-varnish.okazii.ro) 12/14/2012
_x86_64_ (2 CPU)

12:00:01 AM CPU %user %nice %system %iowait %steal
%idle
...
04:10:01 AM all 0.22 0.00 0.11 0.62 0.00
99.05
04:20:01 AM all 0.22 0.00 0.11 27.46 0.00
72.22
04:30:01 AM all 0.21 0.00 0.10 0.45 0.00
99.24
}}}

And here:

{{{
Dec 16 04:10:02 black-varnish varnishd[7071]: Child (25212) not responding
to CLI, killing it.
Dec 16 04:10:02 black-varnish varnishd[7071]: Child (25212) not responding
to CLI, killing it.
}}}

{{{
Linux 2.6.32-279.14.1.el6.x86_64 (black-varnish.okazii.ro) 12/16/2012
_x86_64_ (2 CPU)

12:00:01 AM CPU %user %nice %system %iowait %steal
%idle
...
04:00:01 AM all 0.23 0.00 0.12 0.12 0.00
99.53
04:10:01 AM all 0.23 0.00 0.12 3.29 0.00
96.35
04:20:01 AM all 0.22 0.00 0.12 2.42 0.00
97.25
04:30:01 AM all 0.22 0.00 0.12 0.46 0.00
99.21
}}}

An finally here:

{{{
Dec 17 16:11:52 black-varnish varnishd[7071]: Child (14654) not responding
to CLI, killing it.
Dec 17 16:11:56 black-varnish varnishd[7071]: Child (14654) not responding
to CLI, killing it.
Dec 17 16:11:56 black-varnish varnishd[7071]: Child (14654) not responding
to CLI, killing it.
}}}

{{{
Linux 2.6.32-279.14.1.el6.x86_64 (black-varnish.okazii.ro) 12/17/2012
_x86_64_ (2 CPU)

12:00:01 AM CPU %user %nice %system %iowait %steal
%idle
...
04:10:01 PM all 0.25 0.00 0.14 0.65 0.00
98.96
04:20:01 PM all 0.64 0.00 0.21 8.00 0.00
91.15
04:30:01 PM all 0.25 0.00 0.15 0.37 0.00
99.24
}}}

Thank you,
Cristian

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:16>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jan 28, 2013, 3:01 AM

Post #18 of 24 (1193 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: reopened
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------

Comment (by andrii.grytsenko):

Hi guys,

We have the same issue with varnish on all our
environments(test/preprod/prod). It's killing by parent process at least
once a day. It has nothing to do with load, because it fails even at
preproduction environment during weekend when there is no traffic at all.

{{{
Jan 28 05:20:14 proxy-002 varnishd[11866]: Child (31970) not responding to
CLI, killing it.
Jan 28 05:20:24 proxy-002 varnishd[11866]: Child (31970) not responding to
CLI, killing it.
Jan 28 05:20:34 proxy-002 varnishd[11866]: Child (31970) not responding to
CLI, killing it.
Jan 28 05:20:44 proxy-002 varnishd[11866]: Child (31970) not responding to
CLI, killing it.
Jan 28 05:20:54 proxy-002 varnishd[11866]: Child (31970) not responding to
CLI, killing it.
Jan 28 05:20:58 proxy-002 varnishd[11866]: Child (31970) not responding to
CLI, killing it.
Jan 28 05:20:58 proxy-002 varnishd[11866]: Child (31970) not responding to
CLI, killing it.
Jan 28 05:20:58 proxy-002 varnishd[11866]: Child (31970) died signal=3
(core dumped)
Jan 28 05:20:58 proxy-002 varnishd[11866]: child (5972) Started
Jan 28 05:20:58 proxy-002 varnishd[11866]: Child (5972) said Child starts
Jan 28 05:47:24 proxy-002 varnishd[11866]: Child (5972) not responding to
CLI, killing it.
Jan 28 05:47:30 proxy-002 varnishd[11866]: Child (5972) not responding to
CLI, killing it.
Jan 28 05:47:30 proxy-002 varnishd[11866]: Child (5972) not responding to
CLI, killing it.
Jan 28 05:47:30 proxy-002 varnishd[11866]: Child (5972) died signal=3
(core dumped)
Jan 28 05:47:31 proxy-002 varnishd[11866]: child (10110) Started
Jan 28 05:47:31 proxy-002 varnishd[11866]: Child (10110) said Child starts
}}}

I turned on core dump. Here is the backtrace from gdb:
{{{

gdb `which varnishd` core-varnishd-3-101-102-5972-1359348450
...
...
...
warning: no loadable sections found in added symbol-file system-supplied
DSO at 0x7fff8f7fd000
Core was generated by `/usr/sbin/varnishd -P /var/run/varnish.pid -a :80
-f /etc/varnish/default.vcl -'.
Program terminated with signal 3, Quit.
#0 0x000000359ac0d594 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x000000359ac0d594 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x000000359ac08e8a in _L_lock_1034 () from /lib64/libpthread.so.0
#2 0x000000359ac08d4c in pthread_mutex_lock () from
/lib64/libpthread.so.0
#3 0x000000000043231a in ?? ()
#4 0x00000000004324b8 in ?? ()
#5 0x00000000004326a1 in VSL ()
#6 0x00000000004199be in ?? ()
#7 0x000000359bc0756c in ?? () from /usr/lib64/varnish/libvarnish.so
#8 0x000000359bc07bad in ?? () from /usr/lib64/varnish/libvarnish.so
#9 0x000000359bc0a88d in ?? () from /usr/lib64/varnish/libvarnish.so
#10 0x000000359bc06a22 in VCLS_Poll () from
/usr/lib64/varnish/libvarnish.so
#11 0x0000000000419a31 in CLI_Run ()
#12 0x000000000042ca1d in child_main ()
#13 0x000000000043ecbc in ?? ()
#14 0x000000000043f54c in ?? ()
#15 0x000000359bc09567 in ?? () from /usr/lib64/varnish/libvarnish.so
#16 0x000000359bc09bf8 in vev_schedule () from
/usr/lib64/varnish/libvarnish.so
#17 0x000000000043ee92 in MGT_Run ()
#18 0x000000000044e1bb in main ()


}}}




Information about system:

OS RHEL 5.8
{{{
cat /etc/issue
Red Hat Enterprise Linux Server release 5.8 (Tikanga)
Kernel \r on an \m
}}}

kernel:
{{{
2.6.18-308.11.1.el5xen
}}}

processors:
{{{
processor0 => Intel(R) Xeon(R) CPU L5420 @ 2.50GHz
processor1 => Intel(R) Xeon(R) CPU L5420 @ 2.50GHz
}}}

rpm package from official repo(http://repo.varnish-
cache.org/redhat/varnish-3.0/el$release/$arch/):
{{{
rpm -qa |grep varnish
varnish-libs-3.0.3-1.el5.centos
varnish-3.0.3-1.el5.centos
}}}

binary:
{{{
file /usr/sbin/varnishd
/usr/sbin/varnishd: ELF 64-bit LSB executable, AMD x86-64, version 1
(SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs),
stripped
}}}

there is no panic messages:
{{{
varnishadm panic.show
Child has not panicked or panic has been cleared
Command failed with error code 300
}}}

varnish started with next params:
{{{
varnish 10110 0.0 1.1 2273492 49452 ? Sl 05:47 0:03
/usr/sbin/varnishd -P /var/run/varnish.pid -a :80 -f
/etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -w 100,250,120 -u
varnish -g varnish -s malloc,3G -h critbit -p thread_pools 2 -p
thread_pool_add_delay 2 -p listen_depth 1024 -p session_linger 50 -p
lru_interval 2 -p shm_reclen 255 -p sess_timeout 30
root 11866 0.0 0.0 110948 940 ? Ss Jan25 0:00
/usr/sbin/varnishd -P /var/run/varnish.pid -a :80 -f
/etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -w 100,250,120 -u
varnish -g varnish -s malloc,3G -h critbit -p thread_pools 2 -p
thread_pool_add_delay 2 -p listen_depth 1024 -p session_linger 50 -p
lru_interval 2 -p shm_reclen 255 -p sess_timeout 30
}}}


varnish stat:
{{{
varnishstat -1
client_conn 16258 0.95 Client connections accepted
client_drop 0 0.00 Connection dropped, no
sess/wrk
client_req 23568 1.37 Client requests received
cache_hit 0 0.00 Cache hits
cache_hitpass 0 0.00 Cache hits for pass
cache_miss 0 0.00 Cache misses
backend_conn 106 0.01 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 0 0.00 Backend conn. failures
backend_reuse 16621 0.97 Backend conn. reuses
backend_toolate 102 0.01 Backend conn. was closed
backend_recycle 16727 0.97 Backend conn. recycles
backend_retry 0 0.00 Backend conn. retry
fetch_head 0 0.00 Fetch head
fetch_length 5920 0.34 Fetch with Length
fetch_chunked 10807 0.63 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 0 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
fetch_1xx 0 0.00 Fetch no body (1xx)
fetch_204 0 0.00 Fetch no body (204)
fetch_304 0 0.00 Fetch no body (304)
n_sess_mem 23 . N struct sess_mem
n_sess 5 . N struct sess
n_object 0 . N struct object
n_vampireobject 0 . N unresurrected objects
n_objectcore 9 . N struct objectcore
n_objecthead 9 . N struct objecthead
n_waitinglist 9 . N struct waitinglist
n_vbc 4 . N struct vbc
n_wrk 200 . N worker threads
n_wrk_create 200 0.01 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 0 0.00 N worker threads limited
n_wrk_lqueue 0 0.00 work request queue length
n_wrk_queued 0 0.00 N queued work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 2 . N backends
n_expired 0 . N expired objects
n_lru_nuked 0 . N LRU nuked objects
n_lru_moved 0 . N LRU moved objects
losthdr 0 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 37957 2.21 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 16258 0.95 Total Sessions
s_req 23568 1.37 Total Requests
s_pipe 0 0.00 Total pipe
s_pass 16727 0.97 Total pass
s_fetch 16727 0.97 Total fetch
s_hdrbytes 2988339 173.95 Total header bytes
s_bodybytes 2500675296 145565.82 Total body bytes
sess_closed 7191 0.42 Session Closed
sess_pipeline 0 0.00 Session Pipeline
sess_readahead 0 0.00 Session Read Ahead
sess_linger 16726 0.97 Session Linger
sess_herd 14469 0.84 Session herd
shm_records 1266876 73.75 SHM records
shm_writes 138871 8.08 SHM writes
shm_flushes 0 0.00 SHM flushes due to overflow
shm_cont 691 0.04 SHM MTX contention
shm_cycles 0 0.00 SHM cycles through buffer
sms_nreq 6841 0.40 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 89378 . SMS bytes allocated
sms_bfree 89378 . SMS bytes freed
backend_req 16727 0.97 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_ban 1 . N total active bans
n_ban_gone 1 . N total gone bans
n_ban_add 1 0.00 N new bans added
n_ban_retire 0 0.00 N old bans deleted
n_ban_obj_test 0 0.00 N objects tested
n_ban_re_test 0 0.00 N regexps tested against
n_ban_dups 0 0.00 N duplicate bans removed
hcb_nolock 0 0.00 HCB Lookups without lock
hcb_lock 0 0.00 HCB Lookups with lock
hcb_insert 0 0.00 HCB Inserts
esi_errors 0 0.00 ESI parse errors (unlock)
esi_warnings 0 0.00 ESI parse warnings (unlock)
accept_fail 0 0.00 Accept failures
client_drop_late 0 0.00 Connection dropped late
uptime 17179 1.00 Client uptime
dir_dns_lookups 0 0.00 DNS director lookups
dir_dns_failed 0 0.00 DNS director failed lookups
dir_dns_hit 0 0.00 DNS director cached lookups
hit
dir_dns_cache_full 0 0.00 DNS director full dnscache
vmods 0 . Loaded VMODs
n_gzip 0 0.00 Gzip operations
n_gunzip 0 0.00 Gunzip operations
LCK.sms.creat 3 0.00 Created locks
LCK.sms.destroy 0 0.00 Destroyed locks
LCK.sms.locks 181323 10.55 Lock Operations
LCK.sms.colls 0 0.00 Collisions
LCK.smp.creat 0 0.00 Created locks
LCK.smp.destroy 0 0.00 Destroyed locks
LCK.smp.locks 0 0.00 Lock Operations
LCK.smp.colls 0 0.00 Collisions
LCK.sma.creat 6 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 948103 55.19 Lock Operations
LCK.sma.colls 0 0.00 Collisions
LCK.smf.creat 0 0.00 Created locks
LCK.smf.destroy 0 0.00 Destroyed locks
LCK.smf.locks 0 0.00 Lock Operations
LCK.smf.colls 0 0.00 Collisions
LCK.hsl.creat 0 0.00 Created locks
LCK.hsl.destroy 0 0.00 Destroyed locks
LCK.hsl.locks 0 0.00 Lock Operations
LCK.hsl.colls 0 0.00 Collisions
LCK.hcb.creat 3 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 845 0.05 Lock Operations
LCK.hcb.colls 0 0.00 Collisions
LCK.hcl.creat 0 0.00 Created locks
LCK.hcl.destroy 0 0.00 Destroyed locks
LCK.hcl.locks 0 0.00 Lock Operations
LCK.hcl.colls 0 0.00 Collisions
LCK.vcl.creat 3 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 39 0.00 Lock Operations
LCK.vcl.colls 0 0.00 Collisions
LCK.stat.creat 3 0.00 Created locks
LCK.stat.destroy 0 0.00 Destroyed locks
LCK.stat.locks 146498 8.53 Lock Operations
LCK.stat.colls 0 0.00 Collisions
LCK.sessmem.creat 3 0.00 Created locks
LCK.sessmem.destroy 0 0.00 Destroyed locks
LCK.sessmem.locks 154696 9.00 Lock Operations
LCK.sessmem.colls 0 0.00 Collisions
LCK.wstat.creat 3 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 304602 17.73 Lock Operations
LCK.wstat.colls 0 0.00 Collisions
LCK.herder.creat 3 0.00 Created locks
LCK.herder.destroy 0 0.00 Destroyed locks
LCK.herder.locks 3 0.00 Lock Operations
LCK.herder.colls 0 0.00 Collisions
LCK.wq.creat 6 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 687062 39.99 Lock Operations
LCK.wq.colls 0 0.00 Collisions
LCK.objhdr.creat 32 0.00 Created locks
LCK.objhdr.destroy 0 0.00 Destroyed locks
LCK.objhdr.locks 0 0.00 Lock Operations
LCK.objhdr.colls 0 0.00 Collisions
LCK.exp.creat 3 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 151859 8.84 Lock Operations
LCK.exp.colls 0 0.00 Collisions
LCK.lru.creat 6 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 0 0.00 Lock Operations
LCK.lru.colls 0 0.00 Collisions
LCK.cli.creat 3 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 50672 2.95 Lock Operations
LCK.cli.colls 0 0.00 Collisions
LCK.ban.creat 3 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 151878 8.84 Lock Operations
LCK.ban.colls 0 0.00 Collisions
LCK.vbp.creat 3 0.00 Created locks
LCK.vbp.destroy 0 0.00 Destroyed locks
LCK.vbp.locks 60615 3.53 Lock Operations
LCK.vbp.colls 0 0.00 Collisions
LCK.vbe.creat 3 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 1969 0.11 Lock Operations
LCK.vbe.colls 0 0.00 Collisions
LCK.backend.creat 6 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 298061 17.35 Lock Operations
LCK.backend.colls 0 0.00 Collisions
SMA.s0.c_req 6841 0.40 Allocator requests
SMA.s0.c_fail 0 0.00 Allocator failures
SMA.s0.c_bytes 233360192 13584.04 Bytes allocated
SMA.s0.c_freed 233360192 13584.04 Bytes freed
SMA.s0.g_alloc 0 . Allocations outstanding
SMA.s0.g_bytes 0 . Bytes outstanding
SMA.s0.g_space 3221225472 . Bytes available
SMA.Transient.c_req 47843 2.78 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 3323928244 193487.88 Bytes allocated
SMA.Transient.c_freed 3323928244 193487.88 Bytes freed
SMA.Transient.g_alloc 0 . Allocations outstanding
SMA.Transient.g_bytes 0 . Bytes outstanding
SMA.Transient.g_space 0 . Bytes available
VBE.preprod_001(172.19.xx.xx,,8080).vcls 3 . VCL
references
VBE.preprod_001(172.19.xx.xx,,8080).happy18446744073709551615 .
Happy health probes
VBE.preprod_002(172.19.xx.yy,,8080).vcls 3 . VCL
references
VBE.preprod_002(172.19.xx.yy,,8080).happy18446744073709551615 .
Happy health probes
}}}


This server is not caching anything, but balance traffic between couples
backends.

Apparently, there is no huge load on the box during the crash:
memory:
{{{
05:20:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree
kbswpused %swpused kbswpcad
05:30:01 AM 449312 3744992 89.29 256468 2058492 2097144
0 0.00 0
05:40:01 AM 448924 3745380 89.30 256468 2058616 2097144
0 0.00 0
05:50:01 AM 442336 3751968 89.45 256496 2066160 2097144
0 0.00 0
}}}

CPU:
{{{
05:20:01 AM CPU %user %nice %system %iowait %steal
%idle
05:30:01 AM all 0.09 0.00 0.15 14.53 0.02
85.21
05:40:01 AM all 0.09 0.00 0.12 3.58 0.02
96.19
05:50:01 AM all 0.03 0.00 0.12 3.62 0.02
96.21
}}}

Disk IO:
{{{
05:20:01 AM tps rtps wtps bread/s bwrtn/s
05:30:01 AM 19.23 0.00 19.23 0.00 199.72
05:40:01 AM 9.37 0.00 9.37 0.00 110.86
05:50:01 AM 16.73 0.00 16.73 0.00 169.21
}}}

If you need more details please do not hesitate to contact me.

Thank you!

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:17>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jan 28, 2013, 4:24 AM

Post #19 of 24 (1195 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: reopened
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------

Comment (by lkarsten):

Thank you for the information.

Can you please increase cli_timeout and see if this problem persists?

30 seconds should be sufficient.

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:18>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jan 30, 2013, 2:24 AM

Post #20 of 24 (1198 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: reopened
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------

Comment (by andrii.grytsenko):

Thank you for your answer.

I increased cli_timeout to 30 seconds but child keeps failing anyway.

{{{
Jan 29 12:59:00 proxy-001 varnishd[14728]: CLI telnet 127.0.0.1 57463
127.0.0.1 6082 Wr 200 cli_timeout 30 [seconds]#012
Default is 10#012 Timeout for the childs
replies to CLI requests#012 from the master.
Jan 29 12:59:04 proxy-001 varnishd[14728]: CLI telnet 127.0.0.1 57463
127.0.0.1 6082 Rd quit
Jan 29 12:59:04 proxy-001 varnishd[14728]: CLI telnet 127.0.0.1 57463
127.0.0.1 6082 Wr 500 Closing CLI connection
Jan 30 04:39:31 proxy-001 varnishd[14728]: Child (31581) not responding to
CLI, killing it.
Jan 30 04:39:32 proxy-001 varnishd[14728]: Child (31581) not responding to
CLI, killing it.
Jan 30 04:39:32 proxy-001 varnishd[14728]: Child (31581) died signal=3
Jan 30 04:39:32 proxy-001 varnishd[14728]: child (16605) Started
Jan 30 04:39:32 proxy-001 varnishd[14728]: Child (16605) said Child starts
}}}

by the way varnish is run inside xen virtual machine, may be is has
something to do with this crashes.

Thanks

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:19>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jun 5, 2013, 6:06 AM

Post #21 of 24 (845 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: reopened
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------

Comment (by andrii.grytsenko):

Hi,

The problem is still exist, but now even worse, because every now and then
child process is killed by parent and has not been re-spawned back.

{{{
May 26 05:01:49 proxy-001 varnishd[20918]: child (19143) Started
May 26 05:01:49 proxy-001 varnishd[20918]: Child (19143) said Child starts
May 26 05:02:17 proxy-001 varnishd[20918]: Child (19143) not responding to
CLI, killing it.
May 26 05:02:20 proxy-001 varnishd[20918]: Child (19143) not responding to
CLI, killing it.
May 26 05:02:20 proxy-001 varnishd[20918]: Child (19143) died signal=3
May 26 05:02:20 proxy-001 varnishd[20918]: child (19977) Started
May 26 05:02:30 proxy-001 varnishd[20918]: Pushing vcls failed:#012CLI
communication error (hdr)
May 26 05:02:30 proxy-001 varnishd[20918]: Child (19977) said Child starts
May 26 05:02:35 proxy-001 varnishd[20918]: Child (19977) said Child dies
May 26 05:02:35 proxy-001 varnishd[20918]: Child (19977) died status=1
}}}

port is 08 is not listened

{{{
PROD/root [at] proxy-00:~$ netstat -lnp |grep 80
tcp 0 0 0.0.0.0:56806 0.0.0.0:*
LISTEN -
}}}

and only parent process is running

{{{
root 20918 0.0 0.0 110296 1084 ? Ss May16 0:00
/usr/sbin/varnishd -P /var/run/varnish.pid -a :80 -f
/etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -w 400,1000,120 -u
varnish -g varnish -s malloc,512M -h critbit -p thread_pools 2 -p
thread_pool_add_delay 2 -p listen_depth 1024 -p session_linger 50 -p
lru_interval 2 -p shm_reclen 255 -p sess_timeout 30
}}}


when stracing parent process it's doing nothing but waiting for some
external event or something:

{{{
strace -f -p 20918
Process 20918 attached - interrupt to quit
poll([{fd=6, events=POLLIN}], 1, -1
}}}

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:20>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jun 5, 2013, 9:06 AM

Post #22 of 24 (839 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: reopened
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------

Comment (by igorm):

Hi,

i have exactly a same problem.


{{{
Jun 5 17:02:19 mweb2 varnishd[25112]: Child (23672) not responding to
CLI, killing it.
Jun 5 17:02:29 mweb2 varnishd[25112]: Child (23672) not responding to
CLI, killing it.
Jun 5 17:02:31 mweb2 varnishd[25112]: Child (23672) not responding to
CLI, killing it.
Jun 5 17:02:31 mweb2 varnishd[25112]: Child (23672) died signal=3 (core
dumped)
Jun 5 17:02:31 mweb2 varnishd[25112]: Child cleanup complete
Jun 5 17:02:31 mweb2 varnishd[25112]: child (23889) Started
Jun 5 17:02:32 mweb2 varnishd[25112]: Child (23889) said Child starts
Jun 5 17:02:56 mweb2 varnishd[25112]: Child (23889) not responding to
CLI, killing it.
Jun 5 17:03:06 mweb2 varnishd[25112]: Child (23889) not responding to
CLI, killing it.
Jun 5 17:03:16 mweb2 varnishd[25112]: Child (23889) not responding to
CLI, killing it.
Jun 5 17:03:17 mweb2 varnishd[25112]: Child (23889) not responding to
CLI, killing it.
Jun 5 17:03:17 mweb2 varnishd[25112]: Child (23889) died signal=3 (core
dumped)
Jun 5 17:03:17 mweb2 varnishd[25112]: Child cleanup complete
Jun 5 17:03:17 mweb2 varnishd[25112]: child (24111) Started
Jun 5 17:03:27 mweb2 varnishd[25112]: Pushing vcls failed:#012CLI
communication error (hdr)
Jun 5 17:03:27 mweb2 varnishd[25112]: Stopping Child
Jun 5 17:03:27 mweb2 varnishd[25112]: Child (24111) said Child starts
Jun 5 17:04:03 mweb2 varnishd[25112]: Child (24111) said Child dies
Jun 5 17:04:19 mweb2 varnishd[25112]: Child (24111) died status=1
Jun 5 17:04:19 mweb2 varnishd[25112]: Child cleanup complete
}}}

Any suggestions?

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:21>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jun 6, 2013, 9:18 AM

Post #23 of 24 (840 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: reopened
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution:
Keywords: |
---------------------------+-----------------------

Comment (by igorm):

Similar problem few minutes ago but on different server

{{{
Jun 6 18:08:16 mweb1 varnishd[1278]: Child (1279) not responding to CLI,
killing it.
Jun 6 18:08:16 mweb1 varnishd[1278]: Child (1279) not responding to CLI,
killing it.
Jun 6 18:08:16 mweb1 varnishd[1278]: Child (1279) died signal=6 (core
dumped)
}}}

After the warning in syslog, i also saw next exception but it's probably a
different issue

{{{
Jun 6 18:08:16 mweb1 varnishd[1278]: Child (1279) Panic message: Assert
error in vfp_esi_bytes_gg(), cache_esi_fetch.c line 274:#012 Condition(i
ent =
Linux,3.2.0-23-generic,x86_64,-smalloc,-smalloc,-hcritbit,epoll#012Backtrace:#012
0x4310e5: /usr/sbin/varnishd() [0x4310e5]#012 0x41e76d:
/usr/sbin/varnishd() [0x41e76d]#012 0x41e908: /usr/sbin/v
arnishd() [0x41e908]#012 0x4257ac: /usr/sbin/varnishd(FetchBody+0x2ec)
[0x4257ac]#012 0x4175c8: /usr/sbin/varnishd() [0x4175c8]#012 0x4192ed:
/usr/sbin/varnishd(CNT_Session+0xdad) [0x4192ed]#012 0x432
ee5: /usr/sbin/varnishd() [0x432ee5]#012 0x7f2edbfe0e9a: /lib/x86_64
-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f2edbfe0e9a]#012 0x7f2edbd0e4bd:
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f2edbd0e4bd]#
012sp = 0x7f2ed3844008 {#012 fd = 15, id = 15, xid = 1585814891,#012
client = 77.247.201.44 47308,#012 step = STP_FETCHBODY,#012 handling =
hit_for_pass,#012 err_code = 206, err_reason = (null),#012
restarts = 0, esi_level = 0#012 flags = do_esi is_gzip#012 bodystatus
= 4#012 ws = 0x7f2ed3844080 { #012 id = "sess",#012 {s,f,r,e} =
{0x7f2ed3844c78,+584,(nil),+262144},#012 },#012...
}}}

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:22>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs


varnish-bugs at varnish-cache

Jun 10, 2013, 3:30 AM

Post #24 of 24 (842 views)
Permalink
Re: #1054: Child not responding to CLI, killing it [In reply to]

#1054: Child not responding to CLI, killing it
---------------------------+-----------------------
Reporter: scorillo | Owner: lkarsten
Type: defect | Status: closed
Priority: normal | Milestone:
Component: documentation | Version: 3.0.2
Severity: normal | Resolution: invalid
Keywords: |
---------------------------+-----------------------
Changes (by phk):

* status: reopened => closed
* resolution: => invalid


Comment:

I am closing this ticket as "invalid" for the lack of a better
classification.

At this point, this ticket is not actionable for us, the underlying issues
we have identified have been fixed, while other issues are local
performance issue.

Please don't reopen this ticket, open a new instead.

--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:23>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator

_______________________________________________
varnish-bugs mailing list
varnish-bugs [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-bugs

Varnish bugs RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.