
Mailing List Archive: Varnish: Dev

malloc storage memory leak?

 

 



cal at fbsdata

Jul 19, 2012, 12:08 PM

Post #1 of 7 (962 views)
malloc storage memory leak?

Hi everyone,

I just upgraded to Varnish 3.0.2, and changed the storage method to malloc,
with a size limitation of 5GB on a machine with 6GB of physical memory.
Varnish doesn't appear to be using the size limitation, and is consuming
memory until the machine dips into swap and becomes unresponsive.

Here are my options to varnishd:

varnishd -P /var/run/varnish.pid
-a :80
-f /etc/varnish/flexmaps.vcl
-T :6082
-t 120
-w 8,4000,120
-u apache -g apache
-p thread_pools 4
-p listen_depth 4096
-p lru_interval 3600
-h classic,70001
-s malloc,5000M

I've tried to lower the malloc setting, and it still seems to consume a ton
of memory. Am I doing something wrong here, or is this a leak?

Thank you,

--Cal
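The question above boils down to what the malloc stevedore has actually allocated, which the g_bytes gauges in varnishstat answer directly (Varnish 3.x counter names). A minimal sketch, run here against a captured sample rather than a live instance:

```shell
# Filter the stevedore gauges out of a varnishstat -1 dump.  The sample
# below is captured output; on a live box you would pipe `varnishstat -1`
# into the same grep instead.
sample='SMA.s0.g_bytes          5595155188  .  Bytes outstanding
SMA.s0.g_space           172012812  .  Bytes available
SMA.Transient.g_bytes             0  .  Bytes outstanding'
printf '%s\n' "$sample" | grep 'g_bytes'
```

SMA.s0.g_bytes is the number to compare against the -s malloc limit; SMA.Transient.g_bytes shows any unbounded transient storage on top of it.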


apj at mutt

Jul 19, 2012, 12:20 PM

Post #2 of 7 (936 views)
Re: malloc storage memory leak? [In reply to]

On Thu, Jul 19, 2012 at 02:08:20PM -0500, Cal Heldenbrand wrote:
>
> I just upgraded to Varnish 3.0.2, and changed the storage method to malloc,
> with a size limitation of 5GB on a machine with 6GB of physical memory.
> Varnish doesn't appear to be using the size limitation, and is consuming
> memory until the machine dips into swap and becomes unresponsive.

Please attach a varnishstat. A copy of your vcl will probably be helpful too.

--
Andreas

_______________________________________________
varnish-dev mailing list
varnish-dev [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev


cal at fbsdata

Jul 19, 2012, 2:11 PM

Post #3 of 7 (935 views)
Re: malloc storage memory leak? [In reply to]

Sure, no problem. I'll show a few more details below too. For a quick
rundown of my application: I'm placing Varnish in front of a big mapping
system. We have about 22TB in mapping tiles, and I have 4 Varnish machines
sitting in front of them, with 6GB of memory each. A load balancer
distributes the zoom levels between two clusters (two members each) in an
attempt to balance my hit ratio and hit traffic. It works pretty well; the
LRU seems to keep my hit ratio at around 60%, which still results in very
quick map loads for most views. I was previously using the *file* storage,
but it just doesn't make sense to stick with that in my application. The
backend NFS server is quicker than the VM images that Varnish is running
on, so I decided to try out the malloc storage and ran into this problem.


Here's a screenshot of top. This machine was set to malloc max at 5500MB.
[image: Inline image 1]

Here's a cacti graph of me trying to lower the malloc size over the last few
days. Even set at 4000MB, it climbs up to 6GB and crashes the box.

[image: Inline image 1]

Here's the output of varnishstat -1 on a machine during the time when it
started to swap out. (IP addresses removed)
---------------------------------------------------------------------------------------------------------------------------------------------------
# varnishstat -1
client_conn 4516253 24.35 Client connections accepted
client_drop 0 0.00 Connection dropped, no sess/wrk
client_req 9665270 52.10 Client requests received
cache_hit 8866434 47.80 Cache hits
cache_hitpass 0 0.00 Cache hits for pass
cache_miss 798835 4.31 Cache misses
backend_conn 96001 0.52 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 5 0.00 Backend conn. failures
backend_reuse 702842 3.79 Backend conn. reuses
backend_toolate 95484 0.51 Backend conn. was closed
backend_recycle 798339 4.30 Backend conn. recycles
backend_retry 7 0.00 Backend conn. retry
fetch_head 0 0.00 Fetch head
fetch_length 638890 3.44 Fetch with Length
fetch_chunked 159945 0.86 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 0 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
fetch_1xx 0 0.00 Fetch no body (1xx)
fetch_204 0 0.00 Fetch no body (204)
fetch_304 0 0.00 Fetch no body (304)
n_sess_mem 512 . N struct sess_mem
n_sess 318 . N struct sess
n_object 798209 . N struct object
n_vampireobject 0 . N unresurrected objects
n_objectcore 798256 . N struct objectcore
n_objecthead 798256 . N struct objecthead
n_waitinglist 2017 . N struct waitinglist
n_vbc 15 . N struct vbc
n_wrk 49 . N worker threads
n_wrk_create 1039 0.01 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 3421 0.02 N worker threads limited
n_wrk_lqueue 0 0.00 work request queue length
n_wrk_queued 20376 0.11 N queued work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 8 . N backends
n_expired 626 . N expired objects
n_lru_nuked 0 . N LRU nuked objects
n_lru_moved 2011016 . N LRU moved objects
losthdr 0 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 9650020 52.02 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 4516229 24.35 Total Sessions
s_req 9665270 52.10 Total Requests
s_pipe 1 0.00 Total pipe
s_pass 0 0.00 Total pass
s_fetch 798835 4.31 Total fetch
s_hdrbytes 3810296042 20540.68 Total header bytes
s_bodybytes 67134175568 361909.30 Total body bytes
sess_closed 121965 0.66 Session Closed
sess_pipeline 0 0.00 Session Pipeline
sess_readahead 0 0.00 Session Read Ahead
sess_linger 9634396 51.94 Session Linger
sess_herd 9535551 51.40 Session herd
shm_records 459505182 2477.12 SHM records
shm_writes 35192359 189.72 SHM writes
shm_flushes 0 0.00 SHM flushes due to overflow
shm_cont 6447 0.03 SHM MTX contention
shm_cycles 208 0.00 SHM cycles through buffer
sms_nreq 0 0.00 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 0 . SMS bytes allocated
sms_bfree 0 . SMS bytes freed
backend_req 798844 4.31 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_ban 1 . N total active bans
n_ban_add 1 0.00 N new bans added
n_ban_retire 0 0.00 N old bans deleted
n_ban_obj_test 0 0.00 N objects tested
n_ban_re_test 0 0.00 N regexps tested against
n_ban_dups 0 0.00 N duplicate bans removed
hcb_nolock 0 0.00 HCB Lookups without lock
hcb_lock 0 0.00 HCB Lookups with lock
hcb_insert 0 0.00 HCB Inserts
esi_errors 0 0.00 ESI parse errors (unlock)
esi_warnings 0 0.00 ESI parse warnings (unlock)
accept_fail 0 0.00 Accept failures
client_drop_late 0 0.00 Connection dropped late
uptime 185500 1.00 Client uptime
dir_dns_lookups 0 0.00 DNS director lookups
dir_dns_failed 0 0.00 DNS director failed lookups
dir_dns_hit 0 0.00 DNS director cached lookups hit
dir_dns_cache_full 0 0.00 DNS director full dnscache
vmods 0 . Loaded VMODs
n_gzip 0 0.00 Gzip operations
n_gunzip 0 0.00 Gunzip operations
LCK.sms.creat 1 0.00 Created locks
LCK.sms.destroy 0 0.00 Destroyed locks
LCK.sms.locks 0 0.00 Lock Operations
LCK.sms.colls 0 0.00 Collisions
LCK.smp.creat 0 0.00 Created locks
LCK.smp.destroy 0 0.00 Destroyed locks
LCK.smp.locks 0 0.00 Lock Operations
LCK.smp.colls 0 0.00 Collisions
LCK.sma.creat 2 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 1759151 9.48 Lock Operations
LCK.sma.colls 0 0.00 Collisions
LCK.smf.creat 0 0.00 Created locks
LCK.smf.destroy 0 0.00 Destroyed locks
LCK.smf.locks 0 0.00 Lock Operations
LCK.smf.colls 0 0.00 Collisions
LCK.hsl.creat 0 0.00 Created locks
LCK.hsl.destroy 0 0.00 Destroyed locks
LCK.hsl.locks 0 0.00 Lock Operations
LCK.hsl.colls 0 0.00 Collisions
LCK.hcb.creat 0 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 0 0.00 Lock Operations
LCK.hcb.colls 0 0.00 Collisions
LCK.hcl.creat 70001 0.38 Created locks
LCK.hcl.destroy 0 0.00 Destroyed locks
LCK.hcl.locks 18532340 99.90 Lock Operations
LCK.hcl.colls 0 0.00 Collisions
LCK.vcl.creat 1 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 3999 0.02 Lock Operations
LCK.vcl.colls 0 0.00 Collisions
LCK.stat.creat 1 0.00 Created locks
LCK.stat.destroy 0 0.00 Destroyed locks
LCK.stat.locks 512 0.00 Lock Operations
LCK.stat.colls 0 0.00 Collisions
LCK.sessmem.creat 1 0.00 Created locks
LCK.sessmem.destroy 0 0.00 Destroyed locks
LCK.sessmem.locks 4533824 24.44 Lock Operations
LCK.sessmem.colls 0 0.00 Collisions
LCK.wstat.creat 1 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 575635 3.10 Lock Operations
LCK.wstat.colls 0 0.00 Collisions
LCK.herder.creat 1 0.00 Created locks
LCK.herder.destroy 0 0.00 Destroyed locks
LCK.herder.locks 17392 0.09 Lock Operations
LCK.herder.colls 0 0.00 Collisions
LCK.wq.creat 4 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 20053031 108.10 Lock Operations
LCK.wq.colls 0 0.00 Collisions
LCK.objhdr.creat 799708 4.31 Created locks
LCK.objhdr.destroy 1454 0.01 Destroyed locks
LCK.objhdr.locks 20929870 112.83 Lock Operations
LCK.objhdr.colls 0 0.00 Collisions
LCK.exp.creat 1 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 984222 5.31 Lock Operations
LCK.exp.colls 0 0.00 Collisions
LCK.lru.creat 2 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 798835 4.31 Lock Operations
LCK.lru.colls 0 0.00 Collisions
LCK.cli.creat 1 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 61755 0.33 Lock Operations
LCK.cli.colls 0 0.00 Collisions
LCK.ban.creat 1 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 984223 5.31 Lock Operations
LCK.ban.colls 0 0.00 Collisions
LCK.vbp.creat 1 0.00 Created locks
LCK.vbp.destroy 0 0.00 Destroyed locks
LCK.vbp.locks 623084 3.36 Lock Operations
LCK.vbp.colls 0 0.00 Collisions
LCK.vbe.creat 1 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 191999 1.04 Lock Operations
LCK.vbe.colls 0 0.00 Collisions
LCK.backend.creat 8 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 2779529 14.98 Lock Operations
LCK.backend.colls 0 0.00 Collisions
SMA.s0.c_req 1596704 8.61 Allocator requests
SMA.s0.c_fail 0 0.00 Allocator failures
SMA.s0.c_bytes 23077754159 124408.38 Bytes allocated
SMA.s0.c_freed 17482598971 94245.82 Bytes freed
SMA.s0.g_alloc 1596704 . Allocations outstanding
SMA.s0.g_bytes 5595155188 . Bytes outstanding
SMA.s0.g_space 172012812 . Bytes available
SMA.Transient.c_req 1252 0.01 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 556514 3.00 Bytes allocated
SMA.Transient.c_freed 556514 3.00 Bytes freed
SMA.Transient.g_alloc 0 . Allocations outstanding
SMA.Transient.g_bytes 0 . Bytes outstanding
SMA.Transient.g_space 0 . Bytes available
VBE.mapweb11(,,80).vcls 1 . VCL references
VBE.mapweb11(,,80).happy 0 . Happy health probes
VBE.mapweb1(,,80).vcls 1 . VCL references
VBE.mapweb1(,,80).happy 18446744073709551615 . Happy health probes
VBE.mapweb2(,,80).vcls 1 . VCL references
VBE.mapweb2(,,80).happy 18446744073709551615 . Happy health probes
VBE.mapweb3(,,80).vcls 1 . VCL references
VBE.mapweb3(,,80).happy 18446744073709551615 . Happy health probes
VBE.mapweb7(,,80).vcls 1 . VCL references
VBE.mapweb7(,,80).happy 18446744073709551615 . Happy health probes
VBE.mapweb8(,,80).vcls 1 . VCL references
VBE.mapweb8(,,80).happy 18446744073709551615 . Happy health probes
VBE.mapweb9(,,80).vcls 1 . VCL references
VBE.mapweb9(,,80).happy 18446744073709551615 . Happy health probes
VBE.mapweb10(,,80).vcls 1 . VCL references
VBE.mapweb10(,,80).happy 18446744073709551615 . Happy health probes
---------------------------------------------------------------------------------------------------------------------------------------------------


Mapping system VCL config. I've snipped the unimportant stuff and changed
my names around a bit:
---------------------------------------------------------------------------------------------------------------------------------------------------
backend mapweb11 {
  .host = "...";
  .port = "80";
  .probe = {
    .url = "/ka-map/images/a_pixel.gif";
    .interval = 5s;
    .timeout = 500ms;
    .window = 5;
    .threshold = 3;
  }
}

# ... More backends defined here ...

director maps round-robin {
  { .backend = mapweb1; }
  { .backend = mapweb2; }
  { .backend = mapweb3; }
  { .backend = mapweb7; }
  { .backend = mapweb8; }
  { .backend = mapweb9; }
  { .backend = mapweb10; }
}
sub vcl_recv {
  set req.backend = maps;

  if (req.request != "GET" &&
      req.request != "HEAD" &&
      req.request != "PUT" &&
      req.request != "POST" &&
      req.request != "TRACE" &&
      req.request != "OPTIONS" &&
      req.request != "DELETE") {
    /* Non-RFC2616 or CONNECT which is weird. */
    return (pipe);
  }

  if (req.request != "GET" && req.request != "HEAD") {
    /* We only deal with GET and HEAD by default */
    return (pass);
  }

  # Don't cache any shapegen calls, they're dynamic
  if (req.url ~ "^/cgi-bin/shapegen") {
    return (pass);
  }

  # Change anything *.mydomain.com to just mydomain.com
  # to avoid duplications in the cache
  if (req.http.host ~ "mydomain.com") {
    set req.http.host = "mydomain.com";
  }

  # for mapping grace period
  set req.grace = 30s;
  return (lookup);
}
sub vcl_fetch {
  if (beresp.ttl <= 0s ||
      beresp.http.Set-Cookie ||
      beresp.http.Vary == "*") {
    return (hit_for_pass);
  }

  /* Remove Expires from backend, it's not long enough */
  unset beresp.http.expires;

  /* Set the clients' TTL on this object */
  set beresp.http.Cache-Control = "max-age=604800";

  /* Set how long Varnish will keep it */
  set beresp.ttl = 1w;

  /* marker for vcl_deliver to reset Age: */
  set beresp.http.magicmarker = "1";
  set beresp.grace = 30s;
  return (deliver);
}
sub vcl_deliver {
  # Added by Cal to easily see a HIT / MISS in the headers
  if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }

  if (resp.http.magicmarker) {
    /* Remove the magic marker */
    unset resp.http.magicmarker;

    /* By definition we have a fresh object */
    set resp.http.age = "0";
  }

  # Set a nice header to show which varnish server this comes from
  set resp.http.X-Hostname = regsub(server.identity, ".mydomain.com", "");

  return (deliver);
}
---------------------------------------------------------------------------------------------------------------------------------------------------

Thanks for any help!

--Cal



Attachments: top.jpeg (30.1 KB)
  mv4.png (47.9 KB)


apj at mutt

Jul 23, 2012, 12:39 AM

Post #4 of 7 (915 views)
Re: malloc storage memory leak? [In reply to]

On Thu, Jul 19, 2012 at 04:11:20PM -0500, Cal Heldenbrand wrote:
>
> Here's a screenshot of top. This machine was set to malloc max at 5500MB.

And it hasn't passed that. Remember that there's additional overhead of about
1KB/object outside the actual storage backend.
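That overhead estimate can be sanity-checked against the varnishstat numbers above. A rough back-of-the-envelope sketch, using Andreas's ~1KB/object figure (an estimate, not a measured value):

```shell
# Rough memory budget: stevedore bytes plus ~1 KB per cached object.
# n_object and SMA.s0.g_bytes are taken from the varnishstat dump above.
objects=798209
storage_bytes=5595155188
total=$(awk -v n="$objects" -v s="$storage_bytes" 'BEGIN {
    overhead = n * 1024                        # ~1 KB/object, estimated
    printf "%.1f", (s + overhead) / 1024 ^ 3   # GiB
}')
echo "estimated footprint: ${total} GiB"
```

That lands right around the 6GB of physical memory on the box, which explains the swapping without any leak in the stevedore itself.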

> SMA.s0.g_bytes 5595155188 . Bytes outstanding

It has allocated 5.5G

> SMA.Transient.g_bytes 0 . Bytes outstanding

And isn't gobbling up transient space at the moment.

I don't see a problem (in varnish at least).

--
Andreas

_______________________________________________
varnish-dev mailing list
varnish-dev [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev


cal at fbsdata

Jul 23, 2012, 7:41 AM

Post #5 of 7 (916 views)
Re: malloc storage memory leak? [In reply to]

Thanks for the clarification Andreas. I'm running around 600k objects in
memory, which should be around 586MB of overhead.

If I could ask you guys your opinion -- is there a better way to configure
Varnish for my environment? My backend is 22TB of mapping tiles, each file
being anywhere from 100 bytes to 3KB. So a small 4GB cache ends up acting
as just an LRU of the most popular tiles, which makes outer zoom levels
very fast but misses on almost all of the low parcel levels.

Thanks for any advice!

--Cal


drwilco at drwilco

Jul 23, 2012, 8:20 AM

Post #6 of 7 (899 views)
Re: malloc storage memory leak? [In reply to]

I would go out and drop 700 bucks on 4x16GB for my server if I were you.

http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=100007952 600336949&IsNodeId=1&name=64GB (4 x 16GB)

22TB of tiles is quite a bit, so maybe go through the logfiles and see
exactly how much having 50GB of cache would gain you. And if the
zoom level is in the URL of the tiles, I would maybe choose to cache just
the outer N levels, so that inner levels don't blow away the cache of the
outer ones. Or run two Varnishes: one caching outer levels, like I just
described, with a set amount of memory, and a second one with a smaller
set, just to catch those requests for inner tiles that somehow are very
popular.
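The outer-levels-only idea could be sketched in VCL along these lines. This is only a sketch: the /tiles/<zoom>/... URL layout and the cutoff at level 10 are made-up assumptions, not Cal's actual scheme:

```vcl
sub vcl_recv {
  # Assumed (hypothetical) URL layout: /tiles/<zoom>/<x>/<y>.png
  if (req.url ~ "^/tiles/([0-9]|10)/") {
    return (lookup);   # outer zoom levels: cache as usual
  }
  # inner zoom levels: let them go straight to the backend
  return (pass);
}
```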

Which brings me back to the fact that this is a -dev list (so your
question really doesn't belong here), and I just had an idea.

On this storage-separation idea: I'm a bit behind on 3/master, so do we have
stevedores exposed to VCL in any way? I recall storage hints, but I'm not
sure if anything uses that beyond Transient.

Anyhoo, it might be worth it to segregate objects by various properties
and not have to run separate varnish instances. :)

Cheers,

Doc



_______________________________________________
varnish-dev mailing list
varnish-dev [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev


phk at phk

Jul 30, 2012, 8:13 AM

Post #7 of 7 (896 views)
Re: malloc storage memory leak? [In reply to]

In message <20120723170907.D37583 [at] ishtar>, "Rogier R. Mulhuijzen" writes:

>This separating storage deal, I'm a bit behind on 3/master, so do we have
>stevedores exposed to VCL in any way? I recall storage hints, but not sure
>if there's anything using that beyond transient?

Yes, there is a storage hint: you can pick and choose which stevedore
you want to prefer. I'm not sure it will do all you need in this case,
but I think it would help.
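For the archives, the Varnish 3 storage hint looks roughly like this. A sketch only: the stevedore names (outer, inner), the sizes, and the URL test are invented for illustration:

```vcl
# varnishd started with two named malloc stevedores (hypothetical sizes):
#   varnishd ... -s outer=malloc,4G -s inner=malloc,1G

sub vcl_fetch {
  if (req.url ~ "^/tiles/([0-9]|10)/") {
    set beresp.storage = "outer";   # outer zoom levels get the big pool
  } else {
    set beresp.storage = "inner";   # everything else, the small pool
  }
}
```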

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk [at] FreeBSD | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

_______________________________________________
varnish-dev mailing list
varnish-dev [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
