Mailing List Archive: Varnish: Dev

rdbms as backend

 

 



mrkafk at gmail

Jul 30, 2013, 3:58 AM

Post #1 of 10 (66 views)
rdbms as backend

Hello,

This is a peculiar topic that I think goes beyond the typical use of
Varnish, so I'm posting it here.

At my company we need a peculiar sort of infrastructural subsystem:

- HTTP requests are made to find out whether something is cached
- RDBMS (MySQL, Oracle) backends
- Other subsystems as backends

Clients use a unified protocol based on (simple) HTTP requests to get
their data. (It's for this reason that we do not use the caches built
into the RDBMSes directly; we also do not want to couple a particular
client tightly to a particular RDBMS or subsystem.)


Either we write the whole thing ourselves or we use something existing,
like Varnish.


I like the thought of using Varnish, although I'm not sure whether this
would be shoehorning it into such a role. However, when it comes to
caching, load balancing, failover and serving cached HTTP results, it's
ideal for the role.

The only problem is the backend. Essentially, what we need, e.g. for a
MySQL backend, is:

On cache miss:

- connect to MySQL
- run the query we received in GET/POST/whatever
- JSONify the result (query results are not big in our application, so
  a limit on the result size is tolerable for us)
- cache the result and return it


On cache hit:

- retrieve the result from the cache and return it


(and so on for other backends)
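
The miss/hit flow listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ReadThroughCache`, `fake_mysql` and the 60-second TTL are hypothetical names and values, the backend is a stand-in function rather than a real MySQL driver, and none of this is Varnish code.

```python
import json
import time
from typing import Callable, Dict, Tuple

# Hypothetical type: a backend is any callable that takes a query string
# and returns a list of rows.
Backend = Callable[[str], list]


class ReadThroughCache:
    """Serve from the cache on a hit; query the backend and store on a miss."""

    def __init__(self, backends: Dict[str, Backend], ttl: float = 60.0):
        self.backends = backends
        self.ttl = ttl
        # (backend name, query) -> (expiry time, JSON body)
        self._store: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def get(self, backend_name: str, query: str) -> Tuple[str, bool]:
        """Return (json_body, was_hit)."""
        key = (backend_name, query)
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1], True                      # cache hit
        rows = self.backends[backend_name](query)      # cache miss: run the query
        body = json.dumps(rows)                        # JSONify the result
        self._store[key] = (now + self.ttl, body)      # cache it
        return body, False


def fake_mysql(query: str) -> list:
    # Stand-in for a real driver call; returns a fixed row for illustration.
    return [{"query": query, "value": 42}]


cache = ReadThroughCache({"mysql": fake_mysql})
first = cache.get("mysql", "SELECT 1")    # miss: goes to the backend
second = cache.get("mysql", "SELECT 1")   # hit: served from the cache
```

Each backend here is just a callable that takes a query and returns rows, so supporting another subsystem would mean registering another function under a new name.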


Is this feasible? Is it even sane? Should I maybe use something else?

Essentially, what we need are pluggable, modular backends. (Obviously
we can handle writing the part that transforms a particular backend's
response into an HTTP response; the snag is how to plug this correctly
into a backend usable by Varnish.)

I was thinking about using VMODs, but none of the available modules
seem to touch the backends themselves.


Thanks!
MK


_______________________________________________
varnish-dev mailing list
varnish-dev [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev


bilbo at hobbiton

Jul 30, 2013, 5:46 AM

Post #2 of 10 (66 views)
Re: rdbms as backend [In reply to]

Hi,

I'm not a Varnish dev, but I've been working with Varnish and DBs for a long
time.

My knee-jerk reaction is that you'll end up with more application logic in
VCL than VCL is suitable for. VCL can't loop or touch response bodies
(without VMODs). It'd be kind of neat to skip middleware and plug Varnish
straight into the DB, but I wouldn't guess it to be worth the effort.

I'd instead write a mini web server using uWSGI or CherryPy to act as
middleware. These tools have great memory footprints and performance.

- Leif
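
A minimal sketch of the middleware idea, assuming a plain WSGI application that a WSGI server such as uWSGI could host. The in-memory SQLite table stands in for MySQL/Oracle so the example is self-contained; the `items` table and the `id` parameter are hypothetical. It uses a parameterized lookup rather than executing SQL taken from the request, which is a deliberate simplification here.

```python
import json
import sqlite3
from urllib.parse import parse_qs


def make_db() -> sqlite3.Connection:
    # Assumption: SQLite stands in for the real RDBMS; table and rows are made up.
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    db.executemany("INSERT INTO items VALUES (?, ?)", [(1, "alpha"), (2, "beta")])
    return db


DB = make_db()


def app(environ, start_response):
    """WSGI app: look up one row by ?id=N and return it as JSON."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        item_id = int(params["id"][0])
    except (KeyError, ValueError):
        start_response("400 Bad Request", [("Content-Type", "application/json")])
        return [b'{"error": "id must be an integer"}']

    rows = DB.execute(
        "SELECT id, name FROM items WHERE id = ?", (item_id,)
    ).fetchall()
    body = json.dumps([{"id": r[0], "name": r[1]} for r in rows]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        # Tells an HTTP cache in front of this app that it may keep the
        # response for 60 seconds (the value is an arbitrary example).
        ("Cache-Control", "max-age=60"),
    ])
    return [body]
```

For a quick local try, the standard library's `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()` would serve it. The `Cache-Control` header is the piece that matters for a cache in front: as far as I know, Varnish by default derives its TTL from headers like this one, so the middleware itself needs no caching logic.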


mrkafk at gmail

Jul 30, 2013, 6:12 AM

Post #3 of 10 (66 views)
Re: rdbms as backend [In reply to]


Hello Leif,

Thanks for the answer!


What you describe is what we tried at first, using Tornado. It
works, except that its performance is simply not good enough: 600
req/sec while we need 30,000 req/sec (yes, 30 thousand). This is for
the entire subsystem, which can consist of more than one machine, but at
Tornado's performance that would be at least 50 cores over many servers,
plus HA redundancy, plus all the associated overhead...

Currently we're into node.js which is able to serve 4000-5000 req/sec,
still not quite good enough.
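
The arithmetic behind those figures, as a rough sketch. It assumes throughput scales linearly with the number of instances (one instance per core) and ignores HA spares; `instances_needed` is a hypothetical helper, and the rates are the ones quoted in this message.

```python
import math

TARGET_RPS = 30_000  # aggregate target quoted above

# Per-instance throughput figures quoted in this message (req/sec).
measured = {
    "Tornado": 600,
    "node.js (low estimate)": 4_000,
    "node.js (high estimate)": 5_000,
}


def instances_needed(target_rps: int, per_instance_rps: int) -> int:
    """Instances required to reach target_rps, assuming linear scaling."""
    return math.ceil(target_rps / per_instance_rps)


for name, rps in measured.items():
    print(f"{name}: {instances_needed(TARGET_RPS, rps)} instances before HA spares")
```

Under that assumption Tornado needs 50 instances, matching the "at least 50 cores" estimate, while node.js would need 6 to 8.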

Varnish performance is so high that it would be very attractive in
this role.

Re application logic in VCL: I don't think we'd suffer much from this; it's
"just" a fast cache. About the only serious problem I can see is
if failover and load balancing in Varnish can't be made to work well
enough (detecting failure quickly and systematically using the
spare/backup instead of timing out on failed backends, causing uneven
load on backends, etc.).
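
To make the failover concern concrete, here is a small Python model of window/threshold health checking, the style of probing that Varnish's backend probes expose through their `.window` and `.threshold` settings. It only models the idea and is not Varnish's implementation; `BackendHealth`, `pick_backend` and the default numbers are hypothetical.

```python
from collections import deque


class BackendHealth:
    """Healthy if at least `threshold` of the last `window` probes succeeded."""

    def __init__(self, window: int = 5, threshold: int = 3):
        self.threshold = threshold
        self.results = deque(maxlen=window)  # keeps only the newest `window` results

    def record(self, ok: bool) -> None:
        self.results.append(ok)

    @property
    def healthy(self) -> bool:
        return sum(self.results) >= self.threshold


def pick_backend(primary: BackendHealth, spare: BackendHealth) -> str:
    """Prefer the primary; fall back to the spare only while the primary is sick."""
    if primary.healthy:
        return "primary"
    if spare.healthy:
        return "spare"
    return "none"


primary, spare = BackendHealth(), BackendHealth()
for _ in range(5):
    primary.record(True)
    spare.record(True)
before = pick_backend(primary, spare)

for _ in range(3):          # three failed probes in a row on the primary
    primary.record(False)
after = pick_backend(primary, spare)
```

With a probe interval of a second or so, a scheme like this marks a backend sick after a few failed probes, so requests stop being sent to it rather than each one waiting for a timeout.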


Thanks!
MK





mrkafk at gmail

Jul 30, 2013, 6:26 AM

Post #4 of 10 (66 views)
Re: rdbms as backend [In reply to]



P.S. What I want is precisely to avoid modifying response
bodies: JSONify/format in the backend handler, and hand over ordinary
HTTP objects to Varnish in the "Fetch from backend" operation:

https://www.varnish-software.com/static/book/_images/vcl.png






varnish at bsdchicks

Jul 30, 2013, 3:09 PM

Post #5 of 10 (66 views)
Re: rdbms as backend [In reply to]

My knee-jerk reaction is somewhat the same. Varnish is HTTP, and should stay
HTTP only.

The best solution would be to put an HTTP layer in front of your RDBMS. Call
it middleware if you will, or maybe an interface.

You don't really need logic for the situation you outlined: if Varnish can
do the caching, all you need to do is set your HTTP-to-RDBMS layer as the
backend. Then it's just the standard hit/miss stuff from Varnish's point of
view.

If you can't write anything yourself that's fast enough to do that (in any
language), it's not going to be any faster as part of Varnish. So, as a
last-ditch effort, write something multithreaded or async in C to handle
it.

If that is fast enough, and there's a really good reason to integrate it
with Varnish we can always revisit putting it into Varnish. :)


On Tue, Jul 30, 2013 at 3:26 PM, Marcin Krol <mrkafk [at] gmail> wrote:

> P.S. What I want to do is precisely avoiding modifying response
> bodies: JSONIFy/format in the backend handler, handing over typical
> HTTP objects to varnish in "Fetch from backend" operation:
>
> https://www.varnish-software.com/static/book/_images/vcl.png


bilbo at hobbiton

Jul 30, 2013, 5:45 PM

Post #6 of 10 (66 views)
Re: rdbms as backend [In reply to]

Interesting. So you're saying that you need 30k rps to the DB, and the
request rate to the front end of Varnish is much higher? If the Varnish
front end is "only" getting 30k rps, then perhaps there's something you can
do to improve the cacheability of objects. Otherwise, that's impressive
even by my standards.
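
One common way to improve cacheability is to make equivalent requests produce identical URLs, since an HTTP cache such as Varnish keys objects on the URL (plus the Host header, by default). A hedged sketch: `canonical_url`, the `/mysql/query` path and the `q`/`db` parameters are hypothetical, and collapsing whitespace is only safe if the queries do not rely on whitespace inside string literals.

```python
import re
from urllib.parse import urlencode


def canonical_url(path: str, params: dict) -> str:
    """Build a URL with whitespace-normalized values and sorted parameters."""
    cleaned = {
        key: re.sub(r"\s+", " ", str(value)).strip()  # collapse runs of whitespace
        for key, value in params.items()
    }
    # Sorting makes the parameter order irrelevant to the cache key.
    return f"{path}?{urlencode(sorted(cleaned.items()))}"


a = canonical_url("/mysql/query",
                  {"q": "SELECT  name FROM items\nWHERE id = 1", "db": "shop"})
b = canonical_url("/mysql/query",
                  {"db": "shop", "q": "SELECT name FROM items WHERE id = 1 "})
```

Both calls yield the same URL, so the second request would be a cache hit instead of a second trip to the database.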

You're sure your DB can keep up in a useful way with 30k rps? That's not
the real bottleneck, is it? I've made that mistake myself, so I feel
compelled to ask. :) If so, also quite impressive that your DB can keep up.

Have you tried implementing this sort of middleware in node.js? Sounds like
you've used node.js a lot, but not for this particular problem if I
understand correctly. Perhaps it would be worth the experiment.

Here's a great read on WSGI servers, with surprising performance
differences. Tornado looks okay, but there are better options.
http://nichol.as/benchmark-of-python-web-servers

It sounds like the middleware is really trivial, and compared to the entry
bar for adapting Varnish, probably worth trying several WSGI servers and/or
node.js if need be.

I am indeed intrigued by connecting directly to the DB, but you can
probably see my skepticism between the lines here in bright orange. :)
Seems like a much more flexible approach to use middleware, worth a bit of
extra hardware...on the other hand, maybe not worth it if the cost
difference really is 10x.

- Leif





--

As implied by email protocols, the information in this message is
not confidential. Any middle-man or recipient may inspect, modify,
copy, forward, reply to, delete, or filter email for any purpose unless
said parties are otherwise obligated. As the sender, I acknowledge that
I have a lower expectation of the control and privacy of this message
than I would a post-card. Further, nothing in this message is
legally binding without cryptographic evidence of its integrity.

http://bilbo.hobbiton.org/wiki/Eat_My_Sig


mrkafk at gmail

Jul 31, 2013, 1:51 AM

Post #7 of 10 (66 views)
Re: rdbms as backend [In reply to]


Hello Leif!

On 7/31/2013 02:45, Leif Pedersen wrote:
> Interesting. So you're saying that you need 30k rps to the DB, and
> the request rate to the front end of Varnish is much higher? If the
> Varnish front end is "only" getting 30k rps, then perhaps there's
> something you can do to improve the cacheability of objects.
> Otherwise, that's impressive even by my standards.

> You're sure your DB can keep up in a useful way with 30k rps?

That's not a single DB, that's *all the subsystems* for which we cache
the results: 30+ MySQL and Oracle instances (all replicated into an
additional set of mirrors for HA, of course, with 3 DBAs babysitting all
this rubbish) and several other specialized subsystems. If you aggregate
the traffic for all of those, it's about 30K rps, and more in peak hours
and under some circumstances.

That those cannot keep up is precisely why we're developing a caching
layer (actually we have one, but it's spaghetti C developed years ago,
basically unmaintainable by now). The load on those machines is high
at all times, as the queries are not very expensive but not
trivially cheap either.


> That's not
> the real bottleneck, is it? I've made that mistake myself, so I
> feel compelled to ask. :) If so, also quite impressive that your DB
> can keep up.
>
> Have you tried implementing this sort of middleware in node.js?
> Sounds like you've used node.js a lot,

We haven't, that's the problem: we're a Python shop (with some C
skills available too). Other divisions have used node.js a lot (hence
it's the next solution under consideration, as we can get some
assistance).

Still not good enough. Let's face it: a caching server is what C / C++ /
compiled, statically typed, high-performance systems programming
languages are built for. In an ideal world I'd build this thing in D
(which is unavailable to us because of the obvious chicken-and-egg
dilemma: a lack of existing solutions and a lack of people with the
skills).


> but not for this particular problem if I
> understand correctly. Perhaps it would be worth the experiment.
>
> Here's a great read of WSGI servers with surprising performance
> differences. Tornado looks okay, but there is better.
> http://nichol.as/benchmark-of-python-web-servers

Erm, I do not want to sound ungrateful but that's exactly one of the
pages I started with...

My colleagues investigated gevent. It fell by the wayside, mostly because
some of Python's C-based extensions leaked memory under so much load.

I have investigated FAPWS3 with surprisingly good results: 3K rps, no
leaking (at least in my application...). It's little known but works
surprisingly well. I'm not sure if the credit goes to libev or to good
FAPWS3 coding, but there it is.

Still, not good enough. And so on: memory leaks, crashes, lots of failed
requests, etc. Heck, I modified the hello world example from Cowboy
(an Erlang-based HTTP framework) to do things like talking over the
memcached protocol (another item on the work pile for this caching server)
and got 5K rps, but it's not like Erlang programmers are available in
numbers, and we're not going to hire one for this project alone. Oh well.

I thought: "what the heck, I'll give Varnish a try" (caching an
HTTP-interfaced subsystem backend; we have some of those apart from
the RDBMSes). Result: 7K rps at 1K concurrent connections.

And it stays at approximately this rate with an increasing number of
concurrent connections, basically up to 5K concurrent connections, with a
grand total of 4 failed requests (0 at lower rates).

Woohoo!

Other solutions were basically crushed like bugs under this number of
concurrent connections. That's why you should not trust those blog
pages with high benchmark results: it's all fine and dandy to get
those numbers serially. But if the solution has to handle a large number
of concurrent connections at any moment, that's where things turn sour.
I had more than one Python solution with high serial performance fail
miserably under such circumstances.
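
The difference between serial and concurrent measurement can be sketched with a tiny harness. This is an illustrative assumption, not the tool used for the numbers above: `run_load` is a hypothetical helper, and `request_fn` would wrap a real HTTP call in practice (here it is a stand-in that just sleeps).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(request_fn, total: int, concurrency: int):
    """Call request_fn `total` times from `concurrency` worker threads.

    Returns (succeeded, failed, requests_per_second).
    """
    def one(_):
        try:
            request_fn()
            return True
        except Exception:
            return False          # count any exception as a failed request

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(one, range(total)))
    elapsed = time.perf_counter() - start

    ok = sum(results)
    rps = total / elapsed if elapsed > 0 else float("inf")
    return ok, total - ok, rps


def fake_request():
    time.sleep(0.001)             # stand-in for network and server time


serial = run_load(fake_request, total=200, concurrency=1)
concurrent_run = run_load(fake_request, total=200, concurrency=20)
```

Running the same request at concurrency 1 and at a high concurrency, and comparing both the rate and the failed count, is what exposes the behavior described above; a serial figure alone says little about it.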

But now I have a problem: how to plug mysql or oracle into varnish?
I've done some C coding but replacing http-oriented backend handling
in varnish is a little ambitious for me.


>
> It sounds like the middleware is really trivial, and compared to
> the entry bar for adapting Varnish, probably worth trying several
> WSGI servers and/or node.js if need be.
>
> I am indeed intrigued by connecting directly to the DB, but you
> can probably see my skepticism between the lines here in bright
> orange. :)

If I understand you correctly, you would not connect to the DB directly
either?

We tried that in the past (on a small scale). It works for the moment,
and that's the problem: what if you need to, say, upgrade MySQL? Your
entire client logic layer changes, in most or all of the infrastructure.
Tight coupling. Sucks.


> Seems like a much more flexible approach to use middleware, worth a
> bit of extra hardware...on the other hand, maybe not worth it if
> the cost difference really is 10x.

That's what we're trying to reduce: infrastructure, power, maintenance
costs.


Regards,
MK


bilbo at hobbiton

Aug 1, 2013, 8:57 AM

Post #8 of 10 (60 views)
Re: rdbms as backend [In reply to]

Hm, let me step back a sec. So as I understand it, you currently get 30k
frontend requests per sec, and with Varnish to cache results, you have
about 7k backend requests per sec. Does this line up now?

Seems to me that if you're provisioning the middleware for 7k rps instead
of 30k rps, the problem is much easier to solve. It may require a few
machines, but it sounds like your DB costs are so high that saving you 76%
on database traffic would be an easy budget to meet. You've done FAPWS3 on
one machine at 3k rps? How about simply running that solution on 3 machines
plus a spare or two, which should provision it for about 9k rps? One nice
thing about this approach is that it's usually far easier to add (and fail
over) HTTP nodes and database clients than to add database servers.
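
As a rough check of the arithmetic above, here is a minimal Python sketch. It uses only the figures quoted in this thread (30k frontend rps, 7k backend rps, 3k rps per FAPWS3 machine); the helper name nodes_needed and the choice of one spare are assumptions made for illustration.

```python
import math

def nodes_needed(backend_rps, per_node_rps, spares=1):
    """Nodes required to carry backend_rps, plus spare nodes for failover."""
    return math.ceil(backend_rps / per_node_rps) + spares

frontend_rps = 30_000  # total request rate quoted in the thread
backend_rps = 7_000    # cache-miss traffic, as understood in this post
per_node_rps = 3_000   # one FAPWS3 machine, as quoted in this post

# Share of requests that never reach the databases.
saved = 1 - backend_rps / frontend_rps
print(f"traffic kept off the databases: {saved:.1%}")
print("nodes including one spare:", nodes_needed(backend_rps, per_node_rps))
```

Three working nodes give 9k rps of capacity against 7k rps of expected misses; the spare only matters when one of them fails.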

I'm pragmatic in this. If the middleware costs a lot more than a custom
vmod to connect to the DBs, then I'd most likely do that. My skepticism is
just that it doesn't seem likely. So I won't answer your question ("you
would not connect to DB directly either?") in the absolute affirmative.
However, with an experienced guess and only a little information about your
problem space, that is my inclination, yes. And I wouldn't build the vmod
for purity or fun -- it sounds to me like a daunting gnarly thing with lots
of maintainability issues. But don't let me tell you not to, if you really
believe in the cause. :)

With deference to the authors, I'd be a bit astonished to see such a vmod
in Varnish's distribution. But if it's worth it in comparison to a
middleware solution (be it Python, node.js, C++, or whatever), the results
would certainly be interesting as a third-party vmod if you don't mind
sharing.

- Leif


--

As implied by email protocols, the information in this message is
not confidential. Any middle-man or recipient may inspect, modify,
copy, forward, reply to, delete, or filter email for any purpose unless
said parties are otherwise obligated. As the sender, I acknowledge that
I have a lower expectation of the control and privacy of this message
than I would a post-card. Further, nothing in this message is
legally binding without cryptographic evidence of its integrity.

http://bilbo.hobbiton.org/wiki/Eat_My_Sig


mrkafk at gmail

Aug 1, 2013, 9:52 AM

Post #9 of 10 (59 views)
Permalink
Re: rdbms as backend [In reply to]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Leif,

Thanks for your interest in this topic! Nobody else picked it up, it seems. :-)


To tell the truth it's somewhat different:

1. 30K rps *total* is what we have to handle during typical hours.


2. 7K rps is what I managed to get ONE INSTANCE of Varnish to achieve
*as cache for one of our HTTP backends*. (our legacy solution caches
Oracle, MySQL *and* HTTP backends, transforming it all into JSON/HTTP
responses, used by lots of different client libraries and subsystems -
this is all internal infrastructure).


3. 20-30 instances of node.js spread across several machines + nginx
at the front would be doable for obvious reasons of load balancing and
failover (that's what tests of a working prototype implemented in
node.js suggest). That's sort of plan A. I'm investigating plan B.

What I don't like about plan A is that it's not only heavy and costly;
what's even worse is that we'd have to build our own HA and load
balancing into it, and like most infrastructural stuff that somebody
develops for themselves, it would be half-baked and coded in haste. Another
piece of old unmaintainable cruft in the making, after telling
ourselves once again "this time it will be different" (no, it won't be,
unless we take a different approach).

Varnish has probes, random and round-robin load balancing, and
failover that's (probably?) battle-tested by many people and
companies, and frankly the bulk of the dev costs is on somebody else's
shoulders.

I do not like building a caching server myself any more than I like
building nginx or apache replacements myself.

All clients talk http anyway. Some backends talk http anyway.

So the only thing I'd have to do to achieve nirvana would be making
databases translate their result sets into JSON and make them
available over http. Which eventually we have to do anyway at some
place (lots of different clients, can't rewrite and upgrade them all
anyway on version change of mysql from X to Y).
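
The "result set to JSON over http" step could be sketched roughly like this in Python. This is only a minimal illustration under assumptions: sqlite3 stands in for mysql/oracle so that it runs anywhere, and the items table, the fixed query and the 60-second max-age are all hypothetical. The Cache-Control header is what a cache such as Varnish would normally use to derive the TTL.

```python
import json
import sqlite3

def rows_to_json(cursor):
    """Serialize every row of an executed DB-API cursor as a JSON array of objects."""
    columns = [d[0] for d in cursor.description]
    return json.dumps([dict(zip(columns, row)) for row in cursor.fetchall()])

def make_app(connect):
    """Build a WSGI app; `connect` is any callable returning a DB-API connection."""
    def app(environ, start_response):
        conn = connect()
        try:
            cur = conn.cursor()
            # Hypothetical fixed query; a real shim would pick it from the request.
            cur.execute("SELECT id, name FROM items ORDER BY id")
            body = rows_to_json(cur).encode("utf-8")
        finally:
            conn.close()
        start_response("200 OK", [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
            ("Cache-Control", "max-age=60"),  # hypothetical TTL for the cache in front
        ])
        return [body]
    return app

def demo_connect():
    """In-memory sqlite3 database standing in for the real RDBMS (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b")])
    return conn

# To try it with the standard library's reference server:
#   from wsgiref.simple_server import make_server
#   make_server("127.0.0.1", 8080, make_app(demo_connect)).serve_forever()
```

Since it is plain WSGI, the same app object could be served by any WSGI-capable server and placed behind Varnish as an ordinary http backend.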

We have already done loose coupling in the databases: to avoid having
to rewrite SQL on every upgrade or a possible switch to another DB, we
implement everything possible in the databases as stored procedures and
use dirt-simple queries calling those stored procedures, so the SQL
"frontend" stays the same while you can tweak the stored procedure behind
it to your liking. There's only a single step from there to query result
uniformization.
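
To illustrate the "dirt simple queries calling stored procedures" idea, here is a small Python sketch of a stable query front. Everything in it is hypothetical: the endpoint names, the procedure names and their argument counts are made up, and the %s placeholders assume a MySQL-style DB-API driver (an Oracle driver would use a different placeholder style).

```python
# Hypothetical endpoint names mapped to hypothetical stored procedures and
# their argument counts; clients never send raw SQL.
PROCEDURES = {
    "user_by_id": ("get_user_by_id", 1),
    "orders_for_user": ("get_orders_for_user", 2),
}

def build_call(endpoint, args):
    """Return (sql, params) for a whitelisted stored-procedure call."""
    try:
        proc, arity = PROCEDURES[endpoint]
    except KeyError:
        raise ValueError(f"unknown endpoint: {endpoint}") from None
    if len(args) != arity:
        raise ValueError(f"{endpoint} expects {arity} argument(s)")
    # Values travel as bound parameters, never interpolated into the SQL text.
    placeholders = ", ".join(["%s"] * arity)
    return f"CALL {proc}({placeholders})", tuple(args)

sql, params = build_call("user_by_id", [42])
print(sql, params)
```

The pair it returns would be handed to cursor.execute(sql, params); the procedure behind each name can then change without any client noticing.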

Admittedly, this sort of thing - plugging db into varnish - looks
weird, even outlandish. But it's so logical and fits so well I have
trouble giving up this thought!


I may give up though and simply add another layer between varnish and
databases.

Regards,
MK











-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.20 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJR+pJSAAoJEFMgHzhQQ7hOEjIH/R3MrpOdXSPeBDZVTgrhg63U
lUjsuobvDJDYXeMSoBNI24aeFUdCwnWrPJjNHN1qG9TAZl7+Rtq0muLkHb/ToAlx
n4L7A18omg1Lqp9SAHboL4+OHgpBSfNOLDtuuXS6L1NoOxkdWJxAyBVCrz1x+QqO
HxvKWPy0pVUYwX6P9tdjTTNIjlzJpNrshV036MCe3cdnFRMvlR1sFGOKc8DGQjEb
6VZ7pxQEzCFj6D9cYfe4a8X3x46nRMshlD+k3su2Zsp7t/450mEKS4LiJf9H6Aht
rqhZl1LHbXm5dAghJxC4FZOhsr2HTgpQQSdHgA79gv+Ej1GhkO9iAHcjI2XPouA=
=92FJ
-----END PGP SIGNATURE-----



slaweuk at gmail

Aug 1, 2013, 10:18 AM

Post #10 of 10 (59 views)
Permalink
Re: rdbms as backend [In reply to]

On 1 Aug 2013, at 17:52, Marcin Krol <mrkafk [at] gmail> wrote:
> So the only thing I'd have to do to achieve nirvana would be making
> databases translate their result sets into JSON and make them
> available over http. Which eventually we have to do anyway at some
> place (lots of different clients, can't rewrite and upgrade them all
> anyway on version change of mysql from X to Y).


Hi Marcin,

There are some projects already covering this subject:

http://code.nytimes.com/projects/dbslayer
http://code.google.com/p/mod-ndb/
http://www.slashdb.com/
http://jersey.java.net/
http://restsql.org/

I would also consider adding HAProxy behind Varnish to support more complex proxy configurations (it also gives better insight into connection metrics).

Best Regards,
Slawek
