Mailing List Archive: Varnish: Misc

Mass redirects/backend selection with Varnish?

 

 



jwellband at gmail

Oct 23, 2011, 10:40 AM

Post #1 of 3 (312 views)
Mass redirects/backend selection with Varnish?

We're looking to move from squid as a reverse proxy to using varnish.
However, I'm not able to come up with a drop-in replacement for
backend selection and 301 redirects.

Currently, we have squid using squirm[1] as a redirector. For every
request coming to squid, a list of squirm patterns (regexes) is
consulted and a rewritten URL is constructed. This URL can be either a
301 redirect (URL prefaced with "301:") or a backend URL. If it's a
301, squid removes the 301: and serves up the redirect. If it's a
backend URL, squid rewrites the URL internally (the new URL is what
gets stored in the cache but the client never sees it) and fetches it
from the backend. We don't configure squid itself to use a single
origin. If the URL isn't matched by a squirm pattern, it's not
successfully served by squid.

Given a single squid instance (we have 7 currently), there can be
anywhere from 50 to 1000 patterns. They are stored in a
space-delimited text file with a regex that is matched on the URL and
a rewritten URL, some of which use backreferences from the matched
regex.
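For illustration, here is a minimal Python sketch of the lookup described
above. It assumes a simplified two-column file (regex, then rewritten URL,
with an optional "301:" prefix), which may not match squirm's real pattern
syntax. The names Rule, parse_rules and resolve are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    pattern: re.Pattern   # compiled regex matched against the full URL
    target: str           # rewritten URL, may contain \1, \2 backreferences
    redirect: bool        # True when the target carried a "301:" prefix


def parse_rules(text: str) -> List[Rule]:
    """Parse an ASSUMED two-column, whitespace-delimited pattern file."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        regex, target = line.split(None, 1)
        redirect = target.startswith("301:")
        if redirect:
            target = target[len("301:"):]
        rules.append(Rule(re.compile(regex), target, redirect))
    return rules


def resolve(rules: List[Rule], url: str) -> Optional[Tuple[str, str]]:
    """Return ("redirect" | "fetch", new_url) for the first matching rule."""
    for rule in rules:
        match = rule.pattern.match(url)
        if match:
            action = "redirect" if rule.redirect else "fetch"
            return action, match.expand(rule.target)
    return None  # no pattern matched, so the request is not served
```

The first matching rule wins, which mirrors the top-to-bottom consultation
of the pattern list described in the post.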

Obviously, we could do this with a giant list of if statements using
req.url and/or req.http.host. This doesn't strike me as ideal. Our thinking
is to abstract the selection of backend URL and/or whether to 301
redirect out of VCL. We've considered writing a custom VMOD to handle
this (either implement the squirm functionality or use squirm in the
same way that squid does), but I wanted to get the community's take
before we reinvent the wheel or do something crazy.

Is this something that is feasible with varnish or should I move this
functionality elsewhere in the stack? Any ideas are welcome.

Thanks much!

[1] http://squirm.foote.com.au/

--
HTH, YMMV, HANW :)

Jason

The path to enlightenment is /usr/bin/enlightenment.

_______________________________________________
varnish-misc mailing list
varnish-misc [at] varnish-cache
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc


varnish at mm

Oct 24, 2011, 7:05 AM

Post #2 of 3 (309 views)
Re: Mass redirects/backend selection with Varnish? [In reply to]

On Sun, Oct 23, 2011 at 01:40:57PM -0400, Jason W. wrote:
> We're looking to move from squid as a reverse proxy to using varnish.
> However, I'm not able to come up with a drop-in replacement for
> backend selection and 301 redirects.
>
> Currently, we have squid using squirm[1] as a redirector. For every
> request coming to squid, a list of squirm patterns (regexes) is
> consulted and a rewritten URL is constructed. This URL can be either a
> 301 redirect (URL prefaced with "301:") or a backend URL. If it's a
> 301, squid removes the 301: and serves up the redirect. If it's a
> backend URL, squid rewrites the URL internally (the new URL is what
> gets stored in the cache but the client never sees it) and fetches it
> from the backend. We don't configure squid itself to use a single
> origin. If the URL isn't matched by a squirm pattern, it's not
> successfully served by squid.
>
> Given a single squid instance (we have 7 currently), there can be
> anywhere from 50 to 1000 patterns. They are stored in a
> space-delimited text file with a regex that is matched on the URL and
> a rewritten URL, some of which use backreferences from the matched
> regex.
>
> Obviously, we could do this with a giant list of if statements using
> req.url and/or req.http.host. This doesn't strike me as ideal.

I had the same thought when we migrated from squid + jesred to Varnish;
we had several thousand patterns across a few sites. I did simplify
things a little by implementing virtual-hosting type behaviour within
Varnish, so it only had to process redirects for the particular site the
request was actually for. If you can make a similar optimisation you
might find the amount of processing per request drops considerably.

While the if/elsif ladder looks a bit ugly and like a lot of work, it's
actually pretty much exactly what squirm is already doing. So I think
you'll find the performance to be about the same; possibly a bit faster
since if you implement it within VCL you won't have the overhead of
communicating over a pipe.

The only issue you'd have with doing it in Varnish is if your backend
hosts are pretty much arbitrary; Varnish needs each origin to be
explicitly defined. This requires a slight change to the logic, in that
you need to set req.backend appropriately, in addition to req.http.host
and/or req.url. But, it's not really complex.

> Our thinking is to abstract the selection of backend URL and/or
> whether to 301 redirect out of VCL. We've considered writing a custom
> VMOD to handle this (either implement the squirm functionality or use
> squirm in the same way that squid does), but I wanted to get the
> community's take before we reinvent the wheel or do something crazy.
>
> Is this something that is feasible with varnish or should I move this
> functionality elsewhere in the stack? Any ideas are welcome.

I do think that abstracting it out of the VCL so you don't have to
actually manage the if/elsif ladder directly is probably a good idea.
It'd certainly be workable but you've probably got better things to do
with your time. I guess it depends how frequently you make changes or
additions to your redirections.

When I moved to Varnish, I took the opportunity to place all the
redirects and rewrites into our "DNS management system", which is just
an in-house hodge-podge of Python and perl. The redirects are specified
in a similar format to squirm/jesred, and I have a script that parses
them and spits out appropriate VCL. That file is then rsynced to each of
the proxies, and included from the appropriate site's configuration.
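A generator along these lines could look roughly like the Python sketch
below. It is not the script mentioned above. The rule tuple layout, the
X-Redirect-To header, and the 750 status convention are all assumptions,
and the emitted VCL follows Varnish 3-era syntax (error, req.backend) and
has not been compiled against any Varnish release. Newer releases use
return (synth(...)) and req.backend_hint instead.

```python
def generate_vcl(rules):
    """Emit an if/elsif ladder for vcl_recv.

    rules: sequence of (host, url_regex, replacement, backend) tuples.
    backend=None means "issue a 301 to the replacement"; otherwise the
    URL is rewritten internally and the named backend is selected.
    ASSUMPTION: patterns contain no double quotes, since they are pasted
    into VCL string literals unescaped.
    """
    lines = ["sub vcl_recv {"]
    for i, (host, regex, repl, backend) in enumerate(rules):
        keyword = "if" if i == 0 else "} elsif"
        lines.append(
            f'    {keyword} (req.http.host == "{host}" && req.url ~ "{regex}") {{'
        )
        if backend is None:
            # Hypothetical convention: vcl_error would turn status 750 into
            # a 301 whose Location is taken from X-Redirect-To.
            lines.append(
                f'        set req.http.X-Redirect-To = regsub(req.url, "{regex}", "{repl}");'
            )
            lines.append('        error 750 "Moved Permanently";')
        else:
            lines.append(
                f'        set req.url = regsub(req.url, "{regex}", "{repl}");'
            )
            lines.append(f"        set req.backend = {backend};")
    if len(lines) > 1:
        lines.append("    }")  # close the last branch of the ladder
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Writing the result to a file and pulling it in with an include statement
keeps the hand-maintained VCL small, as described above.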

If you are happy editing it directly, then there's no real issue. One
nice thing about Varnish is you can tell it to load a new config while
it's running, and if it can't compile it, it'll just tell you to rack
off and keep running the existing one. So even if you break the config
the site keeps running without a hiccup. jesred liked to just stop doing
any redirects if I broke its pattern file, and the comment about "Dodo
mode" makes me think squirm may well do the same thing.

So in summary: try not to fret about the ugliness of a massive if/elsif
ladder. That's what squirm is doing, anyway. It might be a good time to
decide if directly editing the pattern file is how you want to be
managing all those redirects, and if that's what's really bugging you,
implement a better solution for that. The VCL itself isn't really a
problem; Varnish seems quite happy to load massive configurations.

I don't think I directly answered your question, so in case you didn't
infer an answer: I personally don't think you'd benefit from doing your
own custom squirm-like (or other) handler if performance is your concern.
I think you'd be better off doing a quick hackish mass-conversion of as
many patterns as you can and seeing how Varnish performs. My hunch is
that'll alleviate any concerns you've got.



jwellband at gmail

Oct 24, 2011, 5:20 PM

Post #3 of 3 (308 views)
Re: Mass redirects/backend selection with Varnish? [In reply to]

Michael (and list),

First, thanks much for the response. It's much appreciated.

> I had the same thought when we migrated from squid + jesred to Varnish;
> we had several thousand patterns across a few sites. I did simplify
> things a little by implementing virtual-hosting type behaviour within
> Varnish, so it only had to process redirects for the particular site the
> request was actually for. If you can make a similar optimisation you
> might find the amount of processing per request drops considerably.

Not sure I follow here - sounds like you mean putting different sites
on different IPs and/or different varnishes? We do that with squid to
match each app cluster but there is no squid-centric reason for this.
We've 13 major sites that we'd front with varnish initially, and if
that works well, then we'd use it for more, so possibly 100 disparate
sites - each with more than one backend.

> While the if/elsif ladder looks a bit ugly and like a lot of work, it's
> actually pretty much exactly what squirm is already doing. So I think
> you'll find the performance to be about the same; possibly a bit faster
> since if you implement it within VCL you won't have the overhead of
> communicating over a pipe.

Guess I wasn't clear :) It's more of a configuration query since
we're used to defining behavior in one place (the patterns) versus two
(patterns and a preconfigured backend).

> The only issue you'd have with doing it in Varnish is if your backend
> hosts are pretty much arbitrary; Varnish needs each origin to be
> explicitly defined. This requires a slight change to the logic, in that
> you need to set req.backend appropriately, in addition to req.http.host
> and/or req.url. But, it's not really complex.

That was my initial concern. After reading your email, I realized I
could define all the backends we'd need with deterministic names and
possibly write some VCL to select the backend from the "rewritten"
hostname. I will have to play with this and see if it works as I think
it does.
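As a rough sketch of what "deterministic names" might mean, the Python
below derives a backend identifier from a hostname and emits a matching
backend definition. The be_ prefix and both function names are
hypothetical, and the naming scheme is only one possible choice.

```python
import re


def backend_name(hostname: str) -> str:
    """Map a hostname to an identifier made of letters, digits and '_'.

    Note: distinct hostnames such as "a-b.com" and "a.b.com" collapse to
    the same name, so a real scheme would need a collision check.
    """
    cleaned = re.sub(r"[^a-z0-9]+", "_", hostname.lower()).strip("_")
    return "be_" + cleaned


def backend_stanza(hostname: str, port: int = 80) -> str:
    """Emit a VCL backend definition named after the hostname."""
    return (
        "backend %s {\n"
        '    .host = "%s";\n'
        '    .port = "%d";\n'
        "}\n"
    ) % (backend_name(hostname), hostname, port)
```

Because the name is a pure function of the hostname, the same helper can
be used when generating both the backend list and the selection logic.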

Luckily, we don't rewrite the Host header in squid, so that's one less
thing varnish has to do :)

Some of our patterns don't specify a hostname component; e.g.
http://([^/]+)/images(/.*) ==> http://static.\1/images\2. I know about
Varnish's regex substitution, so I'm hopeful that I can do the
rewrites in VCL.
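To make the example pattern concrete, here is a small Python sketch that
applies it to a host and path and splits the result back into an origin
host and a request path, since Varnish sees those as separate values. The
function name rewrite is hypothetical, and anchoring the pattern at the
start of the URL is an assumption about how the matching is done.

```python
import re
from urllib.parse import urlsplit

# The pattern and rewrite target quoted in the message above.
PATTERN = re.compile(r"http://([^/]+)/images(/.*)")
TEMPLATE = r"http://static.\1/images\2"


def rewrite(host: str, path: str):
    """Return (origin_host, path) after applying the pattern, if it matches."""
    match = PATTERN.match("http://" + host + path)
    if match is None:
        return host, path  # leave non-matching requests untouched
    parts = urlsplit(match.expand(TEMPLATE))
    new_path = parts.path + ("?" + parts.query if parts.query else "")
    return parts.netloc, new_path
```

For a request to www.example.com for /images/logo.png, this yields the
origin host static.www.example.com with the path unchanged.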

> I do think that abstracting it out of the VCL so you don't have to
> actually manage the if/elsif ladder directly is probably a good idea.
> It'd certainly be workable but you've probably got better things to do
> with your time. I guess it depends how frequently you make changes or
> additions to your redirections.

Changes are made by the dev team a few times a week and rolled out
weekly via cfengine grabbing the latest from their VCS, pushing the
patterns out to the caching boxes and poking squid/squirm.

> When I moved to Varnish, I took the opportunity to place all the
> redirects and rewrites into our "DNS management system", which is just
> an in-house hodge-podge of Python and perl. The redirects are specified
> in a similar format to squirm/jesred, and I have a script that parses
> them and spits out appropriate VCL. That file is then rsynced to each of
> the proxies, and included from the appropriate site's configuration.

Heh - I've used this idea elsewhere but never thought of writing
something to generate VCL. Thanks for the reminder ;)

> If you are happy editing it directly, then there's no real issue. One
> nice thing about Varnish is you can tell it to load a new config while
> it's running, and if it can't compile it, it'll just tell you to rack
> off and keep running the existing one. So even if you break the config
> the site keeps running without a hiccup. jesred liked to just stop doing
> any redirects if I broke its pattern file, and the comment about "Dodo
> mode" makes me think squirm may well do the same thing.

Heh - this happened once or twice, then the dev team wrote a test
harness that calls squirm with a bunch of URLs and ensures that the
expected rewrites are output. We may have to substitute with curls
against a non-prod varnish (assuming the config compiles).
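A harness of that kind could be sketched in Python roughly as follows,
using http.client in place of curl. The case tuple layout, the function
names, and the 127.0.0.1:6081 address in the usage note are assumptions
for illustration only.

```python
import http.client


def fetch(varnish_addr, host, path):
    """Send one GET to the test Varnish; return (status, Location header)."""
    conn = http.client.HTTPConnection(varnish_addr[0], varnish_addr[1], timeout=5)
    try:
        conn.request("GET", path, headers={"Host": host})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def check(cases, fetcher):
    """Run every case and collect the ones whose result differs.

    cases: iterable of (host, path, expected_status, expected_location).
    fetcher: callable taking (host, path), returning (status, location).
    """
    failures = []
    for host, path, want_status, want_location in cases:
        got = fetcher(host, path)
        if got != (want_status, want_location):
            failures.append((host, path, got))
    return failures
```

Against a non-prod instance this might be called as
check(cases, lambda h, p: fetch(("127.0.0.1", 6081), h, p)). An empty
result means every expected redirect was observed. http.client does not
follow redirects, so the 301 itself is what gets checked.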

> I don't think I directly answered your question, so in case you didn't
> infer an answer: I personally don't think you'd benefit from doing your
> own custom squirm-like (or other) handler if performance is your concern.
> I think you'd be better off doing a quick hackish mass-conversion of as
> many patterns as you can and seeing how Varnish performs. My hunch is
> that'll alleviate any concerns you've got.

Thanks much for the ideas and for telling me that someone else was
(ab)using squid redirectors ;)

--
HTH, YMMV, HANW :)

Jason

The path to enlightenment is /usr/bin/enlightenment.

