
Mailing List Archive: ModPerl: ModPerl

Best filesystem type for mod_cache in reverse proxy?

 

 



perrin at elem

Nov 25, 2008, 10:55 AM

Post #26 of 29 (447 views)
Re: Best filesystem type for mod_cache in reverse proxy? [In reply to]

On Tue, Nov 25, 2008 at 1:30 PM, Neil Gunton <neil [at] nilspace> wrote:
> The only downside is that people on extremely slow dialup connections might
> notice longer download times for page text... but I have to wonder if that's
> really an issue today. Back in 1998 perhaps you might care about something
> being 20KB rather than 80KB, but surely not today. In any case, don't dialup
> ISPs often implement their own compression now?

Compressing is pretty important:
http://developer.yahoo.net/blog/archives/2007/07/high_performanc_3.html

I wonder if there's a way to make the mod_deflate Vary header a bit
saner, so it just reflects compressed or not, rather than every
possible User-Agent.
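One way to picture the "compressed or not" idea is to collapse the client's Accept-Encoding header into two variants before it is used as part of a cache key. The following is only a sketch in Python, not anything mod_deflate or mod_cache provides; the function names encoding_variant and cache_key are made up for illustration.

```python
def encoding_variant(accept_encoding: str) -> str:
    """Collapse an Accept-Encoding header value to 'gzip' or 'identity'."""
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        if coding not in ("gzip", "x-gzip"):
            continue
        # A q-value of 0 means the client explicitly refuses this coding.
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            return "gzip"
    return "identity"


def cache_key(url: str, accept_encoding: str) -> str:
    """Hypothetical cache key: at most two cached variants per URL."""
    return f"{url}|{encoding_variant(accept_encoding)}"
```

With a key like this, each URL has at most two cached copies, no matter how many distinct request header strings the clients send.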

There are also alternative ways to cache pages, like pre-publishing
them as static files or doing page caching with mod_perl handlers that
intercept the request before the response phase and serve a cached
copy. It's very convenient to use mod_cache though.
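The handler-based approach amounts to checking a cache before the expensive page generation runs. Here is a minimal sketch of that control flow in Python rather than mod_perl; the PageCache class, its ttl_seconds parameter, and the generate callback are all assumptions made for the example, not part of any Apache or mod_perl API.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """In-memory page cache consulted before the page is generated."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, bytes]] = {}

    def fetch(self, url: str, generate: Callable[[str], bytes]) -> bytes:
        now = self.clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self.ttl:
            # Fresh cached copy: skip page generation entirely.
            return hit[1]
        # Miss or stale entry: build the page and remember it.
        body = generate(url)
        self._store[url] = (now, body)
        return body
```

A real setup would also need invalidation when content changes and a bound on memory or disk use, which this sketch leaves out.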

- Perrin


rwan at kuicr

Nov 25, 2008, 11:46 PM

Post #27 of 29 (439 views)
Re: Best filesystem type for mod_cache in reverse proxy? [In reply to]

Hi

Neil Gunton wrote:
> Well, that seemed to do the trick! So the caveat seems to be: Be
> careful using both mod_deflate and mod_cache (mod_disk_cache
> specifically) together if you have a large dynamic website that can
> generate a large number of distinct pages. Mod_deflate produces a


This is probably a digression from your discussion, but I'm not sure if
any of you have used gzip + md5sum together before. I have, and it can
be annoying especially if you are playing with large data files like I
do. This is because gzip seems to (not 100% sure) store some time
information in the archive. So, if you create two archives of the same
files, they aren't identical...their md5sums do not match.

As deflate is essentially the same algorithm as gzip, it is somewhat the
same annoyance...
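For what it's worth, the gzip file format does carry a modification-time field in its header (and the gzip tool can store the original file name as well), which is enough to change a checksum. A small Python sketch of the effect follows; the payload and the gz_md5 helper are invented for the example.

```python
import gzip
import hashlib

data = b"the same payload\n" * 1000


def gz_md5(payload: bytes, mtime: int) -> str:
    """MD5 of the gzip stream produced with a given header timestamp."""
    return hashlib.md5(gzip.compress(payload, mtime=mtime)).hexdigest()


# Same input, different header timestamps: the archives differ.
differs = gz_md5(data, 1) != gz_md5(data, 2)

# Pinning the timestamp makes the output reproducible.
matches = gz_md5(data, 0) == gz_md5(data, 0)

# The decompressed content is identical either way.
same_content = (gzip.decompress(gzip.compress(data, mtime=1))
                == gzip.decompress(gzip.compress(data, mtime=2)))

print(differs, matches, same_content)
```

On the command line, gzip -n leaves the name and timestamp out of the header, which avoids the mismatch; checksumming the uncompressed data is another option.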


> Web pages seem to render a little faster in the browser too. That may
> be my imagination and/or placebo effect, but it might make sense if
> there isn't that additional compression/decompression going on both ends.
>
> The only downside is that people on extremely slow dialup connections
> might notice longer download times for page text... but I have to
> wonder if that's really an issue today. Back in 1998 perhaps you might
> care about something being 20KB rather than 80KB, but surely not
> today. In any case, don't dialup ISPs often implement their own
> compression now?


I had looked at the effect compression has on web pages a while ago.
Though not relevant to modperl, there is obviously a cost to compression
and since most HTML pages are small, sometimes it is hard to justify.
If users are downloading XML files of data, though, then that is of
course worth it...but one could argue that if you are making XML files
available for download, wouldn't it be better to compress them
yourself rather than asking Apache to compress on the fly?

As for dialup, if I remember from those dark modem days :-), even many
of them had compression built in. In fact, I think they had some form
of the deflate/gzip/sliding window algorithm. And for those of us who
have tried gzipping an already-gzipped file, adding compression to
something that is already compressed is generally counter-productive...

Anyway, I don't think it is much of an issue...might be more helpful to
educate web page creators to not put MBs of images on a single page. :-)

Ray




>
> Anyway, hope that's helpful to anybody running large dynamic websites
> behind a reverse proxy. Keep mod_cache, maybe think about ditching
> mod_deflate. The combination does technically work, but for large
> numbers of pages, it can make your cache size (and your iowait) explode.


mpeters at plusthree

Nov 26, 2008, 6:27 AM

Post #28 of 29 (438 views)
Re: Best filesystem type for mod_cache in reverse proxy? [In reply to]

Raymond Wan wrote:

> I had looked at the effect compression has on web pages a while ago.
> Though not relevant to modperl, there is obviously a cost to compression
> and since most HTML pages are small, sometimes it is hard to justify.

Not to discredit the work you did researching this, but a lot of people are studying the same thing
and coming to different conclusions:

http://developer.yahoo.com/performance/rules.html

Yes, backend performance matters, but more and more we realize that the front-end tweaks we can make
give better performance for users.

Take Google as an example. The overhead of compressing their content and decompressing it in the
browser takes less time than sending the same content uncompressed over the network. I'd say the
same is true for most other applications too.

> As for dialup, if I remember from those dark modem days :-)

Even non-dialup customers can benefit. Many "broadband" connections aren't very fast, especially in
rural places (I'm thinking large portions of the US).

But all this talk is really useless in the abstract. Take a tool like YSlow for a spin and see how
your sites perform with and without compression, looking especially at the waterfall display.

--
Michael Peters
Plus Three, LP


rwan at kuicr

Nov 26, 2008, 8:14 AM

Post #29 of 29 (439 views)
Re: Best filesystem type for mod_cache in reverse proxy? [In reply to]

Hi Michael,


Michael Peters wrote:
> Raymond Wan wrote:
>> I had looked at the effect compression has on web pages a while ago.
>> Though not relevant to modperl, there is obviously a cost to
>> compression and since most HTML pages are small, sometimes it is hard
>> to justify.
>
> Not to discredit the work you did researching this, but a lot of
> people are studying the same thing and coming to different conclusions:
>
> http://developer.yahoo.com/performance/rules.html
>
> Yes, backend performance matters, but more and more we realize that
> the front-end tweaks we can make give better performance for users.
>
> Take Google as an example. The overhead of compressing their content
> and decompressing it in the browser takes less time than sending the
> same content uncompressed over the network. I'd say the same is true
> for most other applications too.


It's ok; I don't consider another opinion as discrediting my work. :-)
Actually, it was a while ago and it was only one aspect of my work and
in a smaller test bed. My fault for handwaving in my reply, though.

The point is actually the "sometimes"... My research was more on
general compression, and web compression was only one aspect. My point
is that if you take a one-byte file and run gzip -9 on it (again, the
same algorithm as deflate), you get a 24-byte file. As you increase that
file size, you will reach a point where it becomes more beneficial to
compress. Though my example is both silly and pathological, it just
shows that there are cases when compression may not be beneficial. And
one can imagine the average file size of a web site to be some kind of
knob, and as it turns (average file size increases as you go from site
to site), the benefits become more and more evident.
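The break-even point can be looked for empirically. The sketch below uses Python's gzip module on an arbitrary, made-up sample; gzip_size and break_even are illustrative helper names. The exact sizes depend on header fields such as a stored file name, so the one-byte result need not match the figure from the command-line tool.

```python
import gzip
from typing import Optional


def gzip_size(payload: bytes) -> int:
    """Size in bytes of the gzip stream for payload (timestamp pinned)."""
    return len(gzip.compress(payload, compresslevel=9, mtime=0))


def break_even(sample: bytes) -> Optional[int]:
    """Shortest prefix of sample that gzip makes smaller, if any."""
    for n in range(1, len(sample) + 1):
        if gzip_size(sample[:n]) < n:
            return n
    return None


# Arbitrary repetitive sample; real pages will behave differently.
sample = b"Best filesystem type for mod_cache in reverse proxy? " * 20

print("1 byte ->", gzip_size(b"x"), "bytes")
print("break-even prefix length:", break_even(sample))
```

Running this against representative pages from a real site would show where that site sits on the "knob".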

For example, compressing an already compressed file is generally
pointless (if it was done right the first time). MP3, JPEG, GIF, etc.
are all file formats that have or may have compression incorporated.
PDFs can be compressed too if someone selected that option when creating
it. English text compresses well (25%, in general?) but two-byte
encodings such as Chinese and Japanese (I think) get around 40-50%
[handwaving again :-) there are more updated numbers out there]. Also,
compression works best on a uniform file; if a web page has a mix of
text, images, etc., then each one has to be compressed individually.
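The point about already-compressed data is easy to check. In this Python sketch the word list and seed are arbitrary stand-ins for "some text", and the output of the first gzip pass stands in for an already-compressed format such as a JPEG or MP3.

```python
import gzip
import random

# Deterministic pseudo-text built from a small, made-up vocabulary.
rng = random.Random(0)
vocabulary = ["apache", "modperl", "cache", "proxy", "deflate", "gzip",
              "header", "request", "response", "server", "page", "disk"]
text = " ".join(rng.choice(vocabulary) for _ in range(20000)).encode()

# First pass removes most of the redundancy in the text.
once = gzip.compress(text, compresslevel=9, mtime=0)

# Second pass has almost nothing left to remove and adds its own framing.
twice = gzip.compress(once, compresslevel=9, mtime=0)

print(len(text), len(once), len(twice))
```

The first pass shrinks the text substantially, while the second pass gains essentially nothing. This is the usual argument for restricting on-the-fly compression to text-like content types.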

As for Google, you are right -- I can imagine why it would work well for
Google. However, I can also hypothesize that it might be a special
case. I presume you mean the results of a query. What we get back is
a list of results that are all related to each other; i.e., if you
searched for "apache2 modperl", we can expect those two words to be in
every result and the type of words to be similar from result to result
[they would all be computer-oriented]. As compression aims to reduce
redundancy, their results are perfect for it. Especially if

Anyway, what I wanted to say is that there ought to be instances when
compression is beneficial and when it isn't. I think it is fine to do
what the Yahoo site says and have it "on" by default; but if someone
examines the traffic and data and realizes it should be "off", that
isn't beyond reason.


>> As for dialup, if I remember from those dark modem days :-)
>
> Even non-dialup customers can benefit. Many "broadband" connections
> aren't very fast, especially in rural places (I'm thinking large
> portions of the US).
>
> But all this talk is really useless in the abstract. Take a tool like
> YSlow for a spin and see how your sites perform with and without
> compression, looking especially at the waterfall display.
>

Well, one good thing about deflate is that it is *fast*. Very fast.
So, while my silly one-byte file example shows there are exceptions, the
break-even point might be closer to one byte. :-)

One cost saving might be to pre-compress files, since it is more
time-consuming to compress than to decompress using deflate; i.e., have
them reside on the server in compressed form. Of course, that poses
many problems and is one reason why things like Stacker didn't really
catch on (much)...
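Pre-compressing could look roughly like the Python sketch below: walk a document root and write a .gz copy next to each text-like file. The precompress function, the suffix list, and the ".gz sidecar" naming are assumptions for illustration; how the web server is then told to serve those copies is a separate matter not covered here.

```python
import gzip
import shutil
from pathlib import Path
from typing import List, Sequence


def precompress(root: Path,
                suffixes: Sequence[str] = (".html", ".css", ".js", ".xml"),
                ) -> List[Path]:
    """Write name.gz beside each matching file; return the files written."""
    written = []
    # sorted() materializes the listing before any .gz files are created.
    for src in sorted(root.rglob("*")):
        if not src.is_file() or src.suffix not in suffixes:
            continue
        dst = src.with_name(src.name + ".gz")
        # Skip files whose compressed copy is already up to date.
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue
        with src.open("rb") as fin, gzip.open(dst, "wb", compresslevel=9) as fout:
            shutil.copyfileobj(fin, fout)
        written.append(dst)
    return written
```

Since the compression cost is paid once per file rather than once per request, the highest compression level is affordable here. One of the "many problems" is visible even in this sketch: the two copies have to be kept in sync.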

Ray
