Mailing List Archive: Wikipedia: Wikitech

Apache mod_php in wikipedia


howachen at gmail

Aug 26, 2008, 7:44 AM

Post #1 of 13 (1679 views)
Apache mod_php in wikipedia

Hello,

I have heard that Wikipedia runs on Apache/mod_php. Is there any
reason not to use a FastCGI approach for performance (e.g. lighty /
nginx + FastCGI)?


Thanks.

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Simetrical+wikilist at gmail

Aug 26, 2008, 8:00 AM

Post #2 of 13 (1646 views)
Re: Apache mod_php in wikipedia [In reply to]

On Tue, Aug 26, 2008 at 10:44 AM, howard chen <howachen [at] gmail> wrote:
> I have heard that Wikipedia runs on Apache/mod_php. Is there any
> reason not to use a FastCGI approach for performance (e.g. lighty /
> nginx + FastCGI)?

mod_php is slow because every Apache process carries an instance of
PHP, and much if not most of the time that process is just waiting
to send results to slow clients. This means that you have
many more instances of PHP than you actually need, which means much
more memory. In Wikipedia's case, however, Apache is serving data
exclusively to Squid, which is reading the data at least as fast as
Apache can write it. It wouldn't save anything to have FastCGI serve
data to lighttpd or nginx instead; it would be no faster than mod_php
serving it to Squid. So little to nothing would be saved by using FastCGI.
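
To put rough numbers on that argument, here is a back-of-the-envelope sketch in Python. It is not taken from the thread: the request rate, timings and per-process memory are invented for illustration, and `workers_needed` and `memory_mb` are hypothetical helpers. The first one applies Little's law (busy workers = arrival rate x time each worker is held).

```python
def workers_needed(requests_per_sec, hold_time_ms):
    """Average number of busy workers (Little's law): rate x hold time."""
    return requests_per_sec * hold_time_ms / 1000


def memory_mb(workers, mb_per_worker):
    """Total memory if every busy worker is a full Apache+PHP process."""
    return workers * mb_per_worker


# Every number here is an assumption chosen for illustration only.
REQUESTS_PER_SEC = 100
PHP_TIME_MS = 100          # time PHP spends generating a page
SLOW_CLIENT_TIME_MS = 900  # extra time a worker is held while a slow client reads
MB_PER_WORKER = 50         # memory of one Apache process with PHP loaded

# Worker held for generation plus the slow transfer to the client.
direct = workers_needed(REQUESTS_PER_SEC, PHP_TIME_MS + SLOW_CLIENT_TIME_MS)
# Worker held for generation only, because the proxy drains the response at once.
behind_proxy = workers_needed(REQUESTS_PER_SEC, PHP_TIME_MS)

print(f"serving slow clients directly: {direct:.0f} workers, "
      f"{memory_mb(direct, MB_PER_WORKER):.0f} MB")
print(f"serving a fast proxy:          {behind_proxy:.0f} workers, "
      f"{memory_mb(behind_proxy, MB_PER_WORKER):.0f} MB")
```

With these made-up inputs the proxied setup needs a tenth of the workers. That is the saving FastCGI would otherwise offer, and with Squid already in front there is little left to gain.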



howachen at gmail

Aug 26, 2008, 8:28 AM

Post #3 of 13 (1645 views)
Re: Apache mod_php in wikipedia [In reply to]

Hi,

On 8/26/08, Aryeh Gregor <Simetrical+wikilist [at] gmail> wrote:
> mod_php is slow because every Apache process carries an instance of
> PHP, and much if not most of the time that process is just waiting
> to send results to slow clients. This means that you have
> many more instances of PHP than you actually need, which means much
> more memory. In Wikipedia's case, however, Apache is serving data
> exclusively to Squid, which is reading the data at least as fast as
> Apache can write it. It wouldn't save anything to have FastCGI serve
> data to lighttpd or nginx instead; it would be no faster than mod_php
> serving it to Squid. So little to nothing would be saved by using FastCGI.
>

But aren't there still a number of pages that cannot be cached by Squid?

Thanks.



Simetrical+wikilist at gmail

Aug 26, 2008, 8:37 AM

Post #4 of 13 (1642 views)
Re: Apache mod_php in wikipedia [In reply to]

On Tue, Aug 26, 2008 at 11:28 AM, howard chen <howachen [at] gmail> wrote:
> But aren't there still a number of pages that cannot be cached by Squid?

Even pages that aren't cached by Squid are still proxied through it.
It proxies all requests, and caches those that are cacheable. No one
outside Wikimedia's internal network is ever communicating directly
with an Apache or lighttpd server when using the Wikimedia projects,
to my knowledge. (lighttpd is in fact used by the image servers,
incidentally.)



dan_the_man at telus

Aug 26, 2008, 9:44 AM

Post #5 of 13 (1649 views)
Re: Apache mod_php in wikipedia [In reply to]

No PHP benefit either way.
There's no reason to go to nginx for performance, but there's also no
reason not to use it.
Personally, I've got a preference for nginx, so I'm using it myself.

And for smaller sites not using a front end cache or multiple domains,
it is nice when you are serving out the /images folder with the same
webserver.

So, no nginx performance benefit for MediaWiki. But there's no real
benefit of Apache over other webservers either.

~Daniel Friesen(Dantman, Nadir-Seen-Fire) of:
-The Nadir-Point Group (http://nadir-point.com)
--Its Wiki-Tools subgroup (http://wiki-tools.com)
--The ElectronicMe project (http://electronic-me.org)
--Games-G.P.S. (http://ggps.org)
-And Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
--Animepedia (http://anime.wikia.com)
--Narutopedia (http://naruto.wikia.com)



Simetrical+wikilist at gmail

Aug 26, 2008, 11:01 AM

Post #6 of 13 (1644 views)
Re: Apache mod_php in wikipedia [In reply to]

On Tue, Aug 26, 2008 at 12:44 PM, Daniel Friesen <dan_the_man [at] telus> wrote:
> No PHP benefit either way.
> There's no reason to go to nginx for performance, but there's also no
> reason not to use it.
> Personally, I've got a preference for nginx, so I'm using it myself.

The only reason Apache is still being used, AFAIK, is because it's
what Wikimedia has always used. With no gain in switching, you may as
well stick with whatever you have to avoid transition costs. lighttpd
didn't even exist in 2001, let alone nginx.

Actually, is there any reason lighttpd is used for image serving?
Just because it was trendy when the image servers got set up, or
because it actually has concrete advantages of some kind?



dan_the_man at telus

Aug 26, 2008, 11:18 AM

Post #7 of 13 (1637 views)
Re: Apache mod_php in wikipedia [In reply to]

As for lighttpd vs. Apache: lighttpd is a lightweight webserver, and it does
have a real performance benefit over serving out images with bloated Apache.

As for lighttpd vs. nginx... I believe the author of lighttpd actually
had some contact with Wikimedia; perhaps a few changes were made to
lighttpd or something for them... Dunno... So for anyone else choosing
between lighttpd and nginx, there's no relevant benefit of one over the other.




brion at wikimedia

Aug 26, 2008, 11:24 AM

Post #8 of 13 (1642 views)
Re: Apache mod_php in wikipedia [In reply to]

Daniel Friesen wrote:
> As for lighttpd vs. Apache: lighttpd is a lightweight webserver, and it does
> have a real performance benefit over serving out images with bloated Apache.

We started using lighty to serve images in 2005:
http://meta.wikimedia.org/wiki/November_2005_image_server

Domas did a fair amount of benchmarking and testing on this over the
months prior to the switch, including working with the author on some
fixes, and it was a pretty clear winner at the time.

> As for lighttpd vs. nginx... I believe the author of lighttpd actually
> had some contact with Wikimedia; perhaps a few changes were made to
> lighttpd or something for them... Dunno... So for anyone else choosing
> between lighttpd and nginx, there's no relevant benefit of one over the other.

Well, to be honest I never heard of nginx before today. :)

-- brion



Simetrical+wikilist at gmail

Aug 26, 2008, 11:37 AM

Post #9 of 13 (1645 views)
Re: Apache mod_php in wikipedia [In reply to]

On Tue, Aug 26, 2008 at 2:18 PM, Daniel Friesen <dan_the_man [at] telus> wrote:
> As for lighttpd vs. Apache: lighttpd is a lightweight webserver, and it does
> have a real performance benefit over serving out images with bloated Apache.

By itself, certainly, but I wasn't sure if it had an advantage serving
to Squid as well. If Domas said so, though, I'm happy to believe him.
:)



dan_the_man at telus

Aug 26, 2008, 12:01 PM

Post #10 of 13 (1642 views)
Re: Apache mod_php in wikipedia [In reply to]

Brion Vibber wrote:
> Well, to be honest I never heard of nginx before today. :)

Heh, that's amusing... Jae poked me about it all the time back on the
old Wiki-Tools project.
http://nginx.net/

http://survey.netcraft.com/Reports/200806/
http://hostingfu.com/article/nginx-vs-lighttpd-for-a-small-vps
http://www.joeandmotorboat.com/2008/02/28/apache-vs-nginx-web-server-performance-deathmatch/

I love how I have it set up for my own MediaWiki sites.
I passed on nginx for the old wiki-tools project, as we couldn't find a
single way to get nginx to work with short URLs.
When the second wiki-tools project came around I made yet another
attempt at it... I even attempted using really complex rewrites. I managed
to get it to sorta work... well, short URLs worked, and ampersands
didn't have an issue... well, up to a limit... heh, the regex I used
worked, but not all too well... ^_^ after a certain number of ampersands
in the title, everything just sorta... went white... heh
Some time after that I suddenly thought of a bit of a brilliant idea.
It just came to me, ^_^ and personally I find it beautifully elegant. More
beautiful than anything you could ever do with a rewrite...

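# Match /index.php/ plus every "action path" (one alternative per action).
location ~ /(index.php5?/|wiki|view|render|print|viewsource|purge|(form)?edit|submit|history|info|credits|(un)?watch|(un)?delete|revert|rollback|(un)?protect|markpatrolled|validate|deletetrackback|dublincore|creativecommons) {
    include fastcgi.conf;
    # Hand every matched request to MediaWiki's single entry point.
    fastcgi_param SCRIPT_FILENAME $document_root/index.php;
    fastcgi_param SCRIPT_NAME /index.php;
    fastcgi_param PHP_SELF /index.php;
    fastcgi_param SCRIPT_URL /index.php;
    # Set only for completeness; see the note on PATH_INFO below.
    fastcgi_param PATH_INFO $fastcgi_script_name;
    # "php" is a named upstream, presumably defined in one of the included files.
    fastcgi_pass php;
}

include /etc/nginx/default.conf;
include /etc/nginx/php.conf;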

The one line is a bit long because I have an affinity for action paths.
Honestly, that could be reduced to a single "location /wiki {" line if
you just wanted short URLs (actually, that's basically what I listed on
the English nginx wiki).

Quite simply... the regex matching "/wiki", "/edit", etc... whatever...
matches as if it were a directory and $fastcgi_script_name matches
whatever's after it, i.e. for /wiki/Foobar, $fastcgi_script_name is /Foobar.
Really, the PATH_INFO is just there out of a preference for completeness.
MediaWiki uses the REQUEST_PATH so it's not really used (actually that's
a good thing, since I needed to add a / after the index.php5?).
It takes all those requests and points them to /index.php,
funneling them all through MediaWiki and using it as a web
application.

No cruddy rewrite rules trying to rewrite some ugly path into something
that vaguely matches a query to index.php... Instead all requests to the
wiki are just funneled through the wiki letting the software handle all
the paths like it was built to.
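
As a rough illustration of which requests that location block would catch, the pattern can be tried out in Python. This is only a sketch: the regex is copied from the config above, `is_funneled` is a hypothetical helper, and the sample URIs are made up. It mimics the unanchored matching of an nginx regex location, but it does not model how nginx fills in $fastcgi_script_name or the other FastCGI parameters.

```python
import re

# The location regex from the post, copied verbatim and split only for line length.
ACTION_PATHS = re.compile(
    r"/(index.php5?/|wiki|view|render|print|viewsource|purge|(form)?edit|submit"
    r"|history|info|credits|(un)?watch|(un)?delete|revert|rollback|(un)?protect"
    r"|markpatrolled|validate|deletetrackback|dublincore|creativecommons)"
)


def is_funneled(uri):
    """True if the pattern matches anywhere in the URI (unanchored search)."""
    return ACTION_PATHS.search(uri) is not None


# Made-up sample URIs, printed with the result of the match.
for uri in ("/wiki/Foobar", "/edit/Foobar", "/index.php/Foobar",
            "/images/logo.png", "/information.html"):
    print(uri, is_funneled(uri))
```

Because the pattern is neither anchored with ^ nor terminated, it also matches URIs such as /information.html. Anchoring it would be a reasonable tightening, though that goes beyond what the post describes.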

Sorry about the long code... It's just that I get an indescribable feeling
looking at this kind of elegant code. I was in awe for a fair bit of
time after this idea hit me and worked...




tstarling at wikimedia

Aug 26, 2008, 7:32 PM

Post #11 of 13 (1629 views)
Re: Apache mod_php in wikipedia [In reply to]

Brion Vibber wrote:
> Daniel Friesen wrote:
>> As for lighttpd vs. Apache: lighttpd is a lightweight webserver, and it does
>> have a real performance benefit over serving out images with bloated Apache.
>
> We started using lighty to serve images in 2005:
> http://meta.wikimedia.org/wiki/November_2005_image_server
>
> Domas did a fair amount of benchmarking and testing on this over the
> months prior to the switch, including working with the author on some
> fixes, and it was a pretty clear winner at the time.

Domas tested CPU performance, on which lighttpd did much better than Apache,
especially in the large file case, because at the time, lighttpd used
sendfile() and Apache didn't. Since 2.0.44, Apache has sendfile support,
and our storage servers use negligible CPU anyway, they're disk bound. So
it's pretty likely that we could put apache on them with no significant
performance loss.
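
For readers unfamiliar with sendfile(), the difference described above can be sketched in Python. This is an illustration only, not how lighttpd or Apache are implemented: `send_with_copy`, `send_with_sendfile` and `demo` are hypothetical helpers, and the demo pushes a small temporary file through a local socket pair. `socket.sendfile()` uses `os.sendfile()` where the platform supports it and otherwise falls back to plain `send()`.

```python
import os
import socket
import tempfile


def send_with_copy(sock, path, chunk_size=65536):
    """Userspace copy: read each chunk into a buffer, then write it to the socket."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            sock.sendall(chunk)
            sent += len(chunk)
    return sent


def send_with_sendfile(sock, path):
    """Let the kernel move the file to the socket where os.sendfile() is available."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo():
    """Send the same small file both ways; report bytes sent and whether it arrived intact."""
    payload = b"x" * 4096  # small enough to fit in the socket buffer
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    results = {}
    try:
        for name, sender in (("copy", send_with_copy),
                             ("sendfile", send_with_sendfile)):
            out_sock, in_sock = socket.socketpair()
            with out_sock, in_sock:
                sent = sender(out_sock, path)
                out_sock.shutdown(socket.SHUT_WR)  # signal end of data
                received = bytearray()
                while True:
                    data = in_sock.recv(65536)
                    if not data:
                        break
                    received.extend(data)
            results[name] = (sent, bytes(received) == payload)
    finally:
        os.unlink(path)
    return results


if __name__ == "__main__":
    print(demo())
```

Both paths deliver identical bytes. The CPU difference comes from the first one copying every chunk through the server process, which matters less once the servers are disk bound.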

-- Tim Starling




midom.lists at gmail

Aug 27, 2008, 12:15 AM

Post #12 of 13 (1627 views)
Re: Apache mod_php in wikipedia [In reply to]

Hi!

> sendfile() and Apache didn't. Since 2.0.44, Apache has sendfile support,
> and our storage servers use negligible CPU anyway, they're disk bound. So
> it's pretty likely that we could put apache on them with no significant
> performance loss.


Well, there's another side issue - the memory used.
An event-model-based server doesn't have to spawn that many children, and
even with all the copy-on-write efficiency, the memory footprint of Apache
2.0 would be way higher. Of course, 2.2 has the event-based model
too.
And memory wasted = memory not used for filesystem caches.

But yeah, at the time lighty was the champ, and now it probably
doesn't make much difference.

I also liked lighttpd configuration simplicity :)

--
Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]





midom.lists at gmail

Aug 27, 2008, 12:17 AM

Post #13 of 13 (1623 views)
Re: Apache mod_php in wikipedia [In reply to]

Hi!

> By itself, certainly, but I wasn't sure if it had an advantage serving
> to Squid as well.

Squid may not buffer the entire result immediately, so writing to the Squids
sometimes blocks. That is tolerable with lots and lots of app
servers, but the few poor image servers had better be more efficient :)

> If Domas said so, though, I'm happy to believe him.
> :)


woot!

--
Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]



