
huskyr at gmail
Nov 24, 2006, 3:29 AM
Post #5 of 10
You could also try this link if you want general statistics on Wikipedia:

http://stats.wikimedia.org/EN/Sitemap.htm

-- 
Hay Kranen / [[User:Husky]]

On 11/24/06, Gregory Maxwell <gmaxwell[at]gmail.com> wrote:
>
> On 11/24/06, Antonio Gulli <gulli[at]di.unipi.it> wrote:
> > Is wiki using apache web server or something equivalent server?
> > I was referring to the access.log file
>
> Although we use Apache, we do not store an access.log.
> We also use squid, but have disabled logging in that as well.
>
> At peak we are serving over 20,000 requests per second. At this
> activity level logging would present a non-negligible performance and
> administrative overhead.
>
> Let's pretend for a moment that all accesses hit apache:
>
> My local mediawiki installation on apache produces log entries of
> 232.13 bytes per hit on average. I would expect that my log entries
> would be shorter than the entries we'd see in production.
>
> Over a day we are receiving about 1,188,345,600 http requests.
>
> This would be 256.9 GiB/day in access logs.
>
> At 7.8 terabytes of log data to simply preserve a month's history,
> keeping full access logs would be both unreasonable and wasteful.
>
> If you have some especially interesting research ideas, and your
> research can be done on smaller amounts of data that we might be
> collecting (such as the wikicharts data) then I would be glad to
> discuss the possibilities. But it would be best to take that
> discussion off list...
> _______________________________________________
> foundation-l mailing list
> foundation-l[at]wikimedia.org
> http://mail.wikipedia.org/mailman/listinfo/foundation-l
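
For anyone who wants to check the log-size estimate quoted above, here is a rough sketch of the arithmetic in Python. The two input figures (232.13 bytes per hit and 1,188,345,600 requests per day) come straight from the quoted message. The use of binary units (GiB, TiB) and a 31-day month for the monthly total are assumptions on my part, chosen because they appear to reproduce the quoted 7.8 terabytes; the names in the code are purely illustrative.

```python
# Inputs taken from the quoted message.
BYTES_PER_HIT = 232.13             # average access-log entry size on a local install
REQUESTS_PER_DAY = 1_188_345_600   # HTTP requests received per day

# Assumption: binary units, as suggested by the "GiB" in the message.
GIB = 2**30
TIB = 2**40
SECONDS_PER_DAY = 86_400


def log_bytes(requests: int, bytes_per_hit: float, days: int = 1) -> float:
    """Estimated access-log volume in bytes for the given number of days."""
    return requests * bytes_per_hit * days


daily = log_bytes(REQUESTS_PER_DAY, BYTES_PER_HIT)
avg_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY

print(f"average rate: {avg_rps:,.0f} requests/s")
print(f"per day:      {daily / GIB:.1f} GiB")

# Assumption: the "month" in the message is 30 or 31 days; both are shown.
for days in (30, 31):
    monthly = log_bytes(REQUESTS_PER_DAY, BYTES_PER_HIT, days)
    print(f"{days} days:      {monthly / TIB:.2f} TiB")
```

Under these assumptions the daily figure comes out at about 256.9 GiB, matching the message, and the average rate is a little under 14,000 requests per second, which is consistent with a peak above 20,000. The monthly total lands near 7.5 TiB for 30 days and near 7.8 TiB for 31 days.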