Mailing List Archive: Wikipedia: Mediawiki

JAMWiki benchmark (was Re: Alternative implementations)



tstarling at wikimedia

Jul 23, 2008, 2:59 AM

JAMWiki benchmark (was Re: Alternative implementations)

Dirk Riehle wrote:
> Here is an interesting alternative implementation for MediaWiki/Wikipedia:
>
> * http://armstrongonsoftware.blogspot.com/2008/06/itching-my-programming-nerve.html
>
> * http://video.google.com/videoplay?docid=6981137233069932108 (Wikipedia
> discussion starts 30min into the video)
>
> Basically a p2p backend that claims order of magnitude performance gains
> for writing pages. They ignore the front caches etc. Done in Erlang (+Java).
>
> I was trying to figure out whether this would really be feature parity
> but couldn't fully see it.
>
> For the rendering, they use plog4u---does someone know whether this has
> feature parity with Mediawiki (markup)? We used JAMWiki (Java
> implementation of MediaWiki) only to see later that there was no
> ParserFunctions extension available. (Why is this an extension rather
> than a core part in the first place?)

If the only thing missing from JAMWiki was ParserFunctions, that would be
very impressive. ParserFunctions is simple. And indeed, there's a lot of
really impressive code in there, although it's easy to find edge cases
that don't work the same way.

But I thought I'd better test its performance, before I got too excited
and started integrating it into MediaWiki. It turns out that it's full of
O(N^2) cases, which made my usual testing method, using repeated text to
measure per-loop performance, rather difficult.

For example, for the test text str_repeat("'''b''' ", 1000), JAMWiki
showed O(N^2) performance:

1000 iterations: 1148ms
2000 iterations: 3916ms
4000 iterations: 15320ms
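
The quadratic claim can be checked directly from these three figures. The sketch below is an editorial illustration, not part of the original message, and is in Python rather than the PHP used for the test strings. It assumes the running time follows t ~ n^k and estimates k from each doubling; the function name growth_exponents is hypothetical. An exponent near 1 would mean linear time and an exponent near 2 quadratic time.

```python
import math

# Timings reported in the message above: (iterations, milliseconds).
timings = [(1000, 1148), (2000, 3916), (4000, 15320)]

def growth_exponents(samples):
    """Estimate k in t ~ n^k for each consecutive pair of samples."""
    exponents = []
    for (n1, t1), (n2, t2) in zip(samples, samples[1:]):
        # If t = c * n^k, then t2 / t1 = (n2 / n1)^k, so k is a ratio of logs.
        exponents.append(math.log(t2 / t1) / math.log(n2 / n1))
    return exponents

exponents = growth_exponents(timings)
for (n1, _), (n2, _), k in zip(timings, timings[1:], exponents):
    print(f"{n1} -> {n2} iterations: exponent ~ {k:.2f}")
```

For the figures above, the two estimates come out at about 1.77 and 1.97, which is consistent with O(N^2) growth once the input is large enough for the quadratic term to dominate.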

For str_repeat("[http://a] ", 1000), it took so long that I gave up
waiting. MediaWiki does either of these things in linear time, on the
order of hundreds of microseconds per loop.
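
A minimal sketch of the repeated-text method described above, added as an editorial illustration in Python. It is not the harness used for these measurements. The two parse functions are hypothetical toy stand-ins, not JAMWiki or MediaWiki code: one makes a single pass over the input, the other rescans the prefix for every token, which is one common way a scanner ends up quadratic.

```python
import time

def repeated_text(unit, count):
    """Python equivalent of PHP's str_repeat(unit, count)."""
    return unit * count

def measure(parse, unit, counts):
    """Time parse() on unit repeated `count` times; return (count, seconds) pairs."""
    results = []
    for count in counts:
        text = repeated_text(unit, count)
        start = time.perf_counter()
        parse(text)
        results.append((count, time.perf_counter() - start))
    return results

def toy_linear_parse(text):
    """Hypothetical stand-in: counts bold markers in one pass over the input."""
    return text.count("'''")

def toy_quadratic_parse(text):
    """Hypothetical stand-in: same result, but rescans the prefix per token."""
    tokens = 0
    pos = text.find("'''")
    while pos != -1:
        text[:pos].count("'''")  # redundant O(N) rescan for every token found
        tokens += 1
        pos = text.find("'''", pos + 3)
    return tokens

counts = [1000, 2000, 4000]
for name, parse in [("linear", toy_linear_parse), ("quadratic", toy_quadratic_parse)]:
    for count, seconds in measure(parse, "'''b''' ", counts):
        print(f"{name}: {count} iterations: {seconds * 1000:.2f}ms")
```

Doubling the repeat count should roughly double the time for the linear stand-in and roughly quadruple it for the quadratic one, although wall-clock timings at these sizes are noisy, so several runs are worth averaging.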

It's unfortunate that a modern parser generator for a supposedly fast
language like Java can't match hand-optimised PHP for speed. It's not like
we've set a high bar here.

-- Tim Starling


_______________________________________________
MediaWiki-l mailing list
MediaWiki-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l


dirk at riehle

Jul 23, 2008, 10:06 AM

Re: JAMWiki benchmark (was Re: Alternative implementations)

Tim Starling wrote:
> If the only thing missing from JAMWiki was ParserFunctions, that would be
> very impressive. ParserFunctions is simple. And indeed, there's a lot of
> really impressive code in there, although it's easy to find edge cases
> that don't work the same way.
>

True; it was just one of the first things we ran into with basic
rendering of Wikipedia pages.

> For str_repeat("[http://a] ", 1000), it took so long that I gave up
> waiting. MediaWiki does either of these things in linear time, on the
> order of hundreds of microseconds per loop.
>

[...]

> It's unfortunate that a modern parser generator for a supposedly fast
> language like Java can't match hand-optimised PHP for speed. It's not like
> we've set a high bar here.
>

I'm not sure about not having set a high bar... However, we can confirm
the parser generator vs. hand-optimized parser issue. You just showed
that JFlex, the parser generator used by JAMWiki, doesn't scale up
nicely. We found the same for ANTLR, another parser generator for Java,
which also doesn't perform as well as MediaWiki when run against
stripped-down pages (our parser parses Wiki Creole, which at a
stripped-down level is equivalent to MediaWiki syntax) [1]. MediaWiki
performed equally well or better. In general, I think the advantage of a
parser generator is easier maintainability and clarity of the language
(you can view the grammar as a domain-specific language for describing
acceptable syntax), but not performance :-(

Thanks for your insights!

Dirk

[1] http://www.riehle.org/2008/07/19/a-grammar-for-standardized-wiki-markup/


--
Phone: + 1 (650) 215 3459, Web: http://www.riehle.org


