
jra at baylink
Aug 29, 2006, 8:49 AM
Post #16 of 22
On Mon, Aug 28, 2006 at 10:52:42PM -0400, Simetrical wrote:
> On 8/28/06, Jay R. Ashworth <jra [at] baylink> wrote:
> > Because in wikitext, everything is in-band; in XML, the structure is
> > out-of-band, on purpose. This requires an entirely different, and I
> > suspect, much more complicated diff algorithm.
>
> I don't know what "in-band" and "out-of-band" mean ([[Out of band]]
> doesn't help either),

The current diff engine, with which I'm not familiar intimately (read that as I haven't looked at the code at all, but I'm assuming it's somewhat familiar with the Unix diff internals) is working on one big object of stream text. The structural markup is *part* of that stream of text, hence, in-band.

> but if the diff engine parses the XML, it can
> look for a) changes in structure/markup and b) changes in content.

Yep, and those will interact in ways different from the ways that they do now: the current diff engine need not "trip over" the edges of objects in the way that an XML parser will have to.

> Either one should be very easy and fast to diff, given XML-parsing
> library functions (for the C++ module used on WMF sites, that is).
> Faster than present, I don't know, but the present differ is hardly a
> bottleneck.

Certainly. I wasn't suggesting that it was; rather, the opposite.

Anyone got any implementation experience with diffing XML trees?

Cheers,
-- jra
--
Jay R. Ashworth jra [at] baylink
Designer Baylink RFC 2100
Ashworth & Associates The Things I Think '87 e24
St Petersburg FL USA http://baylink.pitas.com +1 727 647 1274

The Internet: We paved paradise, and put up a snarking lot.
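
To make the in-band/out-of-band distinction above concrete, here is a rough sketch in Python. It is not the MediaWiki diff engine and not the C++ module mentioned in the thread; the function names (stream_diff, tree_diff) and the sample markup are made up for illustration. The first function diffs wikitext as one flat stream, where markup and content are indistinguishable; the second walks two parsed XML trees in parallel and reports structure changes separately from content changes. The tree walk is deliberately naive: it assumes children line up by position, ignores tail text after child elements, and does not detect moved nodes, which is where a real tree differ would likely get complicated.

```python
import difflib
import xml.etree.ElementTree as ET


def stream_diff(old, new):
    """In-band: diff the raw text line by line, markup included."""
    lines = difflib.ndiff(old.splitlines(), new.splitlines())
    # Keep only removed/added lines; drop context and "? " hint lines.
    return [ln for ln in lines if ln.startswith(("- ", "+ "))]


def tree_diff(old, new, path=""):
    """Out-of-band: naive parallel walk of two ElementTree nodes.

    Returns (kind, path, old_value, new_value) tuples, where kind is
    "structure" or "content". Hypothetical helper, illustration only.
    """
    here = f"{path}/{old.tag}"
    if old.tag != new.tag:
        # The markup itself changed; report it and do not descend further.
        return [("structure", here, old.tag, new.tag)]
    changes = []
    if (old.text or "").strip() != (new.text or "").strip():
        changes.append(("content", here, old.text, new.text))
    old_kids, new_kids = list(old), list(new)
    if len(old_kids) != len(new_kids):
        changes.append(("structure", here, len(old_kids), len(new_kids)))
    # Assumes children correspond by position (no insert/move detection).
    for o, n in zip(old_kids, new_kids):
        changes.extend(tree_diff(o, n, here))
    return changes


if __name__ == "__main__":
    # Wikitext: the markup change is just another changed line of text.
    print(stream_diff("== Title ==\nSome '''bold''' text.",
                      "== Title ==\nSome ''italic'' text."))
    # XML: a markup change and a content change are reported differently.
    base = ET.fromstring("<p>Some <b>bold</b> text.</p>")
    print(tree_diff(base, ET.fromstring("<p>Some <i>bold</i> text.</p>")))
    print(tree_diff(base, ET.fromstring("<p>Some <b>brave</b> text.</p>")))
```

The stream differ never has to know where an element starts or ends, which is the "need not trip over the edges of objects" point; the tree differ has to decide at every node boundary whether it is looking at a structural change or a content change.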