Mailing List Archive: Wikipedia: Wikitech

HTML/Inline/Visual Diff

 

 


guyvdb at gmail

Aug 17, 2008, 5:39 PM

Post #1 of 11 (735 views)
HTML/Inline/Visual Diff

Hi,

I think the HTML diff page I've been developing for the Google Summer
of Code is ready to be tested as an experimental feature. You enable
it by setting $wgEnableHtmlDiff to true in r39564. What you'll see is
a rendered version of the diff page with indications where words were
added or removed. Image edits are supported too. Words that got a
different style are underlined and you get an English (only, for now)
explanation of what happened.
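
A minimal sketch of switching the feature on. Only the variable name comes from the post above; placing it in LocalSettings.php, where MediaWiki site configuration normally lives, is an assumption here:

```php
<?php
// LocalSettings.php (assumed location for this setting)
// Turn on the experimental HTML diff introduced in r39564.
$wgEnableHtmlDiff = true;
```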

The interface is pretty basic and needs work. I'm not very good with
cross-browser stuff though. I can provide metadata in the HTML such
as descriptions, IDs, pointers to the previous and next change, etc.
Usability can be enhanced by adding links that take you to the first
or last change on the page, tool tips that open when clicking a
change, or keyboard shortcuts that scroll through the changes. Help is
appreciated in this department.
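
One way such metadata could be emitted is sketched below. The function name, the CSS class, and the attribute names (`data-previous`, `data-next`) are all hypothetical and are not taken from HTMLDiff.php; the point is only that each change carries an id plus pointers to its neighbours, which a script could use for "jump to next change" links or keyboard shortcuts.

```php
<?php
/**
 * Wrap each change in a span carrying an id plus the ids of its
 * neighbours. All class and attribute names here are hypothetical.
 *
 * @param array $changes List of arrays with 'text' and 'description' keys
 * @return string HTML fragment
 */
function renderChangeSpans( array $changes ) {
	$html = '';
	$count = count( $changes );
	foreach ( array_values( $changes ) as $i => $change ) {
		$attrs = array(
			'class' => 'diff-change',
			'id' => "change-$i",
			'title' => $change['description'],
		);
		// The first change has no predecessor, the last has no successor.
		if ( $i > 0 ) {
			$attrs['data-previous'] = 'change-' . ( $i - 1 );
		}
		if ( $i < $count - 1 ) {
			$attrs['data-next'] = 'change-' . ( $i + 1 );
		}
		$html .= '<span';
		foreach ( $attrs as $name => $value ) {
			$html .= ' ' . $name . '="' . htmlspecialchars( $value, ENT_QUOTES ) . '"';
		}
		$html .= '>' . htmlspecialchars( $change['text'], ENT_QUOTES ) . '</span>';
	}
	return $html;
}
```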

I spent a lot of time optimizing the code (include/HTMLDiff.php) for
speed which makes the code less readable but performance is an issue.
PHP is not my native tongue and the code would probably run faster if
an expert took a look at it. I think the performance is pretty decent
as it is (what do you expect from code that needs to parse 2 pages,
diff every single word and keep everything in memory). The algorithm
will probably choke on big pages (set your available memory high!).
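
To illustrate why word-level diffing gets memory-hungry, here is a textbook longest-common-subsequence diff over words. This is an illustrative sketch only and is not the algorithm in HTMLDiff.php; the function name `wordDiff` is made up for this example. The table it builds has (n+1) x (m+1) entries for pages of n and m words.

```php
<?php
/**
 * Word-level diff via a longest-common-subsequence table.
 * Illustrative only: a textbook LCS, not the HTMLDiff.php algorithm.
 *
 * @return array List of array( op, word ) with op one of '=', '+', '-'
 */
function wordDiff( $old, $new ) {
	$a = preg_split( '/\s+/', trim( $old ), -1, PREG_SPLIT_NO_EMPTY );
	$b = preg_split( '/\s+/', trim( $new ), -1, PREG_SPLIT_NO_EMPTY );
	$n = count( $a );
	$m = count( $b );

	// $lcs[$i][$j] = LCS length of $a[$i..] and $b[$j..].
	// This table is what makes memory use grow quickly with page size.
	$lcs = array_fill( 0, $n + 1, array_fill( 0, $m + 1, 0 ) );
	for ( $i = $n - 1; $i >= 0; $i-- ) {
		for ( $j = $m - 1; $j >= 0; $j-- ) {
			$lcs[$i][$j] = ( $a[$i] === $b[$j] )
				? $lcs[$i + 1][$j + 1] + 1
				: max( $lcs[$i + 1][$j], $lcs[$i][$j + 1] );
		}
	}

	// Walk the table to emit kept, removed and added words.
	$ops = array();
	$i = 0;
	$j = 0;
	while ( $i < $n && $j < $m ) {
		if ( $a[$i] === $b[$j] ) {
			$ops[] = array( '=', $a[$i] );
			$i++;
			$j++;
		} elseif ( $lcs[$i + 1][$j] >= $lcs[$i][$j + 1] ) {
			$ops[] = array( '-', $a[$i++] );
		} else {
			$ops[] = array( '+', $b[$j++] );
		}
	}
	while ( $i < $n ) {
		$ops[] = array( '-', $a[$i++] );
	}
	while ( $j < $m ) {
		$ops[] = array( '+', $b[$j++] );
	}
	return $ops;
}
```

For example, `wordDiff( 'the quick fox', 'the slow fox' )` keeps "the", removes "quick", adds "slow", and keeps "fox".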

Huge changes make the page look messy but that can't be avoided. In my
biased opinion the results look very good for reasonably sized pages
and versions that are not too distant.

So here is where your feedback and bug reports kick in.

Cheers,

Guy

_______________________________________________
Wikitech-l mailing list
Wikitech-l[at]lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


roan.kattouw at home

Aug 18, 2008, 6:56 AM

Post #2 of 11 (699 views)
Re: HTML/Inline/Visual Diff

Guy Van den Broeck wrote:
> I spent a lot of time optimizing the code (include/HTMLDiff.php) for
> speed which makes the code less readable but performance is an issue.
> PHP is not my native tongue and the code would probably run faster if
> an expert took a look at it. I think the performance is pretty decent
> as it is (what do you expect from code that needs to parse 2 pages,
> diff every single word and keep everything in memory). The algorithm
> will probably choke on big pages (set your available memory high!).
I cleaned up the code a bit in r39585. I rewrote two loops, so that may
influence performance (haven't done any tests or benchmarks). In the
optimization department I can't really help you with more than these
generic tips:
* Put wfProfileIn() and wfProfileOut() calls all over the place and do
some profiling to see which functions are bottlenecks
* If you're foreach()ing large arrays somewhere, try to use references:
foreach($arr as $key => &$value) instead of foreach($arr as $key =>
$value)
The latter makes a copy of $arr whereas the former doesn't. The
former also allows you to change $value.
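
A small sketch of the two forms, using made-up variable names. It shows the behavioural difference (whether the loop body can modify the array); any speed difference is better measured than assumed.

```php
<?php
$words = array( 'added', 'removed', 'changed' );

// By value: $word is a copy, so the assignment does not touch $words.
foreach ( $words as $key => $word ) {
	$word = strtoupper( $word );
}
$afterByValue = $words; // still lowercase

// By reference: $word aliases the element, so $words is modified in place.
foreach ( $words as $key => &$word ) {
	$word = strtoupper( $word );
}
unset( $word ); // drop the lingering reference to the last element
```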

I'll start experimenting with HTMLDiff on my wiki now, input will follow.

Roan Kattouw (Catrope)



guyvdb at gmail

Aug 18, 2008, 7:51 AM

Post #3 of 11 (703 views)
Re: HTML/Inline/Visual Diff

2008/8/18 Roan Kattouw <roan.kattouw[at]home.nl>:
> Guy Van den Broeck wrote:
>> I spent a lot of time optimizing the code (include/HTMLDiff.php) for
>> speed which makes the code less readable but performance is an issue.
>> PHP is not my native tongue and the code would probably run faster if
>> an expert took a look at it. I think the performance is pretty decent
>> as it is (what do you expect from code that needs to parse 2 pages,
>> diff every single word and keep everything in memory). The algorithm
>> will probably choke on big pages (set your available memory high!).
> I cleaned up the code a bit in r39585. I rewrote two loops, so that may
> influence performance (haven't done any tests or benchmarks). In the
> optimization department I can't really help you with more than these
> generic tips:
> * Put wfProfileIn() and wfProfileOut() calls all over the place and do
> some profiling to see which functions are bottlenecks
My experience is that wfProfile adds too much overhead for the diff
code. There are just too many nested loops and the function call is
pretty expensive. I use the Xdebug profiler. I assume it is at least
as accurate as wfProfile.
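
The overhead of per-iteration profiler calls can be checked with a rough timing sketch like the one below. `profileIn()` and `profileOut()` are hypothetical stand-ins written for this example; they are not MediaWiki's wfProfileIn()/wfProfileOut(), and the numbers will vary by machine and PHP version.

```php
<?php
// Hypothetical stand-ins for a function-call-based profiler;
// these are NOT MediaWiki's wfProfileIn()/wfProfileOut().
$profileLog = array();
function profileIn( $name ) {
	global $profileLog;
	$profileLog[$name] = microtime( true );
}
function profileOut( $name ) {
	global $profileLog;
	$profileLog[$name] = microtime( true ) - $profileLog[$name];
}

/** Sum 0..$n-1, optionally wrapping every iteration in profiler calls. */
function sumLoop( $n, $instrumented ) {
	$sum = 0;
	for ( $i = 0; $i < $n; $i++ ) {
		if ( $instrumented ) {
			profileIn( 'inner' );
		}
		$sum += $i;
		if ( $instrumented ) {
			profileOut( 'inner' );
		}
	}
	return $sum;
}

/** Wall-clock seconds taken by one call to sumLoop(). */
function timeLoop( $n, $instrumented ) {
	$start = microtime( true );
	sumLoop( $n, $instrumented );
	return microtime( true ) - $start;
}

$plain = timeLoop( 200000, false );
$instrumented = timeLoop( 200000, true );
printf( "plain: %.4fs, instrumented: %.4fs\n", $plain, $instrumented );
```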

> * If you're foreach()ing large arrays somewhere, try to use references:
> foreach($arr as $key => &$value) instead of foreach($arr as $key =>
> $value)
> The latter makes a copy of $arr whereas the former doesn't. The
> former also allows you to change $value.
>
Doesn't make a significant difference here, added it anyway.

> I'll start experimenting with HTMLDiff on my wiki now, input will follow.
Great! Is your wiki publicly available? I don't have a public test
server of my own.



Simetrical+wikilist at gmail

Aug 18, 2008, 10:56 AM

Post #4 of 11 (698 views)
Re: HTML/Inline/Visual Diff

On Mon, Aug 18, 2008 at 9:56 AM, Roan Kattouw <roan.kattouw[at]home.nl> wrote:
> * If you're foreach()ing large arrays somewhere, try to use references:
> foreach($arr as $key => &$value) instead of foreach($arr as $key =>
> $value)
> The latter makes a copy of $arr whereas the former doesn't. The
> former also allows you to change $value.

I looked this syntax up, and found this interestingly braindead gotcha
that can occur when you do this. It's PHP bug 29992, of course marked
BOGUS: <http://bugs.php.net/bug.php?id=29992>. Test case:

<?php
$array = array( 1, 2, 3 );
foreach( $array as &$item );
foreach( $array as $item );
print_r( $array );

Outputs

Array
(
    [0] => 1
    [1] => 2
    [2] => 2
)

Clever, huh? Naturally it can't be changed because "people might use
this for some weird reason". It's probably a good idea to either not
use this syntax, or make sure you unset the variable after the loop:

foreach( $array as &$item ) { ... }
unset( $item );
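
The cause is that after the first loop `$item` is still a reference to the last element, so the second loop assigns each value in turn into that element. The sketch below wraps the test case in two illustrative functions (the function names are made up) to compare the result with and without the `unset()`:

```php
<?php
/** Reproduces the gotcha: $item still aliases the last element. */
function withoutUnset() {
	$array = array( 1, 2, 3 );
	foreach ( $array as &$item );
	// $item is still a reference to $array[2] here, so the next loop
	// writes 1, then 2, then (the already overwritten) 2 into it.
	foreach ( $array as $item );
	return $array;
}

/** Same loops, but the reference is broken before $item is reused. */
function withUnset() {
	$array = array( 1, 2, 3 );
	foreach ( $array as &$item );
	unset( $item ); // $item no longer points into $array
	foreach ( $array as $item );
	return $array;
}

print_r( withoutUnset() ); // elements: 1, 2, 2
print_r( withUnset() );    // elements: 1, 2, 3
```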



roan.kattouw at home

Aug 18, 2008, 10:58 AM

Post #5 of 11 (696 views)
Re: HTML/Inline/Visual Diff

Aryeh Gregor wrote:
> On Mon, Aug 18, 2008 at 9:56 AM, Roan Kattouw <roan.kattouw[at]home.nl> wrote:
>
>> * If you're foreach()ing large arrays somewhere, try to use references:
>> foreach($arr as $key => &$value) instead of foreach($arr as $key =>
>> $value)
>> The latter makes a copy of $arr whereas the former doesn't. The
>> former also allows you to change $value.
>>
>
> I looked this syntax up, and found this interestingly braindead gotcha
> that can occur when you do this. It's PHP bug 29992, of course marked
> BOGUS: <http://bugs.php.net/bug.php?id=29992>. Test case:
>
> <?php
> $array = array( 1, 2, 3 );
> foreach( $array as &$item );
> foreach( $array as $item );
> print_r( $array );
>
> Outputs
>
> Array
> (
> [0] => 1
> [1] => 2
> [2] => 2
> )
>
> Clever, huh? Naturally it can't be changed because "people might use
> this for some weird reason". It's probably a good idea to either not
> use this syntax, or make sure you unset the variable after the loop:
>
> foreach( $array as &$item ) { ... }
> unset( $item );
I know about this. But as I understand it, this bug can only occur when
you mix non-referenced foreach() loops with referenced ones. If you use
a reference in *every* loop (which is what Guy did), you should be fine.
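
A quick check of that reading, using illustrative function names: two by-reference loops over the same array leave it intact, because each loop rebinds `$item`. One caveat is that a plain assignment to `$item` after the loop still writes through the leftover reference.

```php
<?php
/** Two by-reference loops over the same array: each loop rebinds $item. */
function referenceEveryLoop() {
	$array = array( 1, 2, 3 );
	foreach ( $array as &$item );
	foreach ( $array as &$item );
	return $array;
}

/** A later plain assignment still writes through the leftover reference. */
function assignAfterLoop() {
	$array = array( 1, 2, 3 );
	foreach ( $array as &$item );
	$item = 99; // $item still aliases $array[2]
	return $array;
}
```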

Roan Kattouw (Catrope)



guyvdb at gmail

Aug 20, 2008, 5:48 AM

Post #6 of 11 (683 views)
Re: HTML/Inline/Visual Diff

Is anyone actually interested in this feature? My GSoC is officially
over now and I need to decide how much work I want to put into the
HTML differ voluntarily.



brion at wikimedia

Aug 20, 2008, 9:00 AM

Post #7 of 11 (687 views)
Re: HTML/Inline/Visual Diff

Guy Van den Broeck wrote:
> Is anyone actually interested in this feature? My GSoC is officially
> over now and I need to decide how much work I want to put in to the
> HTML differ voluntarily.

Well, I am! :)

It could make for much more legible RSS feeds, for example.

-- brion


roan.kattouw at home

Aug 20, 2008, 10:51 AM

Post #8 of 11 (681 views)
Re: HTML/Inline/Visual Diff

Brion Vibber wrote:
> Guy Van den Broeck wrote:
>
>> Is anyone actually interested in this feature? My GSoC is officially
>> over now and I need to decide how much work I want to put in to the
>> HTML differ voluntarily.
>>
>
> Well, I am! :)
>
> It could make for much more legible RSS feeds, for example.
Maybe someone with a public test wiki could enable this? Maybe
test.wikipedia.org could?

Roan Kattouw (Catrope)



dan_the_man at telus

Aug 22, 2008, 1:34 AM

Post #9 of 11 (667 views)
Re: HTML/Inline/Visual Diff

I did for sandbox.wiki-tools.com; however, the whole thing looks utterly
broken to me. I've disabled it since it basically breaks my ability to
diff things in the sandbox.

~Daniel Friesen(Dantman, Nadir-Seen-Fire) of:
-The Nadir-Point Group (http://nadir-point.com)
--It's Wiki-Tools subgroup (http://wiki-tools.com)
--The ElectronicMe project (http://electronic-me.org)
--Games-G.P.S. (http://ggps.org)
-And Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
--Animepedia (http://anime.wikia.com)
--Narutopedia (http://naruto.wikia.com)



guyvdb at gmail

Aug 22, 2008, 1:54 AM

Post #10 of 11 (667 views)
Re: HTML/Inline/Visual Diff

Can you elaborate on what you think is "utterly broken"?



dan_the_man at telus

Aug 22, 2008, 2:40 AM

Post #11 of 11 (670 views)
Re: HTML/Inline/Visual Diff

Everything in a technical form. If you look at my sandbox, you'll notice
I mainly use it for looking at technical bits of the parser. No real
articles there. Basically I'm saying that you can't get any useful
output out of it for anything but a text article.

~Daniel Friesen (Dantman, Nadir-Seen-Fire)
