Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Wikipedia: Wikitech

Patch for gallery tag, review questions.

 

 

Wikipedia wikitech RSS feed   Index | Next | Previous | View Threaded


kim at heldig

Apr 11, 2012, 12:27 AM

Post #1 of 8 (133 views)
Permalink
Patch for gallery tag, review questions.

I have created a patch for the gallery tag and have been given the
following review.

https://gerrit.wikimedia.org/r/4609

* JavaScript injection: you can inject javascript: URIs which execute
code when clicked
* plain links ("link=Firefox") are taken as relative URLs which will
randomly work or not work depending on where they're viewed from
* need parser test cases to demo it working

So my questions are:

What would be the recommended way of stripping away javascript from
uris? Are there any shared functions which do exactly this?
And how would i solve the plain links problem? do a regex check for an
absolute uri? e.g http://example.org/foo/bar?
And what is "parser test cases", phpunit tests? or some other form of testing?

Thank you!
Kim Eik.

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


sumanah at wikimedia

Apr 11, 2012, 10:51 AM

Post #2 of 8 (126 views)
Permalink
Re: Patch for gallery tag, review questions. [In reply to]

On 04/11/2012 03:27 AM, Kim Eik wrote:
> I have created a patch for the gallery tag and have been given the
> following review.
>
> https://gerrit.wikimedia.org/r/4609
>
> * JavaScript injection: you can inject javascript: URIs which execute
> code when clicked
> * plain links ("link=Firefox") are taken as relative URLs which will
> randomly work or not work depending on where they're viewed from
> * need parser test cases to demo it working
>
> So my questions are:
>
> What would be the recommended way of stripping away javascript from
> uris? Are there any shared functions which do exactly this?
> And how would i solve the plain links problem? do a regex check for an
> absolute uri? e.g http://example.org/foo/bar?
> And what is "parser test cases", phpunit tests? or some other form of testing?
>
> Thank you!
> Kim Eik.

Hi, Kim, and thanks for the patch!

I can answer your last question:
https://www.mediawiki.org/wiki/Parser_tests You might also want to skim
https://www.mediawiki.org/wiki/Testing_portal and
https://www.mediawiki.org/wiki/Manual:MediaWiki_architecture

Thanks!


--
Sumana Harihareswara
Volunteer Development Coordinator
Wikimedia Foundation

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


brion at pobox

Apr 11, 2012, 11:03 AM

Post #3 of 8 (126 views)
Permalink
Re: Patch for gallery tag, review questions. [In reply to]

On Wed, Apr 11, 2012 at 12:27 AM, Kim Eik <kim [at] heldig> wrote:

> I have created a patch for the gallery tag and have been given the
> following review.
>
> https://gerrit.wikimedia.org/r/4609
>
> * JavaScript injection: you can inject javascript: URIs which execute
> code when clicked
> * plain links ("link=Firefox") are taken as relative URLs which will
> randomly work or not work depending on where they're viewed from
> * need parser test cases to demo it working
>
> So my questions are:
>
> What would be the recommended way of stripping away javascript from
> uris? Are there any shared functions which do exactly this?
>

You should check to see how the 'link' parameter is handled on standalone
images.

In Parser::makeImage() look for the "case 'link':"; it uses existing
regexes to check if the link matches allowed URL schemes, and if not tries
to treat it as a page title.

And how would i solve the plain links problem? do a regex check for an
> absolute uri? e.g http://example.org/foo/bar?
> And what is "parser test cases", phpunit tests? or some other form of
> testing?
>

Parser test cases live in tests/parser/parserTests.txt, and can be run both
through the phpunit test suite and through the standalone parserTests.php
-- so a parser test failure should trigger a Jenkins test failure.

Each test case specifies input wikitext and output HTML, to confirm that
things operate as expected.

-- brion
_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


hashar+wmf at free

Apr 11, 2012, 11:01 PM

Post #4 of 8 (124 views)
Permalink
Re: Patch for gallery tag, review questions. [In reply to]

Le 11/04/12 09:27, Kim Eik a écrit :
> I have created a patch for the gallery tag and have been given the
> following review.
>
> https://gerrit.wikimedia.org/r/4609
>
> * JavaScript injection: you can inject javascript: URIs which execute
> code when clicked
> * plain links ("link=Firefox") are taken as relative URLs which will
> randomly work or not work depending on where they're viewed from
<snip>
> What would be the recommended way of stripping away javascript from
> uris? Are there any shared functions which do exactly this?
> And how would i solve the plain links problem? do a regex check for an
> absolute uri? e.g http://example.org/foo/bar?

I have added some inline comment on includes/parser/Parser.php patch #7

https://gerrit.wikimedia.org/r/#patch,unified,4609,7,includes/parser/Parser.php

Copy pasting it here for later reference:

----------------------------------------------------------------------
const EXT_URL_REGEX =
'/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+
)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&amp;?
([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';

We would need a parser guru to find out a similar and simpler regex.
Anyway you will find hints in includes/parser/Parser.php
wfUrlProtocols() gives a regex of protocols allowed in URLs.

Parser::EXT_LINK_URL_CLASS is a regex of character allowed and of those
disallowed. That makes sure you find out the end of the URL with various
funny case such as 0+3000 which is an ideographic space and is used on
Chinese wikis.

Since what you are trying to achieve is really similar to the 'link'
parameter handling in parser::makeImage() . Some relevant code:

case 'link':
$chars = self::EXT_LINK_URL_CLASS;
$prots = $this->mUrlProtocols; // which is wfUrlProtocols()
if ( preg_match( "/^($prots)$chars+$/u", $value, $m ) ) {
$paramName = 'link-url';
$this->mOutput->addExternalLink(
$value );
if (
$this->mOptions->getExternalLinkTarget() ) {
$params[$type]['link-target'] =
$this->mOptions->getExternalLinkTarget();
}
Well you get the idea :-)
----------------------------------------------------------------------

Reading my text again I should have reread myself before saving that
comment. Anyway, I am pretty sure we can factor out the code handling
'link' for image and what you are trying to do.


--
Antoine "hashar" Musso


_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


roan.kattouw at gmail

Apr 11, 2012, 11:33 PM

Post #5 of 8 (125 views)
Permalink
Re: Patch for gallery tag, review questions. [In reply to]

On Apr 11, 2012 11:01 PM, "Antoine Musso" <hashar+wmf [at] free> wrote:
> const EXT_URL_REGEX =
> '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+
>
)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&amp;?
>
([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';
>
ZOMG. Anyway, what I'd do if I had a MediaWiki clone handy is look through
Sanitizer.php to see if there's anything in there that handles URLs.

Roan
_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


lists at nadir-seen-fire

Apr 12, 2012, 12:42 AM

Post #6 of 8 (124 views)
Permalink
Re: Patch for gallery tag, review questions. [In reply to]

On Wed, 11 Apr 2012 23:33:53 -0700, Roan Kattouw <roan.kattouw [at] gmail>
wrote:

> On Apr 11, 2012 11:01 PM, "Antoine Musso" <hashar+wmf [at] free> wrote:
>> const EXT_URL_REGEX =
>> '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+
>>
> )?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&amp;?
>>
> ([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';
>>
> ZOMG. Anyway, what I'd do if I had a MediaWiki clone handy is look
> through
> Sanitizer.php to see if there's anything in there that handles URLs.
>
> Roan

There's always gitweb. For the de-jure standard repo viewer its urls and
navigation is awful (though that's not git's fault, as you can see by some
of the other repo viewers that do it better) but it works.
All we've got in Sanitizer is href="" handling in attribute sanitization.
https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=includes/Sanitizer.php;h=a2459c43b5920ee414746aa5053d451d52f04861;hb=HEAD#l750
https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=includes/Sanitizer.php;h=a2459c43b5920ee414746aa5053d451d52f04861;hb=HEAD#l793

This case should really be handled by checking against wfUrlProtocols. And
then anything that doesn't match gets sent thorough Title::newFromText.
And anything that further causes Title to return null/false should be
ignored.

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


hashar+wmf at free

Apr 12, 2012, 1:43 AM

Post #7 of 8 (126 views)
Permalink
Re: Patch for gallery tag, review questions. [In reply to]

Le 12/04/12 08:33, Roan Kattouw a écrit :
<snip some long regex>
> ZOMG. Anyway, what I'd do if I had a MediaWiki clone handy is look through
> Sanitizer.php to see if there's anything in there that handles URLs.

Parser.php already has some code when dealing with link= parameter for
image tags. That is the code I copy pasted above :-D

--
Antoine "hashar" Musso


_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


kim at heldig

Apr 12, 2012, 4:55 AM

Post #8 of 8 (125 views)
Permalink
Re: Patch for gallery tag, review questions. [In reply to]

My latest patch reuses regex already defined in Parser.php. please
review @ https://gerrit.wikimedia.org/r/#change,4609

On Thu, Apr 12, 2012 at 10:43 AM, Antoine Musso <hashar+wmf [at] free> wrote:
> Le 12/04/12 08:33, Roan Kattouw a écrit :
> <snip some long regex>
>> ZOMG. Anyway, what I'd do if I had a MediaWiki clone handy is look through
>> Sanitizer.php to see if there's anything in there that handles URLs.
>
> Parser.php already has some code when dealing with link= parameter for
> image tags.  That is the code I copy pasted above :-D
>
> --
> Antoine "hashar" Musso
>
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l [at] lists
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Wikipedia wikitech RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.