
Mailing List Archive: Wikipedia: Wikitech

ParseTree generated by the API and tag extension

 

 



alex.bernier at free

Aug 5, 2009, 1:20 AM

Post #1 of 6 (836 views)
ParseTree generated by the API and tag extension

Hello,

I am seeing what I think is strange behaviour when the parser is called
through the API on a page that uses tag extensions.
Here is the code of my tag extension:

// Register the <foo> and <bar> tag hooks with the parser.
function efParserInit() {
    global $wgParser;
    $wgParser->setHook( 'foo', 'effooRender' );
    $wgParser->setHook( 'bar', 'efbarRender' );
    return true;
}

// Render <foo>: parse the tag content as wikitext and wrap it in a div.
function effooRender( $input, $args, &$parser ) {
    $output = $parser->recursiveTagParse( $input );
    return '<div class="foo">' . $output . '</div>';
}

// Render <bar>: parse the tag content as wikitext and wrap it in a div.
function efbarRender( $input, $args, &$parser ) {
    $output = $parser->recursiveTagParse( $input );
    return '<div class="bar">' . $output . '</div>';
}

I created a page on my wiki with the following text:
<foo>test1<bar>'''test2'''</bar></foo>

When I view the HTML source in my browser, I get:
<div class="foo">test1<div class="bar"><b>test2</b></div></div>

But when I run:
wget "http://mywiki/wiki/api.php?action=query&titles=mypage&prop=revisions&rvprop=content&rvexpandtemplates&rvgeneratexml&format=xml"
I get the following in the "parsetree" attribute of the "rev" element:
&lt;ext&gt;&lt;name&gt;foo&lt;/name&gt;&lt;attr/&gt;&lt;inner&gt;test1&amp;lt;bar&amp;gt;&#039;&#039;&#039;test2&#039;&#039;&#039;&amp;lt;/bar&amp;gt;&lt;/inner&gt;&lt;close&gt;&amp;lt;/foo&amp;gt;&lt;/close&gt;&lt;/ext&gt;

(sorry for this ugly line...).
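
Unescaped once, the attribute value is easier to read. The following is a minimal PHP sketch, assuming the SimpleXML extension is available; the variable names are illustrative, and the input string is the value shown above:

```php
<?php
// Value of the "parsetree" attribute, copied verbatim from the response above.
$raw = '&lt;ext&gt;&lt;name&gt;foo&lt;/name&gt;&lt;attr/&gt;&lt;inner&gt;test1&amp;lt;bar&amp;gt;&#039;&#039;&#039;test2&#039;&#039;&#039;&amp;lt;/bar&amp;gt;&lt;/inner&gt;&lt;close&gt;&amp;lt;/foo&amp;gt;&lt;/close&gt;&lt;/ext&gt;';

// Undo one level of escaping to recover the parse tree XML itself.
$treeXml = html_entity_decode( $raw, ENT_QUOTES );

// Parse it; the root element is the <ext> node.
$ext = simplexml_load_string( $treeXml );

$extCount = count( $ext->xpath( '//ext' ) ); // <ext> elements in the whole tree
$tagName = (string)$ext->name;               // extension tag the node describes
$inner = (string)$ext->inner;                // raw, unparsed tag content

echo "ext elements: $extCount\n";
echo "tag: $tagName\n";
echo "inner: $inner\n";
```

It reports a single "ext" element named "foo", whose inner text still contains the <bar> markup as plain text.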

My problem is this: there is only one "ext" element, which corresponds to my
"foo" tag. I was expecting two "ext" elements: one for the "foo" tag and one for
the "bar" tag.


That is what I want.
Is this behaviour normal?

A bit of debugging shows that, in includes/parser/Parser.php, the
"extensionSubstitution" function is called only once when I access my
page via the API, but three times when I access it via the browser. Is the
text being parsed not the same in the two cases? Is something wrong with my
API call?

Regards,

Alex

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


brion at wikimedia

Aug 5, 2009, 10:27 AM

Post #2 of 6 (766 views)
Re: ParseTree generated by the API and tag extension

On 8/5/09 1:20 AM, Alex Bernier wrote:
> function efParserInit() {
> global $wgParser;
> $wgParser->setHook( 'foo', 'effooRender' );
> $wgParser->setHook( 'bar', 'efbarRender' );
> return true;
> }
[snip]
> I created a page on my wiki with the following text:
> <foo>test1<bar>'''test2'''</bar></foo>
[snip]
> My problem is this: there is only one "ext" element, which corresponds to my
> "foo" tag. I was expecting two "ext" elements: one for the "foo" tag and one for
> the "bar" tag.
>
>
> That is what I want.
> Is this behaviour normal?

Looks normal. Since your tag is opaque, the preparser can't descend into
its contents for additional parsing.

I'm not sure offhand whether we have appropriate interfaces already for
declaring a tag hook at setup time as containing wikitext.

-- brion



magnusmanske at googlemail

Aug 6, 2009, 5:32 AM

Post #3 of 6 (754 views)
Re: ParseTree generated by the API and tag extension

On Wed, Aug 5, 2009 at 9:20 AM, Alex Bernier<alex.bernier [at] free> wrote:
> Hello,
>
> I have what I think is strange behaviour with parser called by the API
> and tag extensions.

<snip/>

> A bit of debugging shows that, in includes/parser/Parser.php, the
> "extensionSubstitution" function is called only once when I access my
> page via the API, but three times when I access it via the browser. Is the
> text being parsed not the same in the two cases? Is something wrong with my
> API call?

If you want XML parsing of wikitext, the closest you can come to that
is, AFAIK, still my wiki2xml:
http://toolserver.org/~magnus/wiki2xml/w2x.php

Or as an extension:
http://www.mediawiki.org/wiki/Extension:Wiki2xml

It kinda works for most simple wikitext. Feel free to fix/improve.

Cheers,
Magnus



roan.kattouw at gmail

Aug 8, 2009, 4:03 AM

Post #4 of 6 (738 views)
Re: ParseTree generated by the API and tag extension

2009/8/5 Brion Vibber <brion [at] wikimedia>:
> Looks normal. Since your tag is opaque, the preparser can't descend into
> its contents for additional parsing.
>
> I'm not sure offhand whether we have appropriate interfaces already for
> declaring a tag hook at setup time as containing wikitext.
>
To clarify this possibly cryptic comment: the preprocessor sees
<foo>some crap</foo> and passes "some crap" to the rendering function
associated with <foo> without looking at it any further. It doesn't care
whether "some crap" contains additional tags or things that have
meaning in wikitext or whatever: that's for the <foo> handler to sort
out.
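
The opaque treatment described above can be imitated with a toy stand-in for the preprocessor. This is only a sketch and not MediaWiki code: the function name toyPreprocess, the node layout, and the regex-based matching are all assumptions made for illustration (the real preprocessor handles far more, including same-name nesting and attributes).

```php
<?php
/**
 * Toy stand-in for the preprocessor (hypothetical, NOT MediaWiki code):
 * split $text into literal strings and extension-tag nodes, keeping the
 * content of each tag as an opaque string.
 */
function toyPreprocess( string $text, array $tagNames ): array {
    $names = implode( '|', array_map( 'preg_quote', $tagNames ) );
    // <name>...</name> for a registered name; \1 ties the closing tag
    // to the opening one.
    $pattern = '~<(' . $names . ')>(.*?)</\1>~s';
    $nodes = [];
    $offset = 0;
    while ( preg_match( $pattern, $text, $m, PREG_OFFSET_CAPTURE, $offset ) ) {
        [ $whole, $start ] = $m[0];
        if ( $start > $offset ) {
            // Literal text before the tag.
            $nodes[] = substr( $text, $offset, $start - $offset );
        }
        // The inner text is stored as-is; nothing looks inside it here.
        $nodes[] = [ 'ext' => $m[1][0], 'inner' => $m[2][0] ];
        $offset = $start + strlen( $whole );
    }
    if ( $offset < strlen( $text ) ) {
        // Literal text after the last tag.
        $nodes[] = substr( $text, $offset );
    }
    return $nodes;
}

$tree = toyPreprocess( "<foo>test1<bar>'''test2'''</bar></foo>", [ 'foo', 'bar' ] );
var_export( $tree );
```

For the page text from the original post, the returned list holds a single node for <foo>; the <bar> markup is still just part of its inner string.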

Roan Kattouw (Catrope)



alex.bernier at free

Aug 8, 2009, 10:01 AM

Post #5 of 6 (727 views)
Re: ParseTree generated by the API and tag extension

On Sat, Aug 08, 2009 at 01:03:35PM +0200, Roan Kattouw wrote:
> 2009/8/5 Brion Vibber <brion [at] wikimedia>:
> > Looks normal. Since your tag is opaque, the preparser can't descend into
> > its contents for additional parsing.
> >
> > I'm not sure offhand whether we have appropriate interfaces already for
> > declaring a tag hook at setup time as containing wikitext.
> >
> To clarify this possibly cryptic comment: the preprocessor sees
> <foo>some crap</foo> and passes "some crap" to the rendering function
> associated to <foo> without looking at it any further. It doesn't care
> whether "some crap" contains additional tags or things that have
> meaning in wikitext or whatever: that's for the <foo> handler to sort
> out.

Is it possible to call the preprocessor in the <foo> handler ?

Regards,

Alex



roan.kattouw at gmail

Aug 8, 2009, 11:04 AM

Post #6 of 6 (735 views)
Re: ParseTree generated by the API and tag extension

2009/8/8 Alex Bernier <alex.bernier [at] free>:
> Is it possible to call the preprocessor in the <foo> handler ?
>
Indirectly, that's what's already happening, but at a later stage. The
sequence of events is:

1. Preprocessor recognizes <foo>blah<bar>blah</bar>blah</foo> as an
extension tag <foo> with content "blah<bar>blah</bar>blah"
2. Entry from 1. ends up in the parse tree
3. When parsing, the handler for <foo> is invoked, which does
something with the tag contents ("blah<bar>blah</bar>blah") and calls
recursiveTagParse() on them.
3a. This causes the preprocessor to be run on
"blah<bar>blah</bar>blah", which dissects it into a literal "blah", an
extension tag <bar> with content "blah", and another literal "blah".
3b. Based on this dissection, the parser calls the <bar> handler
3c. recursiveTagParse() returns the return value of the <bar> handler
surrounded with "blah"
3d. The <foo> handler returns something
4. The parser inserts the return value of the <foo> handler in the HTML output

As you can see, there's nested parsing going on here, and the fact
that there's a <bar> tag inside the <foo> tag is only discovered in
the inner parse (step 3a), which means it doesn't end up in the parse
tree of the outer parse (step 2). The reason it works this way is that
some tags don't *want* their contents to be parsed or preprocessed,
because they treat it as plain text, not wikitext.
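
The sequence above can be traced with a small self-contained model. It is a sketch under stated assumptions, not MediaWiki code: toyParse stands in for recursiveTagParse(), toyDivHandler builds handlers shaped like the ones in the original post, and the only wikitext rule implemented is '''bold'''. All of these names are hypothetical.

```php
<?php
/**
 * Stand-in for recursiveTagParse() (hypothetical): find registered tags in
 * $text, hand their raw content to the handlers, then render '''bold'''.
 */
function toyParse( string $text, array $handlers, ArrayObject $log, int $depth = 0 ): string {
    $names = implode( '|', array_map( 'preg_quote', array_keys( $handlers ) ) );
    $html = preg_replace_callback(
        '~<(' . $names . ')>(.*?)</\1>~s',
        function ( array $m ) use ( $handlers, $log, $depth ) {
            // Record at which nesting level this tag was discovered.
            $log->append( "depth $depth: <{$m[1]}>" );
            // The handler receives the raw, unparsed tag content.
            return $handlers[$m[1]]( $m[2], $handlers, $log, $depth + 1 );
        },
        $text
    );
    // Minimal wikitext rendering: only '''bold''' is handled in this toy.
    return preg_replace( "~'''(.+?)'''~s", '<b>$1</b>', $html );
}

/** Build a handler that parses its content and wraps it in a div. */
function toyDivHandler( string $class ): Closure {
    return function ( string $input, array $handlers, ArrayObject $log, int $depth ) use ( $class ): string {
        // Inner parse: this is the only place nested tags are discovered.
        $output = toyParse( $input, $handlers, $log, $depth );
        return "<div class=\"$class\">" . $output . '</div>';
    };
}

$handlers = [ 'foo' => toyDivHandler( 'foo' ), 'bar' => toyDivHandler( 'bar' ) ];
$log = new ArrayObject();
$html = toyParse( "<foo>test1<bar>'''test2'''</bar></foo>", $handlers, $log );

echo $html, "\n";
echo implode( "\n", $log->getArrayCopy() ), "\n";
```

In this model the outer parse (depth 0) only ever sees <foo>; <bar> first shows up at depth 1, inside the call made by the <foo> handler, which mirrors why it is missing from the outer parse tree.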

Roan Kattouw (Catrope)

