
Mailing List Archive: Wikipedia: Mediawiki

xml import parse error on plusmn entity

 

 



jimhu at tamu

Jun 20, 2008, 10:26 AM

Post #1 of 3 (235 views)
xml import parse error on plusmn entity

I'm thinking this may be a php bug rather than a mw problem - but I'm
wondering how to get around it. I generate MW xml for importing pages
and I use htmlentities to encode things for xml. But I just saw a
problem with the XML parser failing to recognize the &plusmn; entity.

Any suggestions?

Jim

=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054


_______________________________________________
MediaWiki-l mailing list
MediaWiki-l[at]lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l


brion at wikimedia

Jun 20, 2008, 11:15 AM

Post #2 of 3 (222 views)
Re: xml import parse error on plusmn entity [In reply to]


Jim Hu wrote:
> I'm thinking this may be a php bug rather than a mw problem - but I'm
> wondering how to get around it. I generate MW xml for importing pages
> and I use htmlentities to encode things for xml. But I just saw a
> problem with the XML parser failing to recognize the &plusmn; entity.

&plusmn; has no inherent meaning in XML; it would have to be defined via
the doctype, either in an external DTD or in an entity declaration in
the document's internal subset.

Instead of htmlentities(), use htmlspecialchars(), which is safe for XML
because it emits only the XML-predefined entity references &amp;, &lt;,
&gt;, and &quot;.

Ensure your text is properly encoded (e.g., UTF-8 unless your XML file
declares a different encoding).

- -- brion
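
The difference between the two functions can be seen in a minimal sketch. The helper name escapeForXml() is hypothetical, and the ENT_COMPAT | ENT_XML1 flags and the UTF-8 charset are assumptions about the target document, not something stated in the thread:

```php
<?php
// Hypothetical helper; the thread only recommends htmlspecialchars() itself.
function escapeForXml(string $text): string
{
    // ENT_COMPAT escapes &, <, > and double quotes only.
    // ENT_XML1 and 'UTF-8' are assumptions about the output document.
    return htmlspecialchars($text, ENT_COMPAT | ENT_XML1, 'UTF-8');
}

// "\u{00B1}" is the plus-minus sign, written as an escape so the
// example does not depend on the encoding of this source file.
$text = "5 \u{00B1} 2 < 10 & \"ok\"";

// htmlentities() emits HTML-only named entities such as &plusmn;,
// which an XML parser rejects unless a DTD declares them.
$html = htmlentities($text, ENT_COMPAT | ENT_HTML401, 'UTF-8');

// htmlspecialchars() leaves the plus-minus sign as a literal character.
$xml = escapeForXml($text);

echo $html, "\n"; // 5 &plusmn; 2 &lt; 10 &amp; &quot;ok&quot;
echo $xml, "\n";  // 5 ± 2 &lt; 10 &amp; &quot;ok&quot;
```

Because the plus-minus sign stays a literal character in the second form, the XML file has to be written in the encoding it declares.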



jimhu at tamu

Jun 23, 2008, 8:52 AM

Post #3 of 3 (196 views)
Re: xml import parse error on plusmn entity [In reply to]

Thanks Brion,

On Jun 20, 2008, at 1:15 PM, Brion Vibber wrote:

> Jim Hu wrote:
>> I'm thinking this may be a php bug rather than a mw problem - but I'm
>> wondering how to get around it. I generate MW xml for importing pages
>> and I use htmlentities to encode things for xml. But I just saw a
>> problem with the XML parser failing to recognize the &plusmn; entity.
>
> &plusmn; has no inherent meaning in XML; it would have to be defined via
> the doctype, either in an external DTD or in an entity declaration in
> the document's internal subset.
>
> Instead of htmlentities(), use htmlspecialchars(), which is safe for XML
> because it emits only the XML-predefined entity references &amp;, &lt;,
> &gt;, and &quot;.

Done! I also did something I should have done before I posted - I put
a &plusmn; in a Sandbox page and exported it to see how MW handles
it... it turns into &amp;plusmn;, which imports and converts back to
the plus-or-minus character. Nice!

Jim
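
The round trip described above can be sketched in a few lines. This is not MediaWiki's export code; it only assumes that the exporter escapes page text the way htmlspecialchars() does, and it uses htmlspecialchars_decode() as a stand-in for what an XML parser does on import:

```php
<?php
// What the wiki page contains: the entity typed as literal text.
$wikitext = 'mean &plusmn; SD';

// Escaping for the XML dump turns the ampersand itself into &amp;,
// so the dump contains no HTML-only entity references.
$inDump = htmlspecialchars($wikitext, ENT_COMPAT | ENT_XML1, 'UTF-8');

// Reading the dump undoes exactly one level of escaping.
$imported = htmlspecialchars_decode($inDump, ENT_COMPAT | ENT_XML1);

echo $inDump, "\n";   // mean &amp;plusmn; SD
echo $imported, "\n"; // mean &plusmn; SD
```

The imported text is the original wikitext again, and it is the wiki's own rendering, not the XML layer, that later shows &plusmn; as the plus-or-minus character.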

