
jimhu at tamu
Jun 23, 2008, 8:52 AM
Post #3 of 3
Re: xml import parse error on plusmn entity
Thanks Brion,

On Jun 20, 2008, at 1:15 PM, Brion Vibber wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Jim Hu wrote:
>> I'm thinking this may be a php bug rather than a mw problem - but I'm
>> wondering how to get around it. I generate MW xml for importing pages
>> and I use htmlentities to encode things for xml. But I just saw a
>> problem with the XML parser failing to recognize the &plusmn; entity.
>
> &plusmn; has no inherent meaning in XML; it would have to be defined via
> the doctype or directly in a processor directive in the document.
>
> Instead of htmlentities(), use htmlspecialchars() which is safe for XML
> by only using the XML-predefined character references &amp;, &lt;,
> &gt;, and &quot;.

Done! I also did something I should have done before I posted - I put a &plusmn; in a Sandbox page and exported it to see how MW handles it... it turns into a &amp;plusmn;, which imports and converts back to the plus or minus character.

Nice!

Jim

>
> Ensure your text is properly encoded (eg, UTF-8 unless your XML file is
> otherwise marked.)
>
> - -- brion
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.8 (Darwin)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkhb89wACgkQwRnhpk1wk46knwCg1RlfJYUT18TEaG3djFCQpKDR
> VjkAnR9vMF0r3gWHl3B2cgcrz1RivwTE
> =3qsd
> -----END PGP SIGNATURE-----
>
> _______________________________________________
> MediaWiki-l mailing list
> MediaWiki-l[at]lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l

=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054
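
For anyone who wants to reproduce the behaviour discussed above, here is a minimal sketch. It is written in Python rather than PHP, so it is only an analogue of the htmlentities()/htmlspecialchars() advice, not the code used in this thread. The helper name `parses` and the sample strings are made up for illustration, and `xml.sax.saxutils.escape` stands in for htmlspecialchars() on the assumption that escaping only the XML-significant characters is the relevant behaviour.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def parses(xml_text: str) -> bool:
    """Return True if a plain XML parser accepts xml_text (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Page text as an author might type it: an HTML entity name plus markup characters.
wikitext = "5 &plusmn; 2 <b>bold</b>"

# A named HTML entity placed directly in XML: plusmn is not declared anywhere,
# so the parser rejects the document.
bad = "<text>5 &plusmn; 2</text>"

# Escaping only &, < and > turns "&plusmn;" into "&amp;plusmn;", so the entity
# name travels as literal text and comes back unchanged after parsing.
good = "<text>" + escape(wikitext) + "</text>"

print(parses(bad))
print(parses(good))
print(ET.fromstring(good).text)
```

The second document carries `&amp;plusmn;` in the XML, which matches what the Sandbox export showed: the parser hands back the literal text `&plusmn;`, and it is then up to the wiki to render that as the plus-or-minus character.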