
david at kineticode
Mar 10, 2008, 3:37 PM
Post #24 of 28
On Mar 6, 2008, at 20:11, Schults, Chris wrote:

> And according to this list of ANSI characters not in ISO-8859-1:
>
> http://www.alanwood.net/demos/charsetdiffs.html#a

Yes, and this is why I wrote Encode::ZapCP1252: to convert those bogus characters to ASCII. I need to update it to optionally convert them to UTF-8.

> The characters in question are not part of ISO-8859-1, and believe
> them to be part of Windows-1252 (CP1252). Thus, I'm guessing that
> when converted to ISO-8859-1 in Bricolage, there is no match, so the
> Unicode representation is returned in the format '\x{...}'. Does
> this make sense to y'all.

No, because Bricolage expects the browser to submit UTF-8, and it stores the data as UTF-8. So it never converts from CP-1252 to ISO-8859-1. It converts from CP-1252 to UTF-8, and then later from UTF-8 to ISO-8859-1. Of course, it only takes that first step if you've set your character set preference in Bricolage to CP-1252.

Ah-ha! That's the bit I've been trying to remember about how we've recommended handling this issue in the past. Try changing your character set preference, then create a new story and paste from Word, and then try to preview it with a template that calls

    $burner->set_encoding('encoding(iso-8859-1)');

and see if it doesn't properly come out as ISO-8859-1. That should work!

Of course, the one thing I cannot understand is why you continue to get "\x{201c}", which is the Unicode left double quotation mark (U+201C).

> However, this is not usable to me, so how do I convert '\x{...}' to
> something useful?

I'm a little confused. Are you seeing a curly quote and calling it "\x{201c}" (which is how you can represent it in a Perl double-quoted string), or are you seeing the literal string "\x{201c}"?

> I've have now read more than I've ever wanted or expected to about
> Unicode, character sets and character encodings ... and I'm still
> confused. Sigh.

It's all good stuff to know, and will pay off in the long run, believe me.

Best,

David
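
For anyone following along, the two-step conversion described above (CP-1252 to UTF-8 for storage, then UTF-8 to ISO-8859-1 for output) can be sketched outside Bricolage. This is only an illustration in Python, not Bricolage's or Encode::ZapCP1252's actual code: the function names and the small `ASCII_FALLBACK` table are hypothetical. It shows why a pasted curly quote survives the first step but has no ISO-8859-1 equivalent in the second, so the output step must fail, emit an escape, or substitute a plain ASCII character.

```python
# Hypothetical ASCII stand-ins for a few characters that exist in CP-1252 but
# not in ISO-8859-1. Illustrative only; not the table any real module uses.
ASCII_FALLBACK = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
}


def cp1252_to_utf8(raw: bytes) -> bytes:
    """Step 1: interpret submitted bytes as CP-1252 and store them as UTF-8."""
    return raw.decode("cp1252").encode("utf-8")


def utf8_to_latin1(stored: bytes, errors: str = "strict") -> bytes:
    """Step 2: convert stored UTF-8 to ISO-8859-1 using the given error policy."""
    return stored.decode("utf-8").encode("iso-8859-1", errors=errors)


def utf8_to_latin1_ascii_quotes(stored: bytes) -> bytes:
    """Step 2 variant: swap known problem characters for ASCII before encoding."""
    text = stored.decode("utf-8").translate(str.maketrans(ASCII_FALLBACK))
    return text.encode("iso-8859-1")


# Curly quotes as pasted from Word: bytes 0x93 and 0x94 in CP-1252.
pasted = b"\x93Hello\x94"
stored = cp1252_to_utf8(pasted)

# Python's "backslashreplace" policy writes an escape sequence instead of failing.
escaped = utf8_to_latin1(stored, errors="backslashreplace")
plain = utf8_to_latin1_ascii_quotes(stored)
```

With the default strict policy, `utf8_to_latin1(stored)` raises `UnicodeEncodeError`, because U+201C cannot be represented in ISO-8859-1. The escaping policy writes the code point out as literal text, which is comparable to seeing the literal string "\x{201c}" in output.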