Gossamer Forum

unicode questions

Hello again (haven't been here for quite a while ...),

here's a thing that keeps bothering me. I have databases where I need to use characters from the Unicode ranges "Latin Extended" and "Latin Extended Additional". If I enter the characters as numeric HTML entities, like &#7717; for ḥ, then they turn out fine with the charset in the HTML pages set to ISO-8859-1. Only, entering characters and especially searching is bothersome - I don't really want to memorize all these numbers, thank you.

So I thought I could just find a way to enter the characters straight into the web form (using a utility like Keyman on Windows, or just configuring my .Xmodmap on my Linux box), and switch the charset of the HTML pages to UTF-8.

Fine. Only, once I've used this setup for a while, things get messy: German umlauts are kind of "melted" together with the three or four letters after them, and in some cases I can only guess what was actually there in the database before. Is it possible that Perl (or DBMan) does something weird when adding or modifying records?
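
One plausible explanation, offered as a guess rather than a diagnosis: records saved while the pages were still ISO-8859-1 hold each umlaut as a single byte, and that byte is not valid UTF-8. A UTF-8 decoder treats it as the start of a multi-byte sequence, and a lenient one may swallow the letters that follow. The sketch below uses Python purely for illustration (DBMan itself is Perl), and the sample word is made up.

```python
# Illustration only: the same word stored under two different encodings.
word = "B\u00fccher"  # "Buecher" with u-umlaut; made-up sample value

latin1_bytes = word.encode("iso-8859-1")  # what an ISO-8859-1 page submits
utf8_bytes = word.encode("utf-8")         # what a UTF-8 page submits

# The umlaut is one byte in ISO-8859-1 but two bytes in UTF-8.
print(latin1_bytes)  # b'B\xfccher'
print(utf8_bytes)    # b'B\xc3\xbccher'


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Old ISO-8859-1 records are not valid UTF-8, so a UTF-8 page cannot
# display them correctly unless they are converted first.
print(is_valid_utf8(latin1_bytes))  # False
print(is_valid_utf8(utf8_bytes))    # True

# Converting an old record: decode with the encoding it was written in,
# then re-encode as UTF-8.
converted = latin1_bytes.decode("iso-8859-1").encode("utf-8")
print(converted == utf8_bytes)  # True
```

If this is what happened, a database file that was written partly under each charset would contain both byte forms side by side, which would fit the symptom of only some records being damaged.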

Add to that: Linux and Windows seem to have different ways of entering Unicode characters into web forms. As a result, when I enter something into the database on my Windows machine at work, and try to search for it at home on my Linux box, I can't find it.
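
One possible cause of the search mismatch, stated as an assumption since nothing above shows which bytes each system actually sends: one input method may produce a precomposed character while the other produces a base letter plus a combining mark. The two look identical on screen but compare as different strings. A minimal Python sketch (illustration only; `search_key` is a hypothetical helper, not part of DBMan):

```python
import unicodedata

precomposed = "\u1e25"  # one code point: LATIN SMALL LETTER H WITH DOT BELOW
decomposed = "h\u0323"  # 'h' followed by COMBINING DOT BELOW

# Same appearance on screen, different code point sequences.
print(precomposed == decomposed)  # False


def search_key(text: str) -> str:
    """Hypothetical helper: normalize to NFC before storing or comparing."""
    return unicodedata.normalize("NFC", text)


# After normalization both spellings compare equal.
print(search_key(precomposed) == search_key(decomposed))  # True
```

Normalizing both the stored value and the search term to the same form would make the comparison independent of how the character was typed.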

I guess it would be better to switch back to ISO-8859-1 as the encoding in the HTML pages, and find some way of entering special characters that leaves them as numeric HTML entities in the database and is operating-system-independent. Yet, I'd really like to understand what's going on here ....
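
The entity-based approach described above could be sketched like this. It is a minimal illustration in Python, not DBMan code; `to_entities` and `from_entities` are hypothetical names. Characters that fit in ISO-8859-1 (such as umlauts) are kept as they are, and everything else becomes a numeric entity.

```python
import html


def to_entities(text: str) -> str:
    """Replace characters outside ISO-8859-1 with numeric HTML entities."""
    # "xmlcharrefreplace" turns each unencodable character into &#NNNN;
    return text.encode("iso-8859-1", "xmlcharrefreplace").decode("iso-8859-1")


def from_entities(text: str) -> str:
    """Turn numeric (and named) HTML entities back into characters."""
    return html.unescape(text)


# Made-up sample: a word with an umlaut, plus h with dot below (U+1E25).
original = "B\u00fccher \u1e25"
stored = to_entities(original)

print(stored)  # the umlaut is kept, U+1E25 becomes &#7717;
print(from_entities(stored) == original)  # True
```

Applying the same conversion to search terms before comparing would let a typed ḥ match a stored `&#7717;`, whichever operating system it was typed on.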

Thanks!
kellner

Subject                            Author    Views   Date
unicode questions                  kellner   3452    Aug 13, 2003, 1:48 PM
Re: [kellner] unicode questions    joematt   3414    Aug 14, 2003, 1:19 PM
Re: [joematt] unicode questions    kellner   3404    Aug 14, 2003, 1:58 PM
Re: [kellner] unicode questions    joematt   3391    Aug 14, 2003, 2:07 PM
Re: [joematt] unicode questions    kellner   3398    Aug 14, 2003, 3:15 PM