Hello again (haven't been here for quite a while ...),
here's a thing that keeps bothering me. I have databases where I need to use characters from the Unicode ranges "Latin Extended" and "Latin Extended Additional". If I enter the characters as numeric HTML entities, like &#7717; for ḥ, they turn out fine with the charset of the HTML pages set to ISO-8859-1. The only problem: entering characters, and especially searching, is bothersome - I don't really want to memorize all these numbers, thank you.
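For what it's worth, I think the reason the entity approach works regardless of the page charset is that the entity itself is plain ASCII; a quick Python illustration (7717 is just the decimal code for ḥ, U+1E25):

```python
# Numeric entities are plain ASCII, so they survive no matter what
# charset the page declares. &#7717; is decimal for U+1E25:
print(chr(7717))                 # ḥ
print(hex(7717))                 # 0x1e25

# The entity string itself stores fine as ISO-8859-1 ...
print("&#7717;".encode("iso-8859-1"))             # b'&#7717;'
# ... while the raw character cannot be encoded as ISO-8859-1 at all:
print("\u1e25".encode("iso-8859-1", "replace"))   # b'?'
```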
So I thought I could just find a way to enter the characters straight into the web form (using a utility like Keyman on Windows, or just configuring my .Xmodmap on my Linux box), and switch the charset in the HTML pages to UTF-8.
Fine. Only: once I've used this setup for a while, things get messy: German umlauts are kind of "melted" together with the three or four letters after them, and in some cases I can only guess what was actually in the database before. Is it possible that Perl (or dbman) does something weird when adding or modifying records?
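My guess at the "melting": if the old records are still stored as Latin-1 bytes but the page now declares UTF-8, a single umlaut byte like 0xFC looks like the start of a multi-byte UTF-8 sequence and tries to swallow the letters after it. A quick Python sketch of both directions of the mix-up (hypothesis only - I haven't verified what dbman actually stores):

```python
# In Latin-1, "ü" is the single byte 0xFC, which modern UTF-8 rejects
# as a lead byte; a lenient decoder instead merges it with the bytes
# (letters) that follow - hence the "melted" umlauts.
latin1_bytes = "Küche".encode("latin-1")
print(latin1_bytes)              # b'K\xfcche'
try:
    latin1_bytes.decode("utf-8")
except UnicodeDecodeError as e:
    print("invalid UTF-8 at byte", e.start)   # invalid UTF-8 at byte 1

# The reverse mix-up (UTF-8 bytes displayed as Latin-1) gives the
# familiar two-character garbage instead:
print("Küche".encode("utf-8").decode("latin-1"))   # KÃ¼che
```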
Add to which: Linux and Windows seem to have different ways of entering Unicode characters into web forms. As a result, when I enter something into the database on my Windows machine at work and try to search for it at home on my Linux box, I can't find it.
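My suspicion about the Windows/Linux difference: the two input methods may produce different but canonically equivalent sequences - a precomposed character (NFC) on one side, base letter plus combining mark (NFD) on the other. They look identical on screen but compare unequal byte for byte, so the search fails. A small Python check (assuming that's what the input utilities do):

```python
import unicodedata

# "ḥ" can be stored precomposed (U+1E25) or decomposed as
# "h" + U+0323 COMBINING DOT BELOW - visually identical:
nfc = "\u1e25"
nfd = "h\u0323"
print(nfc == nfd)                                  # False
print(unicodedata.normalize("NFC", nfd) == nfc)    # True
```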
I guess it would be better to switch back to ISO-8859-1 as encoding in the HTML page, and find some way of entering special characters that leaves them as numerical HTML entities in the database and is operating-system-independent. Yet, I'd really like to understand what's going on here ....
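If I do go back to ISO-8859-1, one OS-independent option might be a server-side step that normalizes whatever the browser sends and rewrites anything non-ASCII as numeric entities before storing. A hypothetical sketch (to_entities is just a made-up helper name, not part of dbman):

```python
import unicodedata

def to_entities(text: str) -> str:
    # Fold NFC/NFD differences first, so both operating systems'
    # input ends up as the same stored string:
    text = unicodedata.normalize("NFC", text)
    # Then escape everything non-ASCII as a numeric entity
    # (this also covers the Latin-1 range, which is harmless):
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(to_entities("\u1e25 and h\u0323"))   # &#7717; and &#7717;
```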
Thanks!
kellner