Gossamer Forum

Problem with searching DB with content in foreign language.

Hi everyone,

I have DBman working perfectly except for one thing.
My database is in Russian, and I found that the search options (ma, cs, ww) do not work properly.
At the same time, sorting works correctly.

I suspect the problem is related to the encoding or character table rather than to DBman itself.

Does anybody know how to fix that?
I've searched the entire forum but could not find a similar topic.

Any idea or advice will be appreciated.

Matt_Z

[This message has been edited by Matt_Z (edited December 13, 1999).]
Re: Problem with searching DB with content in foreign language.
Try looking for Exact Word Match. This may help you.

Regards.

------------------
Eliot Lee
Anthro TECH,L.L.C
www.anthrotech.com
----------------------


Re: Problem with searching DB with content in foreign language.
Hi Eliot,

Sorry, "Exact Word Match" didn't help. Let me explain.
For example, I have several forms of one word in the default.db file (as it could be in English: working, work, Work). To avoid strange characters I will use the English example below, though the problem occurs with Russian text only.

Search with keyword=work&ww=on returns "no match found".
Search with keyword=work (cs=off&ww=off) returns only records with "work" and "working", but not "Work".
Search with keyword=Work (cs=off&ww=off) returns only records with "Work", but not "work" or "working".

At the same time, everything works perfectly with data in English.

What do you think of all that?

TIA
Matt_Z
Re: Problem with searching DB with content in foreign language.
Well, that suggestion didn't work. I was trying to help. Maybe someone else can help you.

Good luck!

Regards.

------------------
Eliot Lee
Anthro TECH,L.L.C
www.anthrotech.com
----------------------


Re: Problem with searching DB with content in foreign language.
I am sure you're correct in saying it has to do with the character table. In your examples it seems that Perl cannot tell which code is the lower-case form of an upper-case letter, nor which is the upper-case form of a lower-case letter. In the ASCII table it is just a matter of adding 32 (decimal) to the upper-case code to get the lower case, and subtracting 32 from the lower-case code to get the upper case. It would be interesting to see what the Cyrillic code table does.
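
The symptom can be reproduced outside DBman. Below is a minimal sketch in Python (not the Perl that DBman is written in), assuming the records are stored as Windows-1251 bytes. The function names and sample strings are made up for illustration. It shows that a case-insensitive match on raw bytes only knows the ASCII case pairs, while the same match on decoded text also knows the Cyrillic ones.

```python
import re


def search_bytes(keyword: str, record: str, encoding: str) -> bool:
    """Case-insensitive search on raw bytes, with no locale information.

    For bytes patterns, re.IGNORECASE only pairs up ASCII letters.
    """
    pattern = re.compile(re.escape(keyword.encode(encoding)), re.IGNORECASE)
    return pattern.search(record.encode(encoding)) is not None


def search_text(keyword: str, record: str) -> bool:
    """Case-insensitive search on decoded text, where case pairs are known."""
    return re.search(re.escape(keyword), record, re.IGNORECASE) is not None


english = "Work in progress"
russian = "Работа идет"  # hypothetical sample record

print(search_bytes("work", english, "cp1251"))    # True: ASCII case pair
print(search_bytes("работа", russian, "cp1251"))  # False: bytes 208 and 240 are not paired
print(search_text("работа", russian))             # True: text-level case folding
```

This matches the report earlier in the thread: the lower-case keyword finds only lower-case records, and the capitalised keyword finds only capitalised ones.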

------------------
JGU



[This message has been edited by jury (edited December 16, 1999).]
Re: Problem with searching DB with content in foreign language.
Hi juri,

The Cyrillic alphabet consists of 32 characters, and its layout in the 8-bit character table is 192-223 (uppercase) and 224-255 (lowercase).
However, I have no clue what to do with that. :(
How do I get it (Perl? DBman?) working?

Matt_Z
Re: Problem with searching DB with content in foreign language.
Well, the offset between upper and lower case is the same (32). But you are using the high ASCII table. This may be difficult if no one has done it before. See:
www.siber.com/sib/russify/
or
http://kulichki-win.rambler.ru/...a/bera/Comp/cyr.html
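
Since the offset is 32 in both ranges, the case folding can also be done by hand. A rough sketch in Python, assuming the 192-223 / 224-255 layout quoted above (which matches the Windows-1251 code page); `fold_cp1251` and `contains_folded` are hypothetical helper names, not part of DBman.

```python
def fold_cp1251(data: bytes) -> bytes:
    """Lower-case ASCII A-Z (65-90) and the 192-223 block by adding 32."""
    out = bytearray()
    for b in data:
        if 65 <= b <= 90 or 192 <= b <= 223:
            b += 32
        out.append(b)
    return bytes(out)


def contains_folded(keyword: bytes, record: bytes) -> bool:
    """Case-insensitive substring test on bytes folded with fold_cp1251."""
    return fold_cp1251(keyword) in fold_cp1251(record)


record = "Работа".encode("cp1251")
print(record[0], fold_cp1251(record)[0])  # 208 240: upper and lower case of the same letter
print(contains_folded("работа".encode("cp1251"), record))  # True
```

Letters that sit outside the two blocks (in Windows-1251 the letter Ё is one) are not handled by this offset rule and would need their own entries.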

------------------
JGU



[This message has been edited by jury (edited December 16, 1999).]
Re: Problem with searching DB with content in foreign language.
Hi Eliot,
Hi juri,

Thank you guys for your assistance. I have finally fixed the problem.
I had to apply the proper "locale" for the script so that UNIX understands the desired character table.

Thanks again. Everything is OK now.

Matt_Z
Re: Problem with searching DB with content in foreign language.
Hi Matt_Z,
I had a similar problem with Links 2.
I suppose changes need to be made to the search script only?
I cannot get that to work...
Can you help?

------------------
terryhot
http://lviv.gu.net
Re: Problem with searching DB with content in foreign language.
Hi taras,

You need to add a couple of lines of code to your db.cgi script in order to use the proper locale (assuming you are under UNIX).
What is your language, Russian or Ukrainian?

Regards
Matt_Z
Re: Problem with searching DB with content in foreign language.
I'm having problems using characters like á, é, ã...

If I search for "árabe", I get nothing; if I search for "rabe", it works and shows "árabe" on the screen.

Can anybody help me out on this one?
Re: Problem with searching DB with content in foreign language.
Hi fgoldin,

Sorry, I did not quite understand you.
Could you explain in more detail what the problem seems to be?

Matt_Z
Re: Problem with searching DB with content in foreign language.
I will try to be clearer. I live in Brazil, and we have characters that American English doesn't have. Besides the usual characters (a-z), we have á, é, ã, ó, õ...

The entries in the database are always correct, like the word "árabe", but most of the time users will search for "arabe" (it seems strange, but almost no one configures the keyboard correctly).

I would like DBman to treat the letters á, ã, à, â exactly like a, the letters ó, õ, ô, ò like o, and so on.

Is this possible?

If not, I would like to change the "locale" like you did, so that DBman understands the special characters correctly.

Right now, DBman won't search correctly if the special character is the first letter of the word, but it does search correctly if it is in the middle or at the end of the word... I have no idea why.
Thanks.
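
One possible explanation for the first-letter behaviour, offered as a guess: if the search wraps the keyword in regular-expression word boundaries (`\b`), and accented bytes are not counted as word characters without a locale, then no boundary is seen in front of a leading "á". The sketch below is in Python and assumes Latin-1 encoded data. Whether DBman builds its pattern this way is an assumption, and the function names are hypothetical.

```python
import re


def whole_word_bytes(keyword: str, record: str, encoding: str = "latin-1") -> bool:
    """Whole-word search on raw bytes; only ASCII letters count as word characters."""
    pattern = rb"\b" + re.escape(keyword.encode(encoding)) + rb"\b"
    return re.search(pattern, record.encode(encoding)) is not None


def whole_word_text(keyword: str, record: str) -> bool:
    """Whole-word search on decoded text; accented letters count as word characters."""
    return re.search(r"\b" + re.escape(keyword) + r"\b", record) is not None


record = "curso de árabe"  # hypothetical sample record

print(whole_word_bytes("árabe", record))  # False: no boundary between the space and "á"
print(whole_word_bytes("rabe", record))   # True: "á" itself acts as a boundary before "r"
print(whole_word_text("árabe", record))   # True once "á" is known to be a letter
```

Under the same assumption, an all-Cyrillic keyword with ww=on would never match, because none of its bytes count as word characters.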
Re: Problem with searching DB with content in foreign language.
To: fgoldin

Though I'm not an expert, I would suggest the following:
1. Set the locale. It will allow users with the proper language settings to search and sort correctly. At the same time, other users will see no difference.

2. If you want DBman to treat foreign characters as English ones, I think you need to change the script code and add the necessary substitutions for the desired characters. I am afraid I can't help you with that, sorry.
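
The substitution idea in point 2 can be sketched as follows. This is an illustration in Python, not DBman code: it assumes the text has already been decoded to Unicode, and `strip_accents` and `accent_insensitive_contains` are made-up names. Both the keyword and the record are reduced to unaccented lower case before they are compared.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose accented letters and drop the combining marks (á -> a, õ -> o)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_insensitive_contains(keyword: str, record: str) -> bool:
    """Substring test that ignores both accents and letter case."""
    return strip_accents(keyword).lower() in strip_accents(record).lower()


print(strip_accents("árabe"))                                  # arabe
print(accent_insensitive_contains("arabe", "Curso de Árabe"))  # True
print(accent_insensitive_contains("árabe", "curso de arabe"))  # True
```

The stored records stay untouched; only the comparison uses the folded forms, so results still display "árabe" with its accent.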

Not much assistance, I guess. But anyway...
Good luck.
Re: Problem with searching DB with content in foreign language.
How do I change the "locale"? I have no idea what that is...

thanks.
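
A locale tells the C library which bytes are letters and how upper and lower case pair up in a given character table. In Perl the usual tools are the `use locale` pragma and `setlocale` from the POSIX module; the exact lines added to db.cgi are not given in this thread. The sketch below shows the same idea in Python. The locale names are assumptions that depend on what is installed on the server, and the function names are hypothetical.

```python
import locale
import re


def set_ctype_locale(candidates):
    """Try each locale name in turn; return the one that worked, or None."""
    for name in candidates:
        try:
            locale.setlocale(locale.LC_CTYPE, name)
            return name
        except locale.Error:
            continue
    return None


def search_with_locale(keyword: bytes, record: bytes) -> bool:
    """Case-insensitive byte search that follows the current LC_CTYPE locale."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE | re.LOCALE)
    return pattern.search(record) is not None


# Assumed names for a Russian Windows-1251 locale; adjust to the server.
chosen = set_ctype_locale(["ru_RU.CP1251", "ru_RU.cp1251"])
if chosen is None:
    print("No Russian CP1251 locale is installed on this system")
else:
    print(search_with_locale("работа".encode("cp1251"), "Работа".encode("cp1251")))
```

On most UNIX systems `locale -a` lists the locale names that are actually available.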