Gossamer Forum

Searching in non-english without Case Sensitive

Hi all, and sorry for my English.
I am using the nice Non-English mod from Matthias Berndt, and it is running in Russian.
Everything works fine except searching: every search in a non-English language is case sensitive.
For example, I have the word Автобус (Bus) in the description of a link. When I type Автобус (Bus), search.cgi finds this word. But if I type автобус (bus), the search fails with no results.

Can anybody give me some advice? Thanks in advance.

Re: Searching in non-english without Case Sensitive
Have you tried deleting lc codes in your search.cgi file?

Regards,

Eliot Lee
Anthro TECH, L.L.C
Web: http://www.anthrotech.com/

Re: Searching in non-english without Case Sensitive
Thanks for the advice, but I could not find any lc code in my search.cgi.
Maybe that is because I am using the search.cgi that comes with the Non-English mod from Matthias Berndt.

Any comments?

Thanks in advance.

Re: Searching in non-english without Case Sensitive
Hello!
I think by lc Eliot meant "lowercase". If so, I don't think it will work, because as far as I know non-English characters are not handled as normal letters. They are treated more like symbols (I don't know the correct English word for it).
Anyway, what I'm trying to say is that Perl doesn't see any connection between Ä and ä, for example, whereas A and a are handled as the capital and small forms of the same letter "A".
If I'm not mistaken.

So, if you need a solution, you'll have to find a way of telling the script to see no difference between Ä and ä. With some kind of regular expression, I'd say.
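
A small sketch of the behaviour described here, assuming the text is stored as single bytes in ISO-8859-1 (so Ä is byte 0xC4 and ä is byte 0xE4) and that neither a locale nor Perl's Unicode string features are switched on. The helper name same_ignoring_case is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two byte strings after lowercasing both.
# Without "use locale" (and without Unicode rules), lc only changes A-Z.
sub same_ignoring_case {
    my ($left, $right) = @_;
    return lc($left) eq lc($right) ? 1 : 0;
}

# Plain ASCII letters fold as expected.
print "A vs a: ", same_ignoring_case("A", "a"), "\n";

# Assumed ISO-8859-1 bytes for A-umlaut and a-umlaut: lc leaves both
# untouched, so Perl sees two unrelated characters.
print "A-umlaut vs a-umlaut: ", same_ignoring_case("\xC4", "\xE4"), "\n";
```

The second comparison is the case that bites in search.cgi: the bytes differ and nothing tells Perl they are the same letter.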

Denis

Re: Searching in non-english without Case Sensitive
OK. Thanks for the answer.
After reviewing this thread http://207.105.53.169/...m3/HTML/003424.htmll
I added the following code below %in = &parse_form(); in search.cgi:

$in{'query'} =~ s/А/а/g;
$in{'query'} =~ s/В/в/g;
$in{'query'} =~ s/Т/т/g;
$in{'query'} =~ s/О/о/g;
$in{'query'} =~ s/Б/б/g;
$in{'query'} =~ s/У/у/g;

and so on, for all 33 characters we use in the Russian language (entered in the Russian encoding).
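
The per-letter substitutions above could probably be collapsed into one tr/// call. This is only a sketch, and it assumes the query arrives as Windows-1251 bytes (uppercase А-Я at 0xC0-0xDF, lowercase а-я at 0xE0-0xFF, Ё at 0xA8, ё at 0xB8); a KOI8-R setup would need a different table. The function name fold_cp1251 is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase a Windows-1251 byte string.
# The byte ranges below are an assumption about the encoding in use.
sub fold_cp1251 {
    my ($text) = @_;
    my $folded = lc $text;                        # handles ASCII A-Z
    $folded =~ tr/\xC0-\xDF\xA8/\xE0-\xFF\xB8/;   # handles all 33 Cyrillic capitals
    return $folded;
}

# Mixed-case Cyrillic bytes, written as escapes so this file stays ASCII.
my $query = "\xC0\xE2\xD2\xEE\xC1\xF3\xD1";
printf "%vX\n", fold_cp1251($query);   # prints the folded bytes in hex
```

In search.cgi this would be applied as $in{'query'} = fold_cp1251($in{'query'}); right after the parse_form call. Because every capital goes through the same table, no single letter can be left out by accident.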

Everything works well now. When I enter вТоБуС in the search field, the search returns both "втобус" and "ВТОБУС".

But I still have a problem with Cyrillic А and Cyrillic М: they are still case sensitive.
I have just started to learn Perl and do not have enough knowledge to solve this problem, so I need your help.

Thanks in advance


Re: Searching in non-english without Case Sensitive
Try using the Perl escape character \ before the accented character in the regular expression statements.

Regards,

Eliot Lee
Anthro TECH, L.L.C
Web: http://www.anthrotech.com/

Re: Searching in non-english without Case Sensitive
I have solved this problem and want to post the solution for other non-English Links administrators.
It works for me 100% in Russian, with case-insensitive searching.
All you need to do is set the locale in the script. Add the following to search.cgi just after #!/usr/bin/perl:

use locale;
use POSIX qw(setlocale LC_ALL);
setlocale(LC_ALL, "ru_SU.KOI8-R");

You need to change "ru_SU.KOI8-R" to the locale for your own language.

Nice non-english searching!

Re: Searching in non-english without Case Sensitive
Hi!
How do I know which locale is right for my language?
gregor

Re: Searching in non-english without Case Sensitive
Hello,

You have to look at your Unix documentation (if your server runs Unix) to find the locale name for your language on that server.

Normally, you replace ru_SU.KOI8-R with your language abbreviation. For example,
for British English it will be "en_GB", for German "de_DE", for US English "en_US", and so on.

In my case I used a more complicated locale name, because Russian has several encodings.
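
Locale names differ from server to server, so it may help to let Perl report which name the system actually accepts. The sketch below relies on POSIX::setlocale returning undef when a name is refused; the helper first_available_locale and the candidate names are only illustrative guesses, not a list of what any particular server provides.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper: try each candidate and return the first name the
# system accepts. As a side effect, that locale stays active afterwards.
sub first_available_locale {
    my @candidates = @_;
    for my $name (@candidates) {
        return $name if defined setlocale(LC_CTYPE, $name);
    }
    return;   # nothing was accepted
}

# Example candidates for German; "C" is the portable fallback that every
# system should provide (it gives no help with non-English letters).
my $chosen = first_available_locale("de_DE.ISO8859-1", "de_DE", "C");
print "Using locale: ", (defined $chosen ? $chosen : "none found"), "\n";
```

If the script only ever falls back to "C", the language-specific locale is probably not installed, and searching will stay case sensitive for non-English letters. Some systems accept names they do not really support, so it is still worth testing a search afterwards.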

Good luck