Non-English Mod for search.cgi

In Spanish we use special characters such as á, é, í, ó and ú, but we usually search without them. For example, someone looking for 'Drácula' will usually type 'dracula' in the search box.

The problem is that my titles and descriptions use the special characters, so when I search without them I get no results.

I would like Links 2.0 to convert both the titles and descriptions and the query term to non-special characters, so that a search returns results with or without stressed characters (or other special characters).

Thanks in advance,

Diego Ruiz
Re: Non-English Mod for search.cgi In reply to
Bobsie,

I tried it, but it did not work. :(

But I have an idea: have you noticed that searches are not case sensitive? I think somewhere in search.cgi there is an option that controls whether keywords are matched with or without capital letters. I think my mod should go there, don't you?

Thanks,

Diego


[This message has been edited by Diego (edited August 30, 1999).]
Re: Non-English Mod for search.cgi In reply to
Any ideas?

Thanks,

Diego
Re: Non-English Mod for search.cgi In reply to
I am not sure if this will work, but in sub main of search.cgi, change this code:

Code:
# Set search type -- either phrase or keyword. Also build keyword list to search on.
my @search_terms = ();
($in{'type'} eq 'phrase') ?
(@search_terms = ($in{'query'})) :
(@search_terms = split (/\s/, $in{'query'}));

to read:

Code:
# Set search type -- either phrase or keyword. Also build keyword list to search on.
my @search_terms = ();
# Replace each stressed vowel in the query with its plain equivalent.
# The =~ operator binds the substitution to $in{'query'} instead of $_.
$in{'query'} =~ s/\á/a/g;
$in{'query'} =~ s/\é/e/g;
$in{'query'} =~ s/\í/i/g;
$in{'query'} =~ s/\ó/o/g;
$in{'query'} =~ s/\ú/u/g;
($in{'type'} eq 'phrase') ?
(@search_terms = ($in{'query'})) :
(@search_terms = split (/\s/, $in{'query'}));

You may not need the "\" before each one, and there may be a way to combine those into one regexp. I am just not as up on regexps as I should be.
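
One possible way to combine the five substitutions is Perl's tr/// operator, which maps characters one-to-one in a single pass. This is only a sketch: the helper name strip_accents is made up for illustration and is not part of Links 2.0, and it assumes a reasonably modern Perl with the script saved as UTF-8 (hence "use utf8").

```perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical helper, not part of Links 2.0: replace the Spanish
# stressed vowels with their plain equivalents in one pass.
sub strip_accents {
    my ($text) = @_;
    # tr/// maps the n-th character of the first list to the
    # n-th character of the second list.
    $text =~ tr/áéíóúÁÉÍÓÚ/aeiouAEIOU/;
    return $text;
}
```

With this helper, strip_accents('Drácula') returns 'Dracula'. Other characters such as ñ or ü could be added to both lists in the same way.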

I hope this helps.

[This message has been edited by Bobsie (edited August 30, 1999).]
Re: Non-English Mod for search.cgi In reply to
Nice try, but no cigar. :)

All search.cgi does is compare the query in a case-insensitive manner, using an "i" switch on a match regexp; for example:

Code:
$tmp .= "m/\Q${$search_terms}[$_]\E/io

The "i" part of "io" on the end is what specifies the case-insensitive match, so there is no separate option that looks for capital letters.
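
To illustrate the point, here is a small sketch showing that the "i" switch folds case only and does not make a plain vowel match a stressed one. The helper name matches_ci is hypothetical and is not taken from search.cgi; the sketch assumes a modern Perl and UTF-8 source.

```perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical helper: case-insensitive literal match, in the same
# spirit as the m/\Q...\E/i test that search.cgi builds.
sub matches_ci {
    my ($text, $term) = @_;
    return ($text =~ /\Q$term\E/i) ? 1 : 0;
}

my $title = 'Drácula';
my $same_accents = matches_ci($title, 'drácula');   # differs only in case
my $no_accents   = matches_ci($title, 'dracula');   # differs in the accent
```

Here $same_accents is 1 and $no_accents is 0, which is the behaviour Diego describes.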

Did you try removing the backslashes (the "\" before each letter) from all the letters that have them in the mod I suggested? Same results?
Re: Non-English Mod for search.cgi In reply to
Bobsie,

Yes, I tried it, and still got no results.

But let's think about it: your mod removes the stressed characters from the query variable, while my titles and descriptions keep their special characters. As you can imagine, my search cannot match anything.

Instead I think I should:

1. Perform a regular search,
2. Remove the stressed characters from the title and description I obtain from the database,
3. Perform a new search.

My query term should keep its stressed characters, if any. Don't you think?



[This message has been edited by Diego (edited August 30, 1999).]
Re: Non-English Mod for search.cgi In reply to
Ack, I got them backwards! I got what you said reversed in my head. Sorry.

It should be:

Code:
# Replace each plain vowel in the query with its stressed form.
# The =~ operator binds the substitution to $in{'query'} instead of $_.
$in{'query'} =~ s/a/\á/g;
$in{'query'} =~ s/e/\é/g;
$in{'query'} =~ s/i/\í/g;
$in{'query'} =~ s/o/\ó/g;
$in{'query'} =~ s/u/\ú/g;

This assumes that every occurrence of a, e, i, o, and u should be replaced by the stressed character, which may not be the case.
Re: Non-English Mod for search.cgi In reply to
Bobsie

Good try.
But now, in my example, we are searching for 'drácúlá', which is not what I have in my title (I have 'drácula').

I know Bobsie should be out of town by now.
Can anybody else help me, please?

I think the solution would be to edit the information in my Title and Description fields.
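
One way around the 'drácúlá' problem, offered only as a sketch, is to turn each vowel of the query into a character class that accepts both the plain and the stressed form, so that 'dracula' becomes the pattern dr[aá]c[uúü]l[aá]. The names %ACCENT_CLASS and accent_tolerant_pattern are hypothetical and do not exist in search.cgi; the code assumes a modern Perl and UTF-8 source.

```perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical map: plain vowel => class matching it with or without stress.
my %ACCENT_CLASS = (
    a => '[aá]', e => '[eé]', i => '[ií]', o => '[oó]', u => '[uúü]',
);

# Hypothetical helper: build a regexp source string from a search term.
sub accent_tolerant_pattern {
    my ($term) = @_;
    my $pattern = '';
    for my $char (split //, $term) {
        # Reduce a stressed vowel to its plain form before the lookup.
        (my $base = $char) =~ tr/áéíóúüÁÉÍÓÚÜ/aeiouuAEIOUU/;
        $base = lc $base;
        $pattern .= exists $ACCENT_CLASS{$base}
            ? $ACCENT_CLASS{$base}
            : quotemeta($char);    # everything else is matched literally
    }
    return $pattern;
}

my $pattern = accent_tolerant_pattern('dracula');
my $found   = ('Drácula' =~ /$pattern/i) ? 1 : 0;
```

Because the classes accept either form, the same pattern is produced whether the visitor types 'dracula' or 'drácula', and neither the query nor the database has to be rewritten.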



[This message has been edited by Diego (edited August 31, 1999).]
Re: Non-English Mod for search.cgi In reply to
I found a temporary solution: I added the stressed words from my titles to my keywords field, but without the stresses.
The problem is that in these cases, when I do a search, the query term does not appear in bold.
And I had to leave the stressed words in the description field for a second stage.

I think there should be a better solution.
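
If the keywords workaround is used, the unstressed copies could be generated rather than typed by hand. The following is only a sketch with a made-up helper name, unaccented_keywords, and it assumes a modern Perl and UTF-8 source; it does not address the bolding issue.

```perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical helper: return the unstressed form of every word in a
# title that actually contains a stressed vowel.
sub unaccented_keywords {
    my ($title) = @_;
    my @keywords;
    for my $word (split ' ', $title) {
        (my $plain = $word) =~ tr/áéíóúÁÉÍÓÚ/aeiouAEIOU/;
        # Only keep words that changed; the others are already searchable.
        push @keywords, $plain if $plain ne $word;
    }
    return @keywords;
}
```

For the title 'El conde Drácula' this returns the single keyword 'Dracula'. Words are split on whitespace only, so punctuation stays attached to its word.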




[This message has been edited by Diego (edited August 31, 1999).]
Re: Non-English Mod for search.cgi In reply to
Jesus,

I am already using the Non-English mod. The problem is that this mod does not address this search problem.

Thanks anyway.


[This message has been edited by Diego (edited September 01, 1999).]
Re: Non-English Mod for search.cgi In reply to
Any new ideas on this?

Thanks.
Re: Non-English Mod for search.cgi In reply to
Hi Diego,

Did you try the non-English characters mod in the Resource Center?

With it you can probably do what you want.

Jesus

[This message has been edited by Jesus (edited September 01, 1999).]
Re: Non-English Mod for search.cgi In reply to
I can search with stressed characters using the non-English mod.

You probably missed something during the installation.

Good luck!
Re: Non-English Mod for search.cgi In reply to
Jesus,
Me too. You did not understand my problem. I can search for stressed words, but only if I type the stressed characters in the search box.
What I want is to be able to search without them. Following my first example, I would like to search for 'dracula' and find 'drácula'.

Regards,

Diego
Re: Non-English Mod for search.cgi In reply to
It seems nobody else has this problem. But if you are interested in this mod, please give me some ideas and I will try to write it.
Thanks,

Diego
Re: Non-English Mod for search.cgi In reply to
I thought Widgetz could give me a hint on this one. ;)


[This message has been edited by Diego (edited September 04, 1999).]
Re: Non-English Mod for search.cgi In reply to
Diego,

Did you find a solution to this problem? I have the same concern.

Thanks

------------------
Re: Non-English Mod for search.cgi In reply to
Miguel,
No, I have not been able to solve this yet. And I think not even Alex has an idea about it. See http://www.gossamer-threads.com/...um3/HTML/003424.html .
If you find out some way to solve this, please let me know.
Thanks.
Re: Non-English Mod for search.cgi In reply to
Hi !!!

Can someone in this world solve this bug ???

Eliot, Widgetz ???


Regards,

Marco Aurélio
Re: Non-English Mod for search.cgi In reply to
Hi everyone,
I've been monitoring this thread and others for some time now, hoping someone would come up with a reasonable solution to the problem. No luck :-)
So, this is the first time I'm posting something, in the hope that someone can solve our problem.

First a bit of language knowledge so we can all be tuned to the same frequency.

In Latin languages such as French, Spanish, and Portuguese, the only accented characters are vowels:
a à à e é, etc.

Apart from these, we have the " ñ " in Spanish and " ç " in Portuguese (I don't know if ç is used in Spanish).

So here is what I would try if I were a Perl guru:

Let's concentrate on the English word, CONDUCTION
conducción in ES (and FR if I'm not mistaken)
condução in PT

If someone entered this keyword in a search query, the search script could change all accented characters to normal characters with:

$in{'query'} =~ s/\á/a/g;
$in{'query'} =~ s/\é/e/g;
$in{'query'} =~ s/\í/i/g;
$in{'query'} =~ s/\ó/o/g;
$in{'query'} =~ s/\ú/u/g;
etc, etc.

Then it would do the same thing to the words the script reads from the searchable database fields, so the comparison would be based on normal ASCII characters even though the database fields contain accented words.

So searching for "conducción" or "conduccion" (without the stress) would match "conducción", "conduccion", "CONDUCCION", etc., because what the script would really be looking for is lowercase "conduccion", plain and simple.

The results page, however, should present the actual stressed words and phrases that exist in the database.
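
The idea above could be sketched roughly as follows: fold both the query and a copy of each field to plain lowercase characters, compare the folded copies, and return the original text untouched. The names fold and search_titles are made up for this illustration and are not part of Links 2.0, and a plain list of titles stands in for the database. The code assumes a modern Perl and UTF-8 source.

```perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical helper: reduce accented letters to plain ones, then lowercase.
sub fold {
    my ($text) = @_;
    $text =~ tr/áàâãéêíóôõúüñç/aaaaeeiooouunc/;
    $text =~ tr/ÁÀÂÃÉÊÍÓÔÕÚÜÑÇ/AAAAEEIOOOUUNC/;
    return lc $text;
}

# Hypothetical search: compare folded copies, return the original titles.
sub search_titles {
    my ($query, @titles) = @_;
    my $needle = fold($query);
    return grep { index(fold($_), $needle) >= 0 } @titles;
}

my @titles = ('Conducción térmica', 'Condução elétrica', 'Otra cosa');
my @hits   = search_titles('conduccion', @titles);
```

Here @hits contains only 'Conducción térmica', with its stress intact, because the folding is applied to temporary copies and never to the stored text.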

Any Perl guru around who is willing to follow this line of thought and give us a solution MOD?

Thanks

Jose
Re: Non-English Mod for search.cgi In reply to
I understand your frustration. However, you should really reply to the most relevant Topic rather than posting to multiple Topics.

Regards,

------------------
Eliot Lee....
Former Handle: Eliot
Anthro TECH, L.L.C
anthrotech.com
* Check Resource Center
* Search Forums
* Thinking out of the box (codes) is not only fun, but effective.