Gossamer Forum

autosuggest / correct in search results

Hi,

I just read the discussion here: http://www.gossamer-threads.com/.../?page=unread#unread and decided to share what I found some time ago.
A few lines of code can do impressive things:
http://norvig.com/spell-correct.html
I took the Python code from that page and created a source-text file for it (I am using the Internal Index).
correct.py:
Code:
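#! ../Python-2_6_4-src/python
# -*- coding: utf-8 -*-
import re, collections, sys

# Split the source text into lowercase, whitespace-separated words.
def words(text): return re.findall(r'[^\s]+', text.lower())

# Count how often each word occurs (unseen words default to 1).
def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

NWORDS = train(words(open('path/to/text.txt').read()))

alphabet = 'abcdefghijklmnopqrstuvwxyz'

# All strings one edit away from word: delete, transpose, replace, insert.
def edits1(word):
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

# Known words two edits away from word.
def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

def known(words): return set(w for w in words if w in NWORDS)

# Prefer the word itself, then known words at distance 1, then distance 2.
def correct(word):
    candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
    return max(candidates, key=NWORDS.get)

# Print the correction for the word passed on the command line;
# the Perl routine further down reads this output.
if __name__ == '__main__' and len(sys.argv) > 1:
    print(correct(sys.argv[1]))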
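
To get a feel for what the corrector returns, here is a minimal sketch that runs the same idea against a tiny made-up word list instead of text.txt. The sample words are purely hypothetical, and the distance-2 step is left out to keep it short, so treat it as an illustration rather than a replacement for correct.py.

```python
import re
from collections import Counter

# Hypothetical stand-in for the contents of text.txt (made-up words, not real data).
SAMPLE_TEXT = "search links links category plugin plugin plugin template"

def words(text):
    # Lowercase, whitespace-separated tokens, as in correct.py.
    return re.findall(r"[^\s]+", text.lower())

NWORDS = Counter(words(SAMPLE_TEXT))
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    # Every string one delete, transpose, replace or insert away from word.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits for c in ALPHABET if b]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def known(candidates):
    return set(w for w in candidates if w in NWORDS)

def correct(word):
    # Distance 0 first, then distance 1; unknown words come back unchanged.
    candidates = known([word]) or known(edits1(word)) or [word]
    return max(candidates, key=lambda w: NWORDS[w])

print(correct("serch"))   # search (one missing letter)
print(correct("plugni"))  # plugin (two letters swapped)
print(correct("xyzzy"))   # xyzzy (nothing similar is known)
```

A word that is already in the list is returned as it is, which is why the Perl routine below only shows a suggestion when at least one word changed.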

create_textfile.cgi:
Code:
#!/usr/bin/perl -w
use lib 'path/to/cgi-bin/admin';
use Links qw/$DB/;
Links::init('path/to/cgi-bin/admin');
use GT::CGI;
# Open the source-text file for writing; without '>' it would be opened read-only.
open (CORRECT, '>', 'path/to/text.txt') || die "File could not be opened: $!";
# Write every word from the category index, separated by spaces.
my $search_db = $DB->table('Category_Word_List');
$search_db->select_options ();
my $sth = $search_db->select (['Word']);
while (my $row = $sth->fetchrow_hashref) { print CORRECT $row->{'Word'} . " "; }
# Then do the same for the link index.
$search_db = $DB->table('Links_Word_List');
$search_db->select_options ();
$sth = $search_db->select (['Word']);
while (my $row = $sth->fetchrow_hashref) { print CORRECT $row->{'Word'} . " "; }
close CORRECT;
exit;
Then I added a subroutine called correct, which runs the Python script once for each word of the query:
Code:
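sub {
    my $query = shift;
    my @query = split(" ", $query);
    my @newquery;
    foreach my $word (@query) {
        # spellcheck.py is presumably the Python script shown above as correct.py.
        # The list form of open() runs it without a shell, so quotes or other
        # special characters in the query cannot break or inject a command.
        open(my $py, '-|', 'path/to/python', 'path/to/spellcheck.py', $word) or return;
        my $corrected = <$py>;
        close $py;
        # Fall back to the original word if the script printed nothing.
        $corrected = $word unless defined $corrected;
        $corrected =~ s/\s+$//;
        push(@newquery, $corrected);
    }
    # Only show a suggestion if at least one word was changed.
    if (join("+", @query) ne join("+", @newquery)) {
        my $return = qq~<br />Did you mean: <a href="/cgi-bin/search.cgi?query=~;
        $return .= join("+", map { GT::CGI::html_escape($_) } @newquery);
        $return .= qq~&bool=or"><span style="font-weight:bold; color:blue">~;
        $return .= join(" ", map { GT::CGI::html_escape($_) } @newquery);
        $return .= qq~</span></a>~;
        return $return;
    }
    return;
}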
Where and how you call the routine is up to you. I just did this:
Code:
<%if query%><%correct($query)%><%endif%>

I did not spend much time on this feature, but I think there is a lot more that could be done with it. I am not familiar with Python, so I do not know how to add special characters such as á or ä.
(Just adding them did not work as expected; maybe I have an encoding problem as well.)
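
One likely cause of the trouble with á or ä, offered as a guess rather than a diagnosis: under Python 2 the file contents and the command-line argument arrive as raw bytes, and in UTF-8 an accented letter takes two bytes, so the edit functions cut it in half. The sketch below assumes the word file is UTF-8 and uses a few example letters; read_corpus and the extended ALPHABET are illustrative names, not part of the original script.

```python
# -*- coding: utf-8 -*-
import io

# Assumption: these extra letters are only examples; list the ones your index uses.
ALPHABET = u"abcdefghijklmnopqrstuvwxyz" + u"áäéöüß"

def read_corpus(path, encoding="utf-8"):
    # Decode the word file explicitly so every letter is a single character.
    with io.open(path, encoding=encoding) as handle:
        return handle.read().lower()

def replaces(word):
    # Same idea as the "replaces" list in edits1, using the wider alphabet.
    return set(word[:i] + c + word[i + 1:]
               for i in range(len(word)) for c in ALPHABET)

RAW = u"käse".encode("utf-8")   # bytes, as a file or argv might deliver them
TEXT = RAW.decode("utf-8")      # decoded text

print(len(RAW), len(TEXT))           # 5 bytes versus 4 characters
print(u"käse" in replaces(u"kase"))  # True
```

Under Python 2 the word taken from sys.argv would need the same decode step before it goes into correct(), and the result would need encoding again before printing. Under Python 3 strings are already Unicode, so extending the alphabet is probably most of the work.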
Also, switching the search from AND to OR is a quick and dirty way to keep the suggested search from failing when two or more of the corrected words exist in the database but do not all occur in the same record.
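
The AND versus OR point can be shown with a toy index. This is not how the Gossamer Links search works internally; INDEX, search_and and search_or are made-up names and data for illustration only.

```python
# Hypothetical mini index: word -> IDs of the links that contain it (made-up data).
INDEX = {
    "perl": {1, 2},
    "plugin": {3},
    "search": {2, 3},
}

def search_and(terms):
    # A link must contain every term.
    sets = [INDEX.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()

def search_or(terms):
    # A link only needs to contain one of the terms.
    return set().union(*[INDEX.get(term, set()) for term in terms])

corrected = ["perl", "plugin"]   # both words exist in the index after correction
print(search_and(corrected))     # set(): no single link contains both words
print(search_or(corrected))      # {1, 2, 3}
```

Each corrected word is known to the index on its own, yet the AND search comes back empty, which would make the "Did you mean" link look broken.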
I hope to see some discussion here. Maybe there is a real plugin to be done.

Regards

Niko
Re: [el noe] autosuggest / correct in search results
Hi,

Interesting post - will try and have a look at that at some point. Bit too busy atm :(

Cheers

Andy (mod)
andy@ultranerds.co.uk

