Mailing List Archive: kinosearch: discuss

Re: newbie: Indexing and searching text not

 

 


barborak at basikgroup

Aug 25, 2008, 12:56 PM


Hi,

There is a utility that comes with the KinoSearch distribution called
dump_index. Running that shows these terms associated with the body field:

Terms:
body:a
Doc 0 (1 occurrences)
body:bodi
Doc 0 (1 occurrences)
body:here
Doc 0 (1 occurrences)
body:is
Doc 0 (1 occurrences)
body:short
Doc 0 (1 occurrences)
body:this
Doc 0 (1 occurrences)
body:veri
Doc 0 (1 occurrences)

So you can see that the PolyAnalyzer stemmed "very" to "veri" (and "body"
to "bodi"). Your TermQuery looks for the literal term "very", which is not
in the index. To get your example to work, either search for "veri" or run
the word "very" through the PolyAnalyzer first.
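
As a rough sketch of both options (untested here): it assumes the index
built by the script quoted below already exists at /tmp/invindex, and that
QueryParser accepts a `fields` argument as I remember it from the 0.1x
documentation. Using QueryParser to do the analysis is my assumption about
the easiest way to "run the word through the PolyAnalyzer"; the variable
names are only for illustration.

```perl
use strict;
use warnings;
use KinoSearch::Searcher;
use KinoSearch::Analysis::PolyAnalyzer;
use KinoSearch::Index::Term;
use KinoSearch::Search::TermQuery;
use KinoSearch::QueryParser::QueryParser;

# Use the same analyzer settings at search time as at index time.
my $analyzer = KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' );
my $searcher = KinoSearch::Searcher->new(
    invindex => '/tmp/invindex',
    analyzer => $analyzer,
);

# Option 1: a TermQuery bypasses the analyzer, so ask for the stemmed
# term exactly as dump_index shows it ("veri", not "very").
my $term       = KinoSearch::Index::Term->new( 'body', 'veri' );
my $term_query = KinoSearch::Search::TermQuery->new( term => $term );
my $hits       = $searcher->search( query => $term_query );
while ( my $hit = $hits->fetch_hit_hashref ) {
    print "TermQuery hit in body: $hit->{body}\n";
}

# Option 2: let QueryParser pass the raw input through the analyzer,
# so the word "very" is stemmed before the query is built.
# ASSUMPTION: the `fields` parameter is available in your version.
my $query_parser = KinoSearch::QueryParser::QueryParser->new(
    analyzer => $analyzer,
    fields   => ['body'],
);
my $parsed_query = $query_parser->parse('very');
$hits = $searcher->search( query => $parsed_query );
while ( my $hit = $hits->fetch_hit_hashref ) {
    print "QueryParser hit in body: $hit->{body}\n";
}
```

Option 2 is probably the one to prefer for user-supplied input, since
nobody has to know what the stemmer will do to a given word.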

Best,
Mike



On Mon, Aug 25, 2008 at 2:58 PM, <kinosearch-request [at] rectangular> wrote:

> Date: Mon, 25 Aug 2008 11:40:10 +0530
> From: ram <ram [at] netcore>
> Subject: Re: [KinoSearch] newbie: Indexing and searching text not
> working
> To: KinoSearch discussion forum <kinosearch [at] rectangular>
> Message-ID: <1219644610.22357.61.camel [at] darkstar>
> Content-Type: text/plain
>
>
> On Sat, 2008-08-23 at 15:22 -0400, Mike Barborak wrote:
> > Hi,
> >
> > After creating your index with PolyAnalyzer, your body field will have
> > the terms "short" and "body" but not "short body." Take a look at
> > KinoSearch::QueryParser::QueryParser as it will likely do what you
> > want.
>
> I think my installation has some issue. I can't search on a single
> word either.
>
>
>
> ---------------------------------------
> use KinoSearch::InvIndexer;
> use KinoSearch::Analysis::PolyAnalyzer;
> use KinoSearch::Searcher;
> use strict;
> #
> # Start on a clean slate
> #
> system("rm -rf /tmp/invindex/*");
> my $analyzer = KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' );
> @gl::headers = qw(from to cc subject body date reply-to message-id
>     in-reply-to filename);
> my $invindexer = KinoSearch::InvIndexer->new(
>     invindex => '/tmp/invindex',
>     create   => 1,
>     analyzer => $analyzer,
> );
> foreach (@gl::headers) {
>     $invindexer->spec_field( name => $_, indexed => 1 );
> }
> my $doc = $invindexer->new_doc;
> my %mail = (
>     'date'       => 'Mon, 07 Jan 2008 14:04:35 +0530',
>     'to'         => 'myteam [at] example',
>     'subject'    => 'subject test here ',
>     'body'       => 'This is a very short body here ',
>     'cc'         => 'ram [at] example',
>     'from'       => 'sagar [at] example',
>     'message-id' => '<1199694875.14998.392.camel [at] sagar>',
>     'filename'   => '/abc/def'
> );
> foreach (keys %mail) {
>     next unless($mail{$_});
>     $doc->set_value( $_ => $mail{$_} );
> }
> $invindexer->add_doc($doc);
> $invindexer->finish;
>
>
> $analyzer = KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' );
> my $searcher = KinoSearch::Searcher->new(
>     invindex => '/tmp/invindex',
>     analyzer => $analyzer,
> );
> #
> # Search on body
> #
> my $term = KinoSearch::Index::Term->new("body","very");
> my $term_query = KinoSearch::Search::TermQuery->new(term => $term);
> my $hits = $searcher->search( query => $term_query );
> while ( my $hit = $hits->fetch_hit_hashref ){
>     print "Found HIT in body" . $hit->{body}."\n";
> }
>
> -----------------------------------------------------------------
>
> I am using Fedora 8 and Perl 5.10, with the latest KinoSearch installed
> via CPAN.
>
