
Mailing List Archive: Lucene: General

How NOT to ignore words normally considered extraneous?

 

 



jturnbul at uow

Jul 24, 2011, 10:15 PM

Post #1 of 2 (338 views)
How NOT to ignore words normally considered extraneous?

It seems that Lucene is (by default at least) ignoring and therefore not
indexing words which may usually be considered extraneous such as "not",
"who", "the" etc. For our usage we really need all words to be indexed and
to be searchable. Is it possible to configure the indexing process somehow
so that this can be achieved?

Thanks,

-sbs



hossman_lucene at fucit

Jul 25, 2011, 7:55 PM

Post #2 of 2 (320 views)
Re: How NOT to ignore words normally considered extraneous?

: It seems that Lucene is (by default at least) ignoring and therefore not
: indexing words which may usually be considered extraneous such as "not",
: "who", "the" etc. For our usage we really need all words to be indexed and
: to be searchable. Is it possible to configure the indexing process somehow
: so that this can be achieved?

You'll need to be specific about how you are using Lucene. Most likely
you are using something that applies the "StopFilter" along with a set of
common English words (possibly the default set in the StopFilter
class).

If you change the Analyzer you use to one that doesn't use
StopFilter, or change the word set used by StopFilter, you can change this
behavior.
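
To illustrate the idea only: the following is a toy model in plain Java, not Lucene's StopFilter or any Lucene API. The class name StopWordSketch, the method tokenize, and the three-word SAMPLE_STOP_WORDS set are all made up for this sketch. It shows why words like "not", "who" and "the" vanish when a stop set is applied, and why handing the same pipeline an empty set keeps every word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model only: this is NOT Lucene's StopFilter, just the same idea in plain Java.
class StopWordSketch {

    // Hypothetical stand-in for a default English stop set (assumption, not Lucene's list).
    static final Set<String> SAMPLE_STOP_WORDS = Set.of("not", "who", "the");

    // Lowercase the text, split on anything that is not a letter or digit,
    // and drop every token that appears in stopWords.
    static List<String> tokenize(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty() && !stopWords.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Who is not the owner?";
        // With the sample stop set, "who", "not" and "the" are dropped.
        System.out.println(tokenize(text, SAMPLE_STOP_WORDS)); // [is, owner]
        // With an empty stop set, every word survives and would be indexed.
        System.out.println(tokenize(text, Set.of()));          // [who, is, not, the, owner]
    }
}
```

In Lucene itself the equivalent change is made where the Analyzer is constructed. Many of the bundled analyzers accept a stop word set, but the exact constructors vary by version, so check the javadocs for the release you are running. Whatever you change has to be applied both at index time and at query time, and existing documents need to be reindexed.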

How you do that depends on how you are using Lucene.

If you are using the Lucene Java library, please send follow-up questions
to the java-user [at] lucen mailing list. If you are using Solr, please use
the solr-user [at] lucen mailing list, etc...

http://lucene.apache.org/mail.html

-Hoss
