
Mailing List Archive: Lucene: Java-Dev

handling token created/deleted events in an Index

 

 



mathieu at garambrogne

Jun 16, 2008, 7:55 AM


With LUCENE-1297, the SpellChecker will be able to choose how to
estimate the distance between two words.

Here are some other possible enhancements:
* The ability to synchronize the main Index and the SpellChecker
Index. Handling token creation is easy: a simple TokenFilter can do
the work. Token deletion is a bit harder. Lazy deletion can be used
if token popularity is checked in the main Index on each lookup.
That is a pull strategy; a push from the Directory should be lighter.
* Choosing the similarity strategy. Currently it is only an n-gram
computation. Homophony could be nice, for example.
* The spell Index can be used for dynamic similarity without disturbing
the main Index. For example, Snowball is nice for grouping words by
their roots, but it disturbs the Index if you want to run a
"starts with" (prefix) query.
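
To make the pull/push distinction in the first point concrete, here is
a rough sketch in plain Java. It does not use the Lucene API at all:
TokenListener, MainIndexModel and SpellIndexModel are hypothetical
names, and the "indexes" are only in-memory maps. It assumes a token
counts as deleted once its document frequency in the main index drops
to zero.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical callback interface; nothing like this exists in Lucene. */
interface TokenListener {
    void tokenCreated(String token);
    void tokenDeleted(String token);
}

/** Toy stand-in for the main index: tracks document frequency per token. */
class MainIndexModel {
    private final Map<String, Integer> docFreq = new HashMap<>();
    private final Map<Integer, Set<String>> docs = new HashMap<>();
    private TokenListener listener; // null means nobody is notified (pull only)

    void setListener(TokenListener l) { listener = l; }

    /** Number of live documents containing the token. */
    int docFreq(String token) { return docFreq.getOrDefault(token, 0); }

    void addDocument(int id, String... tokens) {
        deleteDocument(id); // replacing a document removes its old tokens first
        Set<String> unique = new HashSet<>(Arrays.asList(tokens));
        docs.put(id, unique);
        for (String t : unique) {
            // Push: notify only when the token appears for the first time.
            if (docFreq.merge(t, 1, Integer::sum) == 1 && listener != null) {
                listener.tokenCreated(t);
            }
        }
    }

    void deleteDocument(int id) {
        Set<String> removed = docs.remove(id);
        if (removed == null) return;
        for (String t : removed) {
            // Push: notify only when the last document using the token is gone.
            if (docFreq.merge(t, -1, Integer::sum) == 0) {
                docFreq.remove(t);
                if (listener != null) listener.tokenDeleted(t);
            }
        }
    }
}

/** Toy stand-in for the spell index: just a set of known words. */
class SpellIndexModel implements TokenListener {
    private final Set<String> words = new HashSet<>();

    public void tokenCreated(String token) { words.add(token); }
    public void tokenDeleted(String token) { words.remove(token); }

    boolean contains(String word) { return words.contains(word); }

    /** Pull: ask the main index on every lookup and lazily drop dead words. */
    boolean containsChecked(String word, MainIndexModel main) {
        if (words.contains(word) && main.docFreq(word) == 0) {
            words.remove(word);
        }
        return words.contains(word);
    }
}
```

With the listener registered, the spell index pays nothing at lookup
time; with containsChecked, every lookup costs an extra round trip to
the main index, which is why a push seems lighter.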

Some time ago I submitted a patch, LUCENE-1190, but I guess it is too
monolithic. A more modular approach would be better.

Any comments or suggestions?

M.
