
Mailing List Archive: Lucene: Java-Dev

spellcheck - additional setter methods

 

 



kayomailer-lucene at yahoo

Mar 27, 2007, 2:38 AM

Post #1 of 3 (2292 views)
spellcheck - additional setter methods

Hi spellcheck developers,

Users do not always have the administrative rights to change their
ulimit settings (on *nix systems), so it would be useful to add two
new parameters, indexMergeFactor and indexMaxBufferedDocs, together
with their respective setter methods, to

org.apache.lucene.search.spell.SpellChecker

This matters all the more because, when the ulimit settings don't
suffice, the spellchecker simply finishes with an invalid index and
gives no message or warning.
Below is the code I would suggest.

Thanks,
karin



Code changes:

First, the additional attributes and setter methods:

    private int indexMergeFactor = 300;
    private int indexMaxBufferedDocs = 150;

    /**
     * Set the merge factor for indexing; default 300
     */
    public void setIndexMergeFactor(int mf) {
        indexMergeFactor = mf;
    }

    /**
     * Set max no. of docs buffered for indexing; default 150
     */
    public void setIndexMaxBufferedDocs(int mbd) {
        indexMaxBufferedDocs = mbd;
    }

Then, in public void indexDictionary(Dictionary dict) throws IOException,
replace the hard-coded values

    writer.setMergeFactor(300);
    writer.setMaxBufferedDocs(150);

with:

    writer.setMergeFactor(indexMergeFactor);
    writer.setMaxBufferedDocs(indexMaxBufferedDocs);
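
For illustration only, here is how a caller on a system with a low open-file limit might use the proposed setters. This is a self-contained sketch, not Lucene code: StubWriter and SpellCheckerSketch are hypothetical stand-ins that mirror just the two IndexWriter setters and the two proposed SpellChecker setters, and the values 10 and 50 are arbitrary examples rather than recommendations.

```java
// Hypothetical stand-in for IndexWriter: it only records the two settings.
class StubWriter {
    int mergeFactor;
    int maxBufferedDocs;

    void setMergeFactor(int mf) { mergeFactor = mf; }
    void setMaxBufferedDocs(int mbd) { maxBufferedDocs = mbd; }
}

// Hypothetical stand-in for SpellChecker carrying the proposed setters.
class SpellCheckerSketch {
    private int indexMergeFactor = 300;      // value currently hard coded
    private int indexMaxBufferedDocs = 150;  // value currently hard coded

    public void setIndexMergeFactor(int mf) { indexMergeFactor = mf; }
    public void setIndexMaxBufferedDocs(int mbd) { indexMaxBufferedDocs = mbd; }

    // Mirrors only the part of indexDictionary() that configures the writer.
    StubWriter configureWriter() {
        StubWriter writer = new StubWriter();
        writer.setMergeFactor(indexMergeFactor);
        writer.setMaxBufferedDocs(indexMaxBufferedDocs);
        return writer;
    }
}

class SetterDemo {
    public static void main(String[] args) {
        SpellCheckerSketch spellChecker = new SpellCheckerSketch();
        spellChecker.setIndexMergeFactor(10);      // arbitrary lower value
        spellChecker.setIndexMaxBufferedDocs(50);  // arbitrary lower value
        StubWriter writer = spellChecker.configureWriter();
        System.out.println(writer.mergeFactor + " " + writer.maxBufferedDocs);
    }
}
```

Callers that never touch the setters would keep the current behaviour, since the fields default to the existing values of 300 and 150.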













otis_gospodnetic at yahoo

Mar 27, 2007, 12:31 PM

Post #2 of 3 (2152 views)
Re: spellcheck - additional setter methods

Karin,

Could you please add this to Lucene's JIRA as an Enhancement?
I can just add the setters now, but other Lucene developers are making changes to how mergeFactor and maxBufferedDocs work. Those two parameters may also get deprecated in the process, so it would be better to wait for those changes to happen first.

Otis
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/ - Tag - Search - Share




---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


hossman_lucene at fucit

Mar 27, 2007, 6:34 PM

Post #3 of 3 (2149 views)
Re: spellcheck - additional setter methods

I would suggest that instead of adding a lot of methods to SpellChecker
that mimic IndexWriter methods, we make SpellChecker easier to
subclass by making "spellIndex" protected and refactoring the "new
IndexWriter" code sprinkled throughout the class into a new method that
subclasses can override to set whatever settings they want...

protected IndexWriter openWriter() throws IOException { ... }

...that way, whatever changes are made to IndexWriter in the long run,
SpellChecker doesn't need to be changed for basic functionality.
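
A rough sketch of that idea, written with stand-in classes so that it compiles on its own. FakeWriter, BaseSpellChecker and LowLimitSpellChecker are hypothetical names; in the real class the hook would return an IndexWriter opened on spellIndex, and the subclass values shown here are arbitrary.

```java
import java.io.IOException;

// Hypothetical stand-in for IndexWriter: it only records the two settings.
class FakeWriter {
    int mergeFactor;
    int maxBufferedDocs;

    void setMergeFactor(int mf) { mergeFactor = mf; }
    void setMaxBufferedDocs(int mbd) { maxBufferedDocs = mbd; }
}

// Stand-in for SpellChecker with writer creation pulled into one hook.
class BaseSpellChecker {
    // The single place where a writer is created and configured.
    protected FakeWriter openWriter() throws IOException {
        FakeWriter writer = new FakeWriter();
        writer.setMergeFactor(300);      // values currently hard coded
        writer.setMaxBufferedDocs(150);
        return writer;
    }

    // Stand-in for indexDictionary(): calls the hook instead of "new IndexWriter".
    public FakeWriter indexDictionary() throws IOException {
        FakeWriter writer = openWriter();
        // ... the dictionary words would be added to the index here ...
        return writer;
    }
}

// A subclass overrides the hook to apply whatever settings it wants.
class LowLimitSpellChecker extends BaseSpellChecker {
    @Override
    protected FakeWriter openWriter() throws IOException {
        FakeWriter writer = super.openWriter();
        writer.setMergeFactor(10);       // arbitrary example value
        writer.setMaxBufferedDocs(50);   // arbitrary example value
        return writer;
    }
}

class HookDemo {
    public static void main(String[] args) throws IOException {
        FakeWriter writer = new LowLimitSpellChecker().indexDictionary();
        System.out.println(writer.mergeFactor + " " + writer.maxBufferedDocs);
    }
}
```

With this shape the base class never needs to know which writer settings exist; only subclasses that care about a setting depend on it.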

: I can just add the setters now, but other Lucene developers are making
: changes to how mergeFactor and maxBufferedDocs work. Those two
: parameters may also get deprecated in the process, so it would be better
: to wait for those changes to happen first.


-Hoss

