
Mailing List Archive: Lucene: Java-Dev

Re: indexreader refresh

 

 



cutting at apache

Jan 4, 2006, 10:30 AM

Post #1 of 3 (1961 views)
Re: indexreader refresh

Amol Bhutada wrote:
> If I have a reader and searcher on an indexdata folder and another
> IndexWriter writing documents to the same indexdata folder, do I need to
> close the existing reader and searcher and create new ones so that newly
> indexed data shows up in searches?

[ moved from user to dev list]

This is a frequent request. While opening an all-new IndexReader is
effective, it is not always efficient. It might be nice to support a
more efficient means of re-opening an index.

Perhaps we should add a few new IndexReader methods, as follows:

/** If <code>reader</code>'s index has not been changed, return
 * <code>reader</code>, otherwise return a new {@link IndexReader}
 * reading the latest version of the index.
 */
public static IndexReader open(IndexReader reader) throws IOException {
  if (reader.isCurrent()) {
    // unchanged: return existing
    return reader;
  }

  // try to incrementally create new reader
  IndexReader result = reader.reopen(reader);
  if (result != null) {
    return result;
  }

  // punt, opening an entirely new reader
  return IndexReader.open(reader.directory());
}

/** Return a new IndexReader reading the current state
 * of the index, re-using reader's resources, or null if this
 * is not possible.
 */
protected IndexReader reopen(IndexReader reader) throws IOException {
  return null;
}
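
To make the three-step fallback above concrete, here is a minimal self-contained sketch. It does not use Lucene at all: ToyIndex, ToyReader and the version counter are hypothetical stand-ins invented for illustration, and the real isCurrent() check may work differently.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for an index whose version bumps on every commit. */
class ToyIndex {
    private final AtomicLong version = new AtomicLong(0);
    long version() { return version.get(); }
    void commit() { version.incrementAndGet(); }
}

/** Hypothetical reader that remembers the index version it was opened on. */
class ToyReader {
    final ToyIndex index;
    final long openedVersion;

    ToyReader(ToyIndex index) {
        this.index = index;
        this.openedVersion = index.version();
    }

    /** True while no commit has happened since this reader was opened. */
    boolean isCurrent() { return openedVersion == index.version(); }

    /** Default: no incremental reopen available; subclasses may override. */
    protected ToyReader reopen() { return null; }

    /** Same instance if unchanged, else incremental reopen, else a full open. */
    static ToyReader open(ToyReader reader) {
        if (reader.isCurrent()) {
            return reader;
        }
        ToyReader result = reader.reopen();
        return result != null ? result : new ToyReader(reader.index);
    }
}
```

A caller would hold on to the returned reader and compare it with the one passed in; a different instance signals that searchers built on the old reader should be rebuilt.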

Then we can add implementations of reopen to SegmentReader and
MultiReader that attempt to re-use the existing, already opened
segments. This should mostly be simple, but there are a few tricky
issues, like detecting whether an already-open segment has had
deletions, and deciding when to close obsolete segments.
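
One way to picture the two tricky issues is the sketch below. It is only an illustration under stated assumptions: SegInfo, OpenSeg, SegmentReuse and the delGen counter are hypothetical names, not Lucene classes, and the sketch assumes the old reader is being discarded so that obsolete segments can be closed right away.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical description of one segment as listed in the index. */
class SegInfo {
    final String name;
    final long delGen; // assumed counter that grows whenever deletions are written
    SegInfo(String name, long delGen) { this.name = name; this.delGen = delGen; }
}

/** Hypothetical already-open segment. */
class OpenSeg {
    final SegInfo info;
    boolean closed = false;
    OpenSeg(SegInfo info) { this.info = info; }
    void close() { closed = true; }
}

class SegmentReuse {
    /** Builds the new segment list, sharing unchanged segments and closing the rest. */
    static List<OpenSeg> reopen(List<OpenSeg> old, List<SegInfo> latest) {
        Map<String, OpenSeg> byName = new HashMap<>();
        for (OpenSeg s : old) {
            byName.put(s.info.name, s);
        }

        List<OpenSeg> result = new ArrayList<>();
        for (SegInfo info : latest) {
            OpenSeg existing = byName.get(info.name);
            if (existing != null && existing.info.delGen == info.delGen) {
                result.add(existing);          // unchanged: share the open segment
                byName.remove(info.name);
            } else {
                result.add(new OpenSeg(info)); // new segment, or deletions changed
            }
        }

        // Anything left was merged away or has stale deletions.
        for (OpenSeg s : byName.values()) {
            s.close();
        }
        return result;
    }
}
```

If the old reader must stay usable after the reopen, the immediate close() would have to be replaced by something like a per-segment reference count.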

Does this sound like it would make a good addition? Does someone want
to volunteer to implement it?

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


rengels at ix

Jan 4, 2006, 11:10 AM

Post #2 of 3 (1877 views)
RE: indexreader refresh [In reply to]

I proposed and posted a patch for this long ago. The only thing missing would
be some sort of reference counting for segments (rather than the 'stayopen'
flag).

/**
 * Reopens the IndexReader, possibly reusing the segments for greater
 * efficiency. The original IndexReader instance is closed, and the
 * reference is no longer valid.
 *
 * @return the new IndexReader
 */
public IndexReader reopen() throws IOException {
  if (!(this instanceof MultiReader))
    return IndexReader.open(directory);

  MultiReader mr = (MultiReader) this;

  final IndexReader[] oldreaders = mr.getReaders();
  final boolean[] stayopen = new boolean[oldreaders.length];

  synchronized (directory) { // in- & inter-process sync
    return (IndexReader) new Lock.With(
        directory.makeLock(IndexWriter.COMMIT_LOCK_NAME),
        IndexWriter.COMMIT_LOCK_TIMEOUT) {
      public Object doBody() throws IOException {
        SegmentInfos infos = new SegmentInfos();
        infos.read(directory);
        if (infos.size() == 1) { // index is optimized
          return new SegmentReader(infos, infos.info(0), closeDirectory);
        } else {
          IndexReader[] readers = new IndexReader[infos.size()];
          for (int i = 0; i < infos.size(); i++) {
            for (int j = 0; j < oldreaders.length; j++) {
              SegmentReader sr = (SegmentReader) oldreaders[j];
              if (sr.si.name.equals(infos.info(i).name)) {
                readers[i] = sr;
                stayopen[j] = true;
              }
            }
            if (readers[i] == null)
              readers[i] = new SegmentReader(infos.info(i));
          }

          for (int i = 0; i < stayopen.length; i++)
            if (!stayopen[i])
              oldreaders[i].close();

          return new MultiReader(directory, infos, closeDirectory, readers);
        }
      }
    }.run();
  }
}
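
The reference counting mentioned above as the missing piece could look roughly like the following. This is a sketch, not part of the patch: CountedSegment, incRef and decRef are hypothetical names, and the idea is simply that each reader sharing a segment holds one reference and the segment is released only when the last one lets go.

```java
/** Hypothetical segment resource shared between readers via a reference count. */
class CountedSegment {
    private int refCount = 1; // the reader that opened the segment holds one reference

    /** Called by each additional reader that starts sharing this segment. */
    synchronized void incRef() {
        if (refCount <= 0) {
            throw new IllegalStateException("segment already closed");
        }
        refCount++;
    }

    /** Called when a reader is closed; the last call releases the segment. */
    synchronized void decRef() {
        if (refCount <= 0) {
            throw new IllegalStateException("segment already closed");
        }
        refCount--;
        // When the count reaches 0, real code would release file handles here.
    }

    synchronized boolean isClosed() {
        return refCount == 0;
    }
}
```

With this in place, a reopen would call incRef() on every segment it reuses, and closing the old reader would call decRef() on all of its segments, so the old and new readers can overlap safely.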



cutting at apache

Jan 4, 2006, 12:39 PM

Post #3 of 3 (1881 views)
Re: indexreader refresh [In reply to]

Yes, that's a good start. Your patch does not handle deletions
correctly. If a segment has had deletions since it was opened then its
deletions file needs to be re-read. I also think returning a new
IndexReader is preferable to modifying one, since an IndexReader is
often used as a cache key, and caches should be invalidated when an
IndexReader is re-opened.
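
The cache-key point can be illustrated with a small sketch. PerReaderCache is a hypothetical class written for this example, not Lucene code; it assumes reader objects use identity equality, so a freshly returned reader never hits entries stored for the old one.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Hypothetical per-reader cache; entries can be collected once a reader is unreachable. */
class PerReaderCache<R, V> {
    private final Map<R, V> cache = new WeakHashMap<>();
    int loads = 0; // counts how often the expensive value was computed

    synchronized V get(R reader, Function<R, V> loader) {
        V value = cache.get(reader);
        if (value == null) {
            value = loader.apply(reader); // expensive work, done once per reader instance
            loads++;
            cache.put(reader, value);
        }
        return value;
    }
}
```

If a reader were instead modified in place on reopen, the same key would keep returning values computed against the old index state, which is the stale-cache problem described above.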

