Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Lucene: Java-User

Wrapping IndexSearcher so that it is safe?

 

 

Lucene java-user RSS feed   Index | Next | Previous | View Threaded


jrhoden at unimelb

Nov 11, 2009, 2:12 PM

Post #1 of 10 (814 views)
Permalink
Wrapping IndexSearcher so that it is safe?

I am pondering a way to allow closing of an index searcher and
releasing the pointer to it so that it automatically cleans up by
itself when all threads stop using the index searcher. Inspired by the
Objective C retain/release model, what do you think about this?

Basically when threads start a search, they increment the "retain"
count, when threads leave the searcher, they decrement the "retain"
count, and close the searcher if requested. I am attracted to this
solution as it seems to simplify things greatly unless I have
overlooked something.


public class SafeIndexSearcher {

private boolean finish = false;
private int retainCount = 0;
private IndexSearcher searcher;

public SafeIndexSearcher(IndexSearcher searcher) {
this.searcher = searcher;
}

public TopDocs search(Query query, int limit) throws IOException {
this.retain();

try {
TopDocs result = searcher.search(query, limit);
this.release();
return result;
} catch (IOException e) {
this.release();
throw e;
}

}

public synchronized void close() {
finish = true;
}

private synchronized void retain() throws IOException {
if(finish)
throw new IOException("SafeIndexSearcher used after close has been
called.");
retainCount++;
}

private synchronized void release() {
retainCount--;
if(finish && retainCount==0)
try {
searcher.close();
} catch (IOException e) {
System.err.println("IndexSearcher.close() unexpected error: " +
e.getMessage());
}
}

}

Thanks!
Jacob

____________________________________
Information Technology Services,
The University of Melbourne

Email: jrhoden [at] unimelb
Phone: +61 3 8344 2884
Mobile: +61 4 1095 7575


uwe at thetaphi

Nov 11, 2009, 2:21 PM

Post #2 of 10 (779 views)
Permalink
RE: Wrapping IndexSearcher so that it is safe? [In reply to]

Looks good. About your code: The searcher will not close if any other
unchecked exception is thrown. Such code should always use finally blocks.
So simply do not catch and rethrow the IOException, instead put release in a
finally block and let the IOException automatically go upwards.


> this.retain();
> try {
> TopDocs result = searcher.search(query, limit);
> return result;
> } finally {
> this.release();
> }

Less code more secure and effective :-)

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe [at] thetaphi


> -----Original Message-----
> From: Jacob Rhoden [mailto:jrhoden [at] unimelb]
> Sent: Wednesday, November 11, 2009 11:12 PM
> To: java-user [at] lucene
> Subject: Wrapping IndexSearcher so that it is safe?
>
> I am pondering a way to allow closing of an index searcher and
> releasing the pointer to it so that it automatically cleans up by
> itself when all threads stop using the index searcher. Inspired by the
> Objective C retain/release model, what do you think about this?
>
> Basically when threads start a search, they increment the "retain"
> count, when threads leave the searcher, they decrement the "retain"
> count, and close the searcher if requested. I am attracted to this
> solution as it seems to simplify things greatly unless I have
> overlooked something.
>
>
> public class SafeIndexSearcher {
>
> private boolean finish = false;
> private int retainCount = 0;
> private IndexSearcher searcher;
>
> public SafeIndexSearcher(IndexSearcher searcher) {
> this.searcher = searcher;
> }
>
> public TopDocs search(Query query, int limit) throws IOException {
> this.retain();
>
> try {
> TopDocs result = searcher.search(query, limit);
> this.release();
> return result;
> } catch (IOException e) {
> this.release();
> throw e;
> }
>
> }
>
> public synchronized void close() {
> finish = true;
> }
>
> private synchronized void retain() throws IOException {
> if(finish)
> throw new IOException("SafeIndexSearcher used after
> close has been
> called.");
> retainCount++;
> }
>
> private synchronized void release() {
> retainCount--;
> if(finish && retainCount==0)
> try {
> searcher.close();
> } catch (IOException e) {
> System.err.println("IndexSearcher.close()
> unexpected error: " +
> e.getMessage());
> }
> }
>
> }
>
> Thanks!
> Jacob
>
> ____________________________________
> Information Technology Services,
> The University of Melbourne
>
> Email: jrhoden [at] unimelb
> Phone: +61 3 8344 2884
> Mobile: +61 4 1095 7575



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


jrhoden at unimelb

Nov 11, 2009, 2:41 PM

Post #3 of 10 (777 views)
Permalink
Re: Wrapping IndexSearcher so that it is safe? [In reply to]

I knew I would have overlooked something, thanks for the help!

On 12/11/2009, at 9:21 AM, Uwe Schindler wrote:

> ....simply do not catch and rethrow the IOException, instead put
> release in a
> finally block and let the IOException automatically go upwards.
>
>> this.retain();
>> try {
>> TopDocs result = searcher.search(query, limit);
>> return result;
>> } finally {
>> this.release();
>> }
>
> Less code more secure and effective :-)


____________________________________
Information Technology Services,
The University of Melbourne

Email: jrhoden [at] unimelb
Phone: +61 3 8344 2884
Mobile: +61 4 1095 7575


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


erickerickson at gmail

Nov 11, 2009, 2:53 PM

Post #4 of 10 (772 views)
Permalink
Re: Wrapping IndexSearcher so that it is safe? [In reply to]

If you want to spend a few bucks, here's part of a reply to a similar
question
from Mike McCandless a day or so ago....

<<<
You can get the book here http://www.manning.com/hatcher3 (NOTE: I'm
one of the authors!).

Chapter 11 in the book has a class called SearcherManager, that
handles the details of reopen/closing the IndexReader while queries
are still in flight, that might be useful here.
>>>

The book is Lucene In Action II. Manning has an "early access program"
(MEAP) that lets you get a PDF version. That class is considerably more
extensive and handles the edge cases as I remember it....

Best
Erick


On Wed, Nov 11, 2009 at 5:41 PM, Jacob Rhoden <jrhoden [at] unimelb>wrote:

> I knew I would have overlooked something, thanks for the help!
>
> On 12/11/2009, at 9:21 AM, Uwe Schindler wrote:
>
> ....simply do not catch and rethrow the IOException, instead put release
>> in a
>>
>> finally block and let the IOException automatically go upwards.
>>
>> this.retain();
>>> try {
>>> TopDocs result = searcher.search(query, limit);
>>> return result;
>>> } finally {
>>> this.release();
>>> }
>>>
>>
>> Less code more secure and effective :-)
>>
>
>
> ____________________________________
> Information Technology Services,
> The University of Melbourne
>
> Email: jrhoden [at] unimelb
> Phone: +61 3 8344 2884
> Mobile: +61 4 1095 7575
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
> For additional commands, e-mail: java-user-help [at] lucene
>
>


jrhoden at unimelb

Nov 11, 2009, 4:33 PM

Post #5 of 10 (768 views)
Permalink
Re: Wrapping IndexSearcher so that it is safe? [In reply to]

The source code for SearcherManager is even downloadable for free:
http://www.manning.com/hatcher3/LIAsourcecode.zip

The example source code does some things that is beyond my level of
understanding
of lucene. ie:
1) To me it looks like an IndexSearcher never gets closed.
2) I don't understand what happens if the indexreader is reopened
while a thread
in the middle of a search using an indexsearcher.

So I am going for something a bit simpler:

If a thread wants to use the "SafeIndexSearcher", it first calls
retain() and then calls
release() when its done.

If a thread wants to close the "SafeIndexSearcher" , the close is
deferred until all threads
have called release():


public class SafeIndexSearcher {

private boolean finish = false;
private int retainCount = 0;
private IndexSearcher searcher;

public SafeIndexSearcher(IndexSearcher searcher) {
this.searcher = searcher;
}

public TopDocs search(Query query, int limit) throws IOException {
TopDocs result = searcher.search(query, limit);
return result;
}

public Document doc(int doc) throws CorruptIndexException,
IOException {
return searcher.doc(doc);
}

public synchronized void close() {
finish = true;
}

public synchronized SafeIndexSearcher retain() throws IOException {
if(finish)
throw new IOException("SafeIndexSearcher used after close has been
called.");
retainCount++;
return this;
}

public synchronized SafeIndexSearcher release() {
retainCount--;
if(finish && retainCount==0)
try {
searcher.close();
} catch (IOException e) {
System.err.println("IndexSearcher.close() unexpected error: " +
e.getMessage());
}
return this;
}

}

On 12/11/2009, at 9:53 AM, Erick Erickson wrote:

> If you want to spend a few bucks, here's part of a reply to a similar
> question
> from Mike McCandless a day or so ago....
>
> <<<
> You can get the book here http://www.manning.com/hatcher3 (NOTE: I'm
> one of the authors!).
>
> Chapter 11 in the book has a class called SearcherManager, that
> handles the details of reopen/closing the IndexReader while queries
> are still in flight, that might be useful here.
>>>>
>
> The book is Lucene In Action II. Manning has an "early access program"
> (MEAP) that lets you get a PDF version. That class is considerably
> more
> extensive and handles the edge cases as I remember it....
>
> Best
> Erick
>
>
> On Wed, Nov 11, 2009 at 5:41 PM, Jacob Rhoden
> <jrhoden [at] unimelb>wrote:
>
>> I knew I would have overlooked something, thanks for the help!
>>
>> On 12/11/2009, at 9:21 AM, Uwe Schindler wrote:
>>
>> ....simply do not catch and rethrow the IOException, instead put
>> release
>>> in a
>>>
>>> finally block and let the IOException automatically go upwards.
>>>
>>> this.retain();
>>>> try {
>>>> TopDocs result = searcher.search(query,
>>>> limit);
>>>> return result;
>>>> } finally {
>>>> this.release();
>>>> }
>>>>
>>>
>>> Less code more secure and effective :-)
>>>
>>
>>
>> ____________________________________
>> Information Technology Services,
>> The University of Melbourne
>>
>> Email: jrhoden [at] unimelb
>> Phone: +61 3 8344 2884
>> Mobile: +61 4 1095 7575
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
>> For additional commands, e-mail: java-user-help [at] lucene
>>
>>

____________________________________
Information Technology Services,
The University of Melbourne

Email: jrhoden [at] unimelb
Phone: +61 3 8344 2884
Mobile: +61 4 1095 7575


lucene at mikemccandless

Nov 12, 2009, 1:42 AM

Post #6 of 10 (752 views)
Permalink
Re: Wrapping IndexSearcher so that it is safe? [In reply to]

On Wed, Nov 11, 2009 at 7:33 PM, Jacob Rhoden <jrhoden [at] unimelb> wrote:
> The source code for SearcherManager is even downloadable for free:
>   http://www.manning.com/hatcher3/LIAsourcecode.zip
>
> The example source code does some things that is beyond my level of
> understanding
> of lucene. ie:
> 1) To me it looks like an IndexSearcher never gets closed.

It's true that IndexSearcher isn't closed, but it turns out (for now,
at least), it's not necessary to close IndexSearcher if you had passed
it an already open IndexReader.

> 2) I don't understand what happens if the indexreader is reopened while a
> thread
>    in the middle of a search using an indexsearcher.

Any already in-flight searches will continue to use the old
IndexSearcher, while new ones use the new IndexSearcher. Once all
in-flight searches are done against the old IndexSearcher, its
underlying IndexReader is closed.

> So I am going for something a bit simpler:
>
> If a thread wants to use the "SafeIndexSearcher", it first calls retain()
> and then calls
> release() when its done.
>
> If a thread wants to close the "SafeIndexSearcher" , the close is deferred
> until all threads
> have called release():

I think this will work, but you need a central place that has called
retain() and is holding onto the searcher, until it's time to reopen
right? Either that, or your retainCount begins life at 1 instead of
0?

Also, how will you handle reopening (if that's needed in your app)?

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


jrhoden at unimelb

Nov 12, 2009, 1:44 PM

Post #7 of 10 (744 views)
Permalink
Re: Wrapping IndexSearcher so that it is safe? [In reply to]

On 12/11/2009, at 8:42 PM, Michael McCandless wrote:

> On Wed, Nov 11, 2009 at 7:33 PM, Jacob Rhoden
> <jrhoden [at] unimelb> wrote:
>> The source code for SearcherManager is even downloadable for free:
>> http://www.manning.com/hatcher3/LIAsourcecode.zip
>>
>> The example source code does some things that is beyond my level of
>> understanding

After staring it it a bit longer, it mostly makes sense to me now, the
example for book
was only hard to read because I don't understand completely what both
IndexWriter.IndexReaderWarmer() and searcher.getIndexReader().decRef()
are supposed to do. (Which I assume can be learnt from the book)

> Any already in-flight searches will continue to use the old
> IndexSearcher, while new ones use the new IndexSearcher. Once all
> in-flight searches are done against the old IndexSearcher, its
> underlying IndexReader is closed.

To test I understand this correctly, the index reader getting closed
automatically
is achieved by the release() method calling
searcher.getIndexReader().decRef()?

>> So I am going for something a bit simpler:
>>
>> If a thread wants to use the "SafeIndexSearcher", it first calls
>> retain()
>> and then calls
>> release() when its done.
>
> I think this will work, but you need a central place that has called
> retain() and is holding onto the searcher, until it's time to reopen
> right? Either that, or your retainCount begins life at 1 instead of
> 0?

In my version, the "SafeIndexSearcher" is a static variable, the retain
count starts at 0. Being at zero simply indicates the object is not
being
used and can safely be closed.

> Also, how will you handle reopening (if that's needed in your app)?


Another thread runs in the background and handles deciding when to do
an index rebuild, at which point it
1) swaps in a new "SafeIndexSearcher"
2) calls close on the old "SafeIndexSearcher" (which will safely close
once the retain count=0)

Thanks,
Jacob

____________________________________
Information Technology Services,
The University of Melbourne

Email: jrhoden [at] unimelb
Phone: +61 3 8344 2884
Mobile: +61 4 1095 7575


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


lucene at mikemccandless

Nov 12, 2009, 2:19 PM

Post #8 of 10 (745 views)
Permalink
Re: Wrapping IndexSearcher so that it is safe? [In reply to]

On Thu, Nov 12, 2009 at 4:44 PM, Jacob Rhoden <jrhoden [at] unimelb> wrote:
>
> On 12/11/2009, at 8:42 PM, Michael McCandless wrote:
>
>> On Wed, Nov 11, 2009 at 7:33 PM, Jacob Rhoden <jrhoden [at] unimelb>
>> wrote:
>>>
>>> The source code for SearcherManager is even downloadable for free:
>>>  http://www.manning.com/hatcher3/LIAsourcecode.zip
>>>
>>> The example source code does some things that is beyond my level of
>>> understanding
>
> After staring it it a bit longer, it mostly makes sense to me now, the
> example for book
> was only hard to read because I don't understand completely what both
> IndexWriter.IndexReaderWarmer() and searcher.getIndexReader().decRef()
> are supposed to do. (Which I assume can be learnt from the book)

SearcherManager can work with a near real-time reader (via
IndexWriter.getReader), or with a standalone reader (via
IndexReader.open), so that's another source of more complexity vs your
use case.

>> Any already in-flight searches will continue to use the old
>> IndexSearcher, while new ones use the new IndexSearcher.  Once all
>> in-flight searches are done against the old IndexSearcher, its
>> underlying IndexReader is closed.
>
> To test I understand this correctly, the index reader getting closed
> automatically
> is achieved by the release() method calling
> searcher.getIndexReader().decRef()?

Right. The decRef that drops the ref count to 0, closes the reader.

>>> So I am going for something a bit simpler:
>>>
>>> If a thread wants to use the "SafeIndexSearcher", it first calls retain()
>>> and then calls
>>> release() when its done.
>>
>> I think this will work, but you need a central place that has called
>> retain() and is holding onto the searcher, until it's time to reopen
>> right?  Either that, or your retainCount begins life at 1 instead of
>> 0?
>
> In my version, the "SafeIndexSearcher" is a static variable, the retain
> count starts at 0. Being at zero simply indicates the object is not being
> used and can safely be closed.

If it starts as 0, then the first search that runs will then close it?
Is that intended.

VS keeping a single searcher alive, until its time to reopen (which
would mean some central place would call retain, and then would call
release when reopen is done).

>> Also, how will you handle reopening (if that's needed in your app)?
>
>
> Another thread runs in the background and handles deciding when to do
> an index rebuild, at which point it
> 1) swaps in a new "SafeIndexSearcher"
> 2) calls close on the old "SafeIndexSearcher" (which will safely close once
> the retain count=0)

OK I think that should work. Just make sure when the search releases
the SafeIndexSearcher, it releases the one it had started with, and
not accidentally a new one which had swapped in while it was
searching.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


jrhoden at unimelb

Nov 12, 2009, 2:45 PM

Post #9 of 10 (744 views)
Permalink
Re: Wrapping IndexSearcher so that it is safe? [In reply to]

On 13/11/2009, at 9:19 AM, Michael McCandless wrote:

>>> On Wed, Nov 11, 2009 at 7:33 PM, Jacob Rhoden <jrhoden [at] unimelb
>>> >
>>> wrote:
>>>>
>>>> The source code for SearcherManager is even downloadable for free:
>>>> http://www.manning.com/hatcher3/LIAsourcecode.zip
>>>>
>>>> The example source code does some things that is beyond my level of
>>>> understanding
>
> SearcherManager can work with a near real-time reader (via
> IndexWriter.getReader), or with a standalone reader (via
> IndexReader.open), so that's another source of more complexity vs your
> use case.

There can be quite a large number of updates going on in peak periods,
ie
10-50 updates per minute, I had assumed (perhaps incorrectly) it would
be
better to not work in this way. Perhaps my assumption is wrong?

I assumed that given there could be heaps of updates going on at once,
the
IndexSearcher should be manually refreshed as a less frequent
interval. ie
update every 5 minutes but only if there has been an edit within the
past
5 minutes.

>>>> So I am going for something a bit simpler: If a thread wants to
>>>> use the
>>>> "SafeIndexSearcher", it first calls retain() and then calls
>>>> release()
>>>> when its done.
>
> If it starts as 0, then the first search that runs will then close it?
> Is that intended.
>
> VS keeping a single searcher alive, until its time to reopen (which
> would mean some central place would call retain, and then would call
> release when reopen is done).

Not quite, object is only released when
(retainCount==0&&finished==true),
ie when there are no active threads AND close has been requested.

Thanks!
Jacob

____________________________________
Information Technology Services,
The University of Melbourne

Email: jrhoden [at] unimelb
Phone: +61 3 8344 2884
Mobile: +61 4 1095 7575


lucene at mikemccandless

Nov 12, 2009, 4:23 PM

Post #10 of 10 (741 views)
Permalink
Re: Wrapping IndexSearcher so that it is safe? [In reply to]

On Thu, Nov 12, 2009 at 5:45 PM, Jacob Rhoden <jrhoden [at] unimelb> wrote:
>> SearcherManager can work with a near real-time reader (via
>> IndexWriter.getReader), or with a standalone reader (via
>> IndexReader.open), so that's another source of more complexity vs your
>> use case.
>
> There can be quite a large number of updates going on in peak periods, ie
> 10-50 updates per minute, I had assumed (perhaps incorrectly) it would be
> better to not work in this way. Perhaps my assumption is wrong?
>
> I assumed that given there could be heaps of updates going on at once, the
> IndexSearcher should be manually refreshed as a less frequent interval. ie
> update every 5 minutes but only if there has been an edit within the past
> 5 minutes.

With SearcherManager you also independently decide how frequently to
reopen. But, yes, the more frequently you reopen the more costly it
is.

If you have access to the writer that's making changes, it's more
efficient to get a near real-time reader from it, than committing from
it and reopen the reader.

You should test performance to what refresh rate works best for your
app (and report back if you can!).

>>>>> So I am going for something a bit simpler: If a thread wants to use the
>>>>> "SafeIndexSearcher", it first calls retain() and then calls release()
>>>>> when its done.
>>
>> If it starts as 0, then the first search that runs will then close it?
>> Is that intended.
>>
>> VS keeping a single searcher alive, until its time to reopen (which
>> would mean some central place would call retain, and then would call
>> release when reopen is done).
>
> Not quite, object is only released when (retainCount==0&&finished==true),
> ie when there are no active threads AND close has been requested.

Ahh I see. OK, that makes sense!

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene

Lucene java-user RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.