Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Lucene: Java-Dev

[jira] Commented: (LUCENE-743) IndexReader.reopen()

 

 

First page Previous page 1 2 3 Next page Last page  View All Lucene java-dev RSS feed   Index | Next | Previous | View Threaded


jira at apache

Oct 5, 2007, 12:42 AM

Post #1 of 71 (1465 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen()

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532585 ]

Michael Busch commented on LUCENE-743:
--------------------------------------

I ran new performance tests with the latest patch similar to the tests I explained in an earlier comment on this issue.

I'm using again a 4.5M index. In each round I delete one document and (re)open the IndexReader thereafter. Here are the numbers for 5000 rounds:

|| || Time || Speedup ||
| open | 703s | |
| reopen(closeOldReader==false) | 62s | 11x |
| reopen(closeOldReader==true) |16s | 44x |

Now in each round I delete on document and also set the norm for one random document. Numbers for 1000 rounds:

|| || Time || Speedup ||
| open | 166s | |
| reopen(closeOldReader==false) | 33s | 5.0x |
| reopen(closeOldReader==true) | 29s | 5.7x |

I think these numbers look pretty good. We get a quite decent speedup even if the old readers are not closed.

I would like to commit this in a couple of days to get ready for Lucene 2.3. It would be nice if someone could review the latest patch! Hoss? Others?

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 5, 2007, 5:59 PM

Post #2 of 71 (1412 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532806 ]

Yonik Seeley commented on LUCENE-743:
-------------------------------------

I think this looks pretty good Michael!
Too bad so much needs to be cloned in the case that closeOldReader==false... maybe someday in the future we can have read-only readers.

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 5, 2007, 8:45 PM

Post #3 of 71 (1395 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532817 ]

Michael Busch commented on LUCENE-743:
--------------------------------------

> Too bad so much needs to be cloned in the case that closeOldReader==false... maybe someday in the future we can have read-only readers.

Yeah, the cloning part was kind of tedious. Read-only readers would indeed make our life much easier here. I'm wondering how many people are using the IndexReader to alter the norms anyway?

Well, thanks for reviewing the patch, Yonik!

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 5, 2007, 9:24 PM

Post #4 of 71 (1398 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532819 ]

robert engels commented on LUCENE-743:
--------------------------------------

Nice to see all the good work on this. We are still on a 1.9 derivative.

Hopefully we'll be able to move to stock 2.X release in the future.



> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 5, 2007, 10:38 PM

Post #5 of 71 (1395 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532829 ]

Hoss Man commented on LUCENE-743:
---------------------------------

I haven't looked at the patch yet (i really really plan to this weekend, baring catastrophe) but i'm confused as to why you went the cloning route (along with the complexity of the api changes to indicate when it is/isn't supported) ... based on the comments, it seems to boil down to...

> As an example for how the clone() method is used let me describe how MultiReader.reopen()
> works: it tries to reopen every of its subreaders. If at least one subreader could be reopened
> successfully, then a new MultiReader instance is created and the reopened subreaders are
> added to it. Every of the old MultiReader's subreaders, that was not reopened (because of no
> index changes) is now cloned() and added to the new MultiReader.

that seems like circular logic: the clone method is used so that the sub readers can be cloned (?)

why use clones at all? why not just use the original reader (if the "index" that reader represents hasn't changed, why clone it?

And if (for reasons that aren't clear to me) it is important for MultiReader to use a clone of it's subreaders when their reopen method returns "this" then shouldn't clients do the same thing? ... why not make reopen always return this.clone() if the index hasn't changed (which now that i think about it, would also help by punting on the isCloneSupported issue -- each class would already know if it was clonable.

maybe this will make more sense once i read the patch ... i just wanted to through it out there in case someone had a chance to reply before i get a chance.



> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


raviram at students

Oct 5, 2007, 11:27 PM

Post #6 of 71 (1404 views)
Permalink
Re: [jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

>
Is there any way that we can modify lucene,in terms of performance.
Can somebody tell some loopholes in lucene where we can improve further.

> [
> https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532829
> ]
>
> Hoss Man commented on LUCENE-743:
> ---------------------------------
>
> I haven't looked at the patch yet (i really really plan to this weekend,
> baring catastrophe) but i'm confused as to why you went the cloning route
> (along with the complexity of the api changes to indicate when it is/isn't
> supported) ... based on the comments, it seems to boil down to...
>
>> As an example for how the clone() method is used let me describe how
>> MultiReader.reopen()
>> works: it tries to reopen every of its subreaders. If at least one
>> subreader could be reopened
>> successfully, then a new MultiReader instance is created and the
>> reopened subreaders are
>> added to it. Every of the old MultiReader's subreaders, that was not
>> reopened (because of no
>> index changes) is now cloned() and added to the new MultiReader.
>
> that seems like circular logic: the clone method is used so that the sub
> readers can be cloned (?)
>
> why use clones at all? why not just use the original reader (if the
> "index" that reader represents hasn't changed, why clone it?
>
> And if (for reasons that aren't clear to me) it is important for
> MultiReader to use a clone of it's subreaders when their reopen method
> returns "this" then shouldn't clients do the same thing? ... why not make
> reopen always return this.clone() if the index hasn't changed (which now
> that i think about it, would also help by punting on the isCloneSupported
> issue -- each class would already know if it was clonable.
>
> maybe this will make more sense once i read the patch ... i just wanted to
> through it out there in case someone had a chance to reply before i get a
> chance.
>
>
>
>> IndexReader.reopen()
>> --------------------
>>
>> Key: LUCENE-743
>> URL: https://issues.apache.org/jira/browse/LUCENE-743
>> Project: Lucene - Java
>> Issue Type: Improvement
>> Components: Index
>> Reporter: Otis Gospodnetic
>> Assignee: Michael Busch
>> Priority: Minor
>> Fix For: 2.3
>>
>> Attachments: IndexReaderUtils.java, lucene-743-take2.patch,
>> lucene-743.patch, lucene-743.patch, lucene-743.patch,
>> MyMultiReader.java, MySegmentReader.java
>>
>>
>> This is Robert Engels' implementation of IndexReader.reopen()
>> functionality, as a set of 3 new classes (this was easier for him to
>> implement, but should probably be folded into the core, if this looks
>> good).
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
> For additional commands, e-mail: java-dev-help [at] lucene
>
>
>




---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 5, 2007, 11:56 PM

Post #7 of 71 (1398 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532833 ]

Michael Busch commented on LUCENE-743:
--------------------------------------

> why use clones at all? why not just use the original reader (if the "index" that reader represents hasn't changed, why clone it?

Let's say you have a MultiReader with two subreaders:
{code:java}
IndexReader ir1 = IndexReader.open(index1);
IndexReader ir2 = IndexReader.open(index2);
IndexReader mr = new MultiReader(new IndexReader[] {ir1, ir2});
{code}

Now index1 changes and you reopen the MultiReader and keep the old one open:

{code:java}
IndexReader mr2 = mr.reopen();
{code}

ir1 would now be reopened and let's assume we wouldn't clone ir2. If you use mr2 now to e.g. delete a doc and that doc happens to be in index2, then mr1 would also see the changes because both MultiReaders share the same subreader ir2 and are thus not independent from each other.

> why not make reopen always return this.clone()

clone() might be an expensive operation. We only need to clone if at least one of the subreaders has changed.


> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 6, 2007, 6:27 AM

Post #8 of 71 (1398 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532867 ]

Michael McCandless commented on LUCENE-743:
-------------------------------------------

> > Too bad so much needs to be cloned in the case that
> > closeOldReader==false... maybe someday in the future we can have
> > read-only readers.
>
> Yeah, the cloning part was kind of tedious. Read-only readers would
> indeed make our life much easier here. I'm wondering how many people
> are using the IndexReader to alter the norms anyway?

I think the closeOldReader=false case is actually quite important.

Because in production, I think you'd have to use that, so that your
old reader stays alive and is used to service incoming queries, up
until the point where the re-opened reader is "fully warmed".

Since fully warming could take a long time (minutes?) you need that
old reader to stay open.

Can we take a copy-on-write approach? EG, this is how OS's handle the
virtual memory pages when forking a process. This would mean whenever
a clone has been made of a SegmentReader, they cross-reference one
another. Then whenever either needs to alter deleted docs, one of them
makes a copy then. Likewise for the norms.

This would mean that "read-only" uses of the cloned reader never
pay the cost of copying the deleted docs bit vector nor norms.

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 6, 2007, 7:16 AM

Post #9 of 71 (1401 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532875 ]

Michael McCandless commented on LUCENE-743:
-------------------------------------------


Actually if we went back to the sharing (not cloning) approach, could
we insert a check for any writing operation against the re-opened
reader that throws an exception if the original reader is not yet
closed?

In Michael's example above, on calling mr2.deleteDoc, you would hit an
exception because mr2 would check and see that it's "original" reader
mr is not yet closed. But once you've closed mr, then the call
succeeds.

I think this would let us have our cake and eat it too: re-opening
would be very low cost for unchanged readers (no clones created), and,
you can still do deletes, etc, after you have closed your prior
reader. You control when your prior reader gets closed, to allow for
warming time and for queries to finish on the old reader.

Would this work?

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 6, 2007, 7:29 AM

Post #10 of 71 (1402 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532877 ]

Michael McCandless commented on LUCENE-743:
-------------------------------------------


A couple other questions/points:

* Just to verify: if you have a DirectoryIndexReader that is holding
the write lock (because it has made changes to deletes/norms) then
calling reopen on this reader should always return 'this', right?
Because it has the write lock, it must be (better be!) current.

This might be a good place to add an assert: if you are not
current, assert that you don't have the write lock, and if you
have the write lock, assert that you are current.

* I think you should add "ensureOpen()" at the top of
MultiReader.reopen(...)?

* If an Exception is hit during reopen, what is the resulting state
of your original reader? I think, ideally, it is unaffected by
the attempt to re-open? EG if you're on the hairy edge of file
descriptor limits, and the attempt to re-open failed because you
hit the limit, ideally your original reader is unaffected (even if
you specified closeOldReader=true).


> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 7, 2007, 11:23 AM

Post #11 of 71 (1393 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532990 ]

Hoss Man commented on LUCENE-743:
---------------------------------

Okay, read the patch. I'm on board with the need for Clonable now ... it's all about isolation. if "r.reopen(false) == r" then the client is responsible for recognizing that it's got the same instance and can make the judgement call about reusing the instance or cloning depending on it's needs ... internally in things like MultiReader we have to assume we need a clone for isolation.

Questions and comments...

1. CompoundFileReader, FieldsReader, IndexReader, and BitVector all have clone methods where they silently catch and ignore CloneNotSupportedExceptions from super.clone()... if we don't expect the exception to ever happen, we should just let the exception propogate up the stack (there is no down side to declaring that clone() throws CloneNotSupportedException). If we think the exception might happen, but it's not a big deal if it does and we can ignore it, then we should put a comment in the catch block to that effect. i'm not clear which are the cases here.

2. i don't remember if the old patches had this issue as well, but having "return new FilterIndexReader(reopened);" in FilterIndexReader doesn't really help anyone using FilterIndexReader -- they're going to want an instance of their own subclass. to meet the contract of FilterIndexReader, we should implement all "abstract" methods and delegate to the inner reader - so in theory do don't need a new instance of FIlterIndexReader, we could just return in.reopen(closeold) ... my gut says it would be better to leave the method out entirely so that the default impl which throws UnSupOpEx is used --- that way subclasses who want to use reopen *must* define their own (instead of getting confused when their filtering logic stops working after they reopen for the first time)

3. instead of having an isClonable() method, why don't we just remove the "implements Clonable" declaration from IndexReader and put it on the concrete subclasses -- then use "instanceof Cloneable" to determine if something is clonable? ... for that matter, the only place isCloneSupported is really used is in IndexReader.clone where an exception is thrown if Clone is not supported ... subclasses which know they are Clonable don't need this from the base class, subclasses which don't implement Clonable but are used in combination with core classes whose reopen method requires it could be skiped by the caller (since they don't implement Clonable) ..

4. javadocs frequently refer to "reopened successfully" and "refreshed successfully" when what it really means is "reopen() returned a newer/fresher reader" ... this may confuse people who are use to thinking of "successfully" a "without error"

5. random thought: are their any synchronization issues we're missing here? I'm wondering about the case where once thread calls reopen while another thread is updating norms or deleting docs. is there any chance for inconsistent state?



> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 7, 2007, 1:09 PM

Post #12 of 71 (1391 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533000 ]

Yonik Seeley commented on LUCENE-743:
-------------------------------------

> I'm wondering about the case where once thread calls reopen while another thread is updating norms or deleting docs.

Hmmm there is cause for concern (and I should have had my mt-safe hat on :-)
Reopen is synchronized on the reader, and so are norms access and docs, but from a quick look:
- reopen() may be synchronized, but clone() called on sub-readers isn't in a synchronized context that the sub-reader cares about. For example, MultiReader.reopen has the lock on the multireader, but calles subreader.clone() which iterates over the norms in a non thread-safe way.
- IndexInput objects that are in use should never be cloned... (like what is done in FieldsReader.clone())

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 8, 2007, 10:57 AM

Post #13 of 71 (1382 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533172 ]

Michael Busch commented on LUCENE-743:
--------------------------------------

Thanks all for the reviews and comments!

There seem to be some issues/open questions concerning the cloning.
Before I comment on them I think it would make sense to decide whether
we want to stick with the cloning approach or not. Mike suggests this
approach:

> Actually if we went back to the sharing (not cloning) approach, could
> we insert a check for any writing operation against the re-opened
> reader that throws an exception if the original reader is not yet
> closed?

Interesting, yes that should work in case we have two readers (the
original one and the re-opened one). But what happens if the user calls
reopen twice to get two re-opened instances back? Then there would be
three instances, and without cloning the two re-opened ones would also
share the same resources. Is this a desirable use case or would it be
okay to restrict reopen() so that it can only create one new instance?

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 8, 2007, 11:45 AM

Post #14 of 71 (1383 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533186 ]

Michael McCandless commented on LUCENE-743:
-------------------------------------------

> > Actually if we went back to the sharing (not cloning) approach,
> > could we insert a check for any writing operation against the
> > re-opened reader that throws an exception if the original reader
> > is not yet closed?
>
> Interesting, yes that should work in case we have two readers (the
> original one and the re-opened one). But what happens if the user
> calls reopen twice to get two re-opened instances back? Then there
> would be three instances, and without cloning the two re-opened ones
> would also share the same resources. Is this a desirable use case or
> would it be okay to restrict reopen() so that it can only create one
> new instance?

Hmmm good point.

Actually, we could allow more then one re-open call if we take the
following approach: every time a cloned Reader "borrows" a reference
to a sub-reader, it increments a counter (call it the "referrer
count"). When the Reader is closed, it decrements the count (by 1)
for each of the sub-readers.

Then, any reader should refuse to do a writing operation if its
"referrer" count is greater than 1, because it's being shared across
more than one referrer.

This way if you have a reader X and you did reopen to get Y and did
reopen again to get Z then the shared sub-readers between X, Y and Z
would not allow any write operations until 2 of the three had been
closed. I think that would work?

BTW this would also allow for very efficient "snapshots" during
searching: keeping multiple readers "alive", each searching a
different point-in-time commit of the index, because they would all
share the underlying segment readers that they have in common. Vs
cloning which would have to make many copies of each segment reader.


> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 8, 2007, 2:43 PM

Post #15 of 71 (1375 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533205 ]

Michael Busch commented on LUCENE-743:
--------------------------------------

> This way if you have a reader X and you did reopen to get Y and did
> reopen again to get Z then the shared sub-readers between X, Y and Z
> would not allow any write operations until 2 of the three had been
> closed. I think that would work?

Yes I think it would work. However, this approach has two downside IMO:
- reopen() becomes more complicated and restricted for the user. With
the cloning approach the user doesn't have to care about when index
modifications are not allowed. IndexReader instances returned by open()
or reopen() can be used exactly the same without any restrictions.

- We have to be very careful about cross-referencing multiple readers.
If for some reason any references between two or more readers are not
cleared after one was closed, then that reader might not become GC'd.
I'm not saying it's not possible or even very hard, we just have to
make sure those things can't ever happen.

Of course the cloning approach has disadvantages too. For custom
readers clone() has to be implemented in order to make reopen() work
correctly. Also reopen() is more expensive in case of
closeOldReader=false. Well we could certainly consider the lazy clone
approach that you suggested, Mike, but we have to be careful about the
cross-referencing issue again.

So I'm really not sure which way the better one is. I think I'm slightly
in favor for the cloning approach, so that reopen() returns instances
that are completely independant from each other, which seems cleaner IMO.
What do others think?

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 8, 2007, 3:15 PM

Post #16 of 71 (1374 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533213 ]

Michael McCandless commented on LUCENE-743:
-------------------------------------------


> We have to be very careful about cross-referencing multiple readers.
> If for some reason any references between two or more readers are
> not cleared after one was closed, then that reader might not become
> GC'd. I'm not saying it's not possible or even very hard, we just
> have to make sure those things can't ever happen.

One correction: there should be no cross-referencing, right? The only
thing that's happening is incrementing & decrementing the "referrer
count" for a reader? (But, you're right: the "copy on write" approach
to cloning would have cross-referencing).

I think the downside of this proposed "shared readers" (non-cloning)
approach is that you can't delete/setNorm on the clone until you close
the original. But I think that's fairly minor? Also if you really
need a full deep copy of your reader you could just open a new one
(ie, not use "reopen")?

I think the big upside of "shared readers" is reopening becomes quite
a bit more resource efficient: the cost of re-opening a reader would
be in proportion to what has actually changed in the index. EG, if
your index has added a few tiny segments since you last opened then
the reopen would be extremely fas but with cloning you are forced
to make a full deep copy of all other [unchanged] segments.


> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 11, 2007, 11:43 AM

Post #17 of 71 (1333 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534123 ]

Hoss Man commented on LUCENE-743:
---------------------------------


in deciding "to clone or not clone" for the purposes of implementing reopen, it may make sense to step back and ask a two broader questions...

1) what was the motivation for approaching reopen using clones in the first place
2) what does it inherently mean to clone an indexreader.

the answer to the first question, as i understand it, relates to the issue of indexreaders not being "read only" object ... operations can be taken which modify the readers state, and those operations can be flushed to disk when the reader is closed. so readerB = readerA.reopen(closeOld=false) and then readerA.delete(...) is called, there is ambiguity as to what should happen in readerB. so the clone route seems pretty straight forward if and only if we have an unambiguous concept of cloning a reader (because then the clone approach to reopen becomes unambiguous as well). alternately, since reopen is a new method, we can resolve the ambiguity anyway we want, as long as it's documented ... in theory we should pick a solution that seems to serve the most benefit ... perhaps that solution is to document reopen with "if reopen(closeOld=false) returns a new reader it will share state with the current reader, attempting to do the following operations on this new reader will result in undefined behavior"

the answer the the second is only important if we want to go the cloning route ... but it is pretty important in a larger sense then just reopening ... f we start to say that any of the IndexReader classes are Clonable we have to be very clear about what that means in *all* cases where someone might clone it in addition to reopen ... in particular, i worry about what it means to clone a reader which has already had "write" operations performed on it (something the clone based version of reopen will never do because a write operation indicates the reader must be current), but some client code might as soon as we add the Clonable interface to a class.



> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 16, 2007, 1:34 PM

Post #18 of 71 (1248 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535335 ]

Michael Busch commented on LUCENE-743:
--------------------------------------

> in deciding "to clone or not clone" for the purposes of implementing
> reopen, it may make sense to step back and ask a two broader questions...

I agree!

> 1) what was the motivation for approaching reopen using clones in the
> first place

Good summarization! You are right, I started the clone approach because
IndexReaders are not "read only" objects.

> "if reopen(closeOld=false) returns a new reader it will share state with
> the current reader, attempting to do the following operations on this
> new reader will result in undefined behavior"

This would mean, that we simply warn the user that performing write
operations with the re-opened indexreader will result in undefined behavior,
whereas with Mike's approach we would prevent such an undefined behavior by
using reference counting.

Hmm, so what are our long-term plans for indexreaders? If our goal is to
make them read-only (we can delete via IndexWriter already), then I think
adding those warning comments to reopen(), as you suggest Hoss, would be
sufficient.

If everybody likes this approach I'll go ahead and submit a new patch.



> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 16, 2007, 1:52 PM

Post #19 of 71 (1250 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535342 ]

Doug Cutting commented on LUCENE-743:
-------------------------------------

{quote}
Hmm, so what are our long-term plans for indexreaders? If our goal is to
make them read-only (we can delete via IndexWriter already), then I think
adding those warning comments to reopen(), as you suggest Hoss, would be
sufficient.
{quote}

I think that's a fine direction. Note however that IndexWriter implements delete by calling IndexReader.delete(). That method could be made package-private, so that users cannot call it, but then this makes it impossible for someone to subclass IndexReader from a different package. So perhaps delete() needs to move to a subclass of IndexReader? That gets messy...

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 16, 2007, 2:25 PM

Post #20 of 71 (1257 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535350 ]

Michael Busch commented on LUCENE-743:
--------------------------------------

{quote}
I think that's a fine direction. Note however that IndexWriter implements delete by calling IndexReader.delete(). That method could be made package-private, so that users cannot call it, but then this makes it impossible for someone to subclass IndexReader from a different package. So perhaps delete() needs to move to a subclass of IndexReader? That gets messy...
{quote}

Actually all of the lock/commit logic moved from IndexReader to DirectoryIndexReader already, and the delete logic is in SegmentReader, which subclasses DirectoryIndexReader. So we could remove the deleteDocument() API from IndexReader but leave it in DirectoryIndexReader. Then it would still be possible to extend IndexReader from a different package just as today, and IndexWriter could use DirectoryIndexReader for performing deletes. These changes should be trivial.

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 16, 2007, 2:34 PM

Post #21 of 71 (1254 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535352 ]

Doug Cutting commented on LUCENE-743:
-------------------------------------

Got it. IndexWriter only works with SegmentReaders anyway, not with an arbitrary IndexReader implementation: IndexReader is extensible, but IndexWriter is not. I'd (momentarily) forgotten that. Nevermind.

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 16, 2007, 2:40 PM

Post #22 of 71 (1251 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535354 ]

Yonik Seeley commented on LUCENE-743:
-------------------------------------

Having a read-only IndexReader would (should?) mean being able to remove "synchronized" from some things like isDeleted()... a nice performance win for multi-processor systems for things that didn't have access to the deleted-docs bitvec.

> If our goal is to make them read-only (we can delete via IndexWriter already)

But you can only delete-by-term.
It's more powerful to be able to delete by docid, however I manage to come up with it.
So I think deleteDocument(int id) should either be moved to a subclass. same with setNorms?

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 16, 2007, 3:29 PM

Post #23 of 71 (1246 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535362 ]

Michael Busch commented on LUCENE-743:
--------------------------------------

{quote}
So I think deleteDocument(int id) should either be moved to a subclass. same with setNorms?
{quote}

Or we could take the approach you suggested in http://www.gossamer-threads.com/lists/lucene/java-dev/52017.

That would mean to add a callback after flush to get a current IndexReader to get the docids and to use the IndexWriter then to perform deleteDocument(docId) or setNorm(). These methods could also take an IndexReader as argument, e. g. deleteDocument(IndexReader reader, int docId), which would throw an IOException if the passed in reader is stale (i. e. docids have changed since the reader was opened). Just as IndexReader does it today. Does this make sense or am I missing something?

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 17, 2007, 11:36 AM

Post #24 of 71 (1266 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535669 ]

Michael Busch commented on LUCENE-743:
--------------------------------------

I just opened LUCENE-1030 and would like to move discussions related to "read-only" IndexReaders to that issue.

As for reopen() I'd like to go with Hoss' suggestion for now and add warning comments to reopen() saying that using an re-opened IndexReader with closeOldReader==false for write operations will result in an undefined behavior. Any objections?

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Oct 17, 2007, 2:02 PM

Post #25 of 71 (1248 views)
Permalink
[jira] Commented: (LUCENE-743) IndexReader.reopen() [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535743 ]

Yonik Seeley commented on LUCENE-743:
-------------------------------------

{quote}
As for reopen() I'd like to go with Hoss' suggestion for now and add warning comments to reopen() saying that using an re-opened IndexReader with closeOldReader==false for write operations will result in an undefined behavior.
{quote}

How about just defining the behavior such that any pending changes are flushed. That would make it more useful because you could then reopen readers you used for deletes. An alternative would be a method to explicitly flush changes on a reader, giving one the ability to then reopen it, but I like the former better since it avoids adding another API call.

> IndexReader.reopen()
> --------------------
>
> Key: LUCENE-743
> URL: https://issues.apache.org/jira/browse/LUCENE-743
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Otis Gospodnetic
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 2.3
>
> Attachments: IndexReaderUtils.java, lucene-743-take2.patch, lucene-743.patch, lucene-743.patch, lucene-743.patch, MyMultiReader.java, MySegmentReader.java, varient-no-isCloneSupported.BROKEN.patch
>
>
> This is Robert Engels' implementation of IndexReader.reopen() functionality, as a set of 3 new classes (this was easier for him to implement, but should probably be folded into the core, if this looks good).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene

First page Previous page 1 2 3 Next page Last page  View All Lucene java-dev RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.