
Mailing List Archive: Lucene: Java-User

IndexReader.deleteDocument in Lucene 3.6

 

 



nikolazius at gmail

May 25, 2012, 2:23 AM

Post #1 of 3
IndexReader.deleteDocument in Lucene 3.6

Hi everyone. We use IndexReader.deleteDocuments(Term) to delete
documents because it returns the number of documents deleted, and we
must know for certain whether any documents were actually removed. In
Lucene 3.6 this method is final, so we can no longer override it in
our codebase. IndexWriter.deleteDocuments(..) is not final and could
be used in our project, but it returns no value, so we can't tell
whether any documents were deleted. In short:
IndexReader.deleteDocuments(Term) is final but returns the number of
deletions performed, while IndexWriter.deleteDocuments(..) is not
final but returns nothing. Our functionality requires both overriding
and a return value.

Can anyone please suggest how to solve this issue? We could simply run
a term query beforehand, but that seems very inefficient.

--
Best regards, Nikolay Zamosenchuk

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


yonik at lucidimagination

May 25, 2012, 9:40 AM

Post #2 of 3
Re: IndexReader.deleteDocument in Lucene 3.6

On Fri, May 25, 2012 at 5:23 AM, Nikolay Zamosenchuk
<nikolazius [at] gmail> wrote:
> IndexWriter.deleteDocument(..) is not final,
> but doesn't return any result.

Deleted terms are buffered for good performance, so at the time of the
IndexWriter.deleteDocuments(Term) call we don't know how many
documents match the term.

> Can anyone please suggest how to solve this issue? Can simply run term
> query before, but it seems to be absolutely inefficient.

You could switch to an asynchronous design and use a custom query that
keeps track of how many (or which) documents it matched.

-Yonik
http://lucidimagination.com
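
The buffering point above can be illustrated with a toy in-memory model.
This is only a sketch and not Lucene code: the class BufferedDeleteModel,
its methods, and the onApplied callback are all hypothetical names made
up for this example. It shows why a buffered delete cannot return a count
at call time, and how a count can instead be reported asynchronously once
the buffered deletes are applied.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Toy model (NOT Lucene): deletes are buffered and counted only when applied. */
final class BufferedDeleteModel {
    private final Map<Integer, String> liveDocs = new HashMap<>(); // docId -> key
    private final List<String> pendingDeletes = new ArrayList<>();
    private final BiConsumer<String, Integer> onApplied;           // (key, deletedCount)
    private int nextDocId = 0;

    BufferedDeleteModel(BiConsumer<String, Integer> onApplied) {
        this.onApplied = onApplied;
    }

    void addDocument(String key) {
        liveDocs.put(nextDocId++, key);
    }

    /** Only records the request; nothing is matched yet, so no count exists. */
    void deleteDocuments(String key) {
        pendingDeletes.add(key);
    }

    /** Applies the buffered deletes and reports how many documents each removed. */
    void flush() {
        for (String key : pendingDeletes) {
            int before = liveDocs.size();
            liveDocs.values().removeIf(key::equals);
            onApplied.accept(key, before - liveDocs.size());
        }
        pendingDeletes.clear();
    }

    int liveCount() {
        return liveDocs.size();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        BufferedDeleteModel index = new BufferedDeleteModel(counts::put);
        index.addDocument("id:1");
        index.addDocument("id:1");
        index.addDocument("id:2");
        index.deleteDocuments("id:1");   // returns immediately, count unknown
        index.deleteDocuments("id:404"); // will match nothing
        index.flush();                   // counts are delivered here
        System.out.println(counts + ", live=" + index.liveCount());
    }
}
```

In this model the callback receives 2 for "id:1" and 0 for "id:404", but
only after flush(). A caller that must know whether anything was deleted
would have to wait for that notification rather than for a return value.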






erouse at comsquared

May 25, 2012, 10:07 AM

Post #3 of 3
RE: IndexReader.deleteDocument in Lucene 3.6

To ensure deletion I use a while loop with a counter (to prevent an endless
loop if there's a problem):

Term term = this.createIdTerm(id);
int count = 0;
boolean failed = false;
// Retry until the document is no longer visible to a reader.
while (readDocument(indexName, id) != null)
{
    count++;
    log.debug("deleting document " + id + " from index " + indexName);
    writer.deleteDocuments(term);
    writer.commit();
    // Give up after more than 10 attempts to avoid an endless loop.
    if (count > 10)
    {
        failed = true;
        break;
    }
}
if (failed) throw new DeleteFailedException("Failed to delete document " + id
    + " from index " + indexName);

And readDocument does this:

IndexReader reader = this.getReader(indexName);
Document doc = null;
TermDocs td = reader.termDocs(this.createIdTerm(id));
try
{
    // termDocs skips deleted documents, so a hit means the document is still live.
    if (td.next())
    {
        doc = reader.document(td.doc());
    }
}
finally
{
    td.close();
    this.returnReader(reader);
}
return doc;

Because IndexReader.termDocs doesn't return deleted documents,
readDocument returns null once the deletion has succeeded.

I'm not even sure if I need to make the writer.commit() call, but the load
for us is small enough that performance isn't an issue. If performance does
become an issue I might need to tweak this a bit, but it does ensure that a
deletion is successful or it throws an exception.
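
The delete-then-verify loop described above can be factored into a small
helper that is independent of the index. The following is a minimal
sketch using only the Java standard library; VerifiedDelete and
deleteUntilGone are hypothetical names, and the two callbacks stand in
for the writer.deleteDocuments/commit step and the readDocument check.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: repeat a delete until a check confirms it, up to a limit. */
final class VerifiedDelete {

    /**
     * Calls deleteAction while stillPresent reports the document as visible.
     * Returns the number of delete attempts made (0 if it was already absent).
     * Throws IllegalStateException if it is still visible after maxAttempts.
     */
    static int deleteUntilGone(Runnable deleteAction,
                               BooleanSupplier stillPresent,
                               int maxAttempts) {
        int attempts = 0;
        while (stillPresent.getAsBoolean()) {
            if (attempts >= maxAttempts) {
                throw new IllegalStateException(
                        "still present after " + attempts + " delete attempts");
            }
            deleteAction.run();
            attempts++;
        }
        return attempts;
    }

    public static void main(String[] args) {
        // A plain set stands in for the index in this demonstration.
        Set<String> index = new HashSet<>(Arrays.asList("doc-1", "doc-2"));
        int attempts = deleteUntilGone(
                () -> index.remove("doc-1"),
                () -> index.contains("doc-1"),
                10);
        System.out.println("attempts=" + attempts + ", remaining=" + index);
    }
}
```

Returning the attempt count also gives the caller a simple signal: a
result of 0 means the document was not there to begin with, which is the
case the original question wanted to detect.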

