
Mailing List Archive: Lucene: Java-Dev

don't allow negatives in the positions file

 

 



rcmuir at gmail

Aug 11, 2012, 5:59 AM

don't allow negatives in the positions file

Hello, see the linked patch:

http://pastebin.com/7JAaJ3EN

Because of an ancient bug in Lucene 2.4.0, we still allow -1 as a
position. But this doesn't even work today (I created such an index,
and phrase queries etc. don't work because tons of Lucene code assumes
positions are >= 0).
Additionally, these won't be compressible with bulk compression
algorithms that assume non-negative integers.
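
To illustrate the compression point, here is a minimal sketch, assuming a simple bit-packing scheme that stores every value in a block using the width of the block's largest value. The class and method names are hypothetical and this is not Lucene's codec; it only shows why a single -1 is a problem for this family of algorithms.

```java
class BitsRequired {
    // Bits needed to store v when its 32 bits are read as an unsigned value.
    static int bitsRequired(int v) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(v));
    }

    // A block is packed at the width of its widest value.
    static int bitsPerValue(int[] block) {
        int bits = 1;
        for (int v : block) {
            bits = Math.max(bits, bitsRequired(v));
        }
        return bits;
    }

    public static void main(String[] args) {
        int[] clean = {0, 3, 1, 7, 2};
        int[] withNegative = {0, 3, -1, 7, 2};
        System.out.println(bitsPerValue(clean));        // prints 3
        System.out.println(bitsPerValue(withNegative)); // prints 32
    }
}
```

In two's complement -1 has all 32 bits set, so one such value forces the whole block to full width.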

So I think we should throw an exception in CheckIndex if someone has
these negative positions.
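
A rough sketch of the kind of check meant here, assuming the positions for one term in one document have already been decoded into an array. The class name, method signature, and message text are made up for illustration; this is not the actual CheckIndex code from the patch.

```java
class PositionCheck {
    // Hypothetical stand-in for the per-document check: reject any negative position.
    static void checkPositions(String term, int docId, int[] positions) {
        for (int pos : positions) {
            if (pos < 0) {
                throw new IllegalStateException(
                    "term " + term + ": doc " + docId + ": negative position " + pos);
            }
        }
    }

    public static void main(String[] args) {
        checkPositions("foo", 0, new int[] {0, 2, 5}); // passes silently
        try {
            checkPositions("foo", 1, new int[] {-1, 2});
        } catch (IllegalStateException e) {
            System.out.println("check failed: " + e.getMessage());
        }
    }
}
```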

Just in case someone has a 2.4.0 index they migrated all the way up to
4.0, the patch contains code in preflex's reader to correct the -1
delta to 0. This is no worse than today, in that phrase queries etc.
still won't work on these corrupt positions; however, the rest of the
index will continue to work fine.
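
The correction can be pictured roughly as below. This is a simplified sketch that assumes positions are stored as plain per-document deltas in an int array; the real reader works on the encoded file and also handles payloads, and the names here are hypothetical rather than taken from the patch.

```java
import java.util.Arrays;

class DeltaPositions {
    // Rebuild absolute positions from deltas, reading a legacy -1 delta as 0.
    static int[] decode(int[] deltas) {
        int[] positions = new int[deltas.length];
        int position = 0;
        for (int i = 0; i < deltas.length; i++) {
            int delta = deltas[i];
            if (delta == -1) {
                delta = 0; // assumed correction for the corrupt legacy case
            }
            position += delta;
            positions[i] = position;
        }
        return positions;
    }

    public static void main(String[] args) {
        // Without the correction these deltas would decode to -1, 0, 3.
        System.out.println(Arrays.toString(decode(new int[] {-1, 1, 3}))); // prints [0, 1, 4]
    }
}
```

In this sketch the decoded values are non-negative but are not the positions that were originally intended, which fits the point that phrase queries still won't match on the affected documents.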

--
lucidimagination.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene


lucene at mikemccandless

Aug 11, 2012, 6:04 AM

Re: don't allow negatives in the positions file

+1, patch looks great.

Mike McCandless

http://blog.mikemccandless.com



 
 

