
Mailing List Archive: Lucene: Java-Dev

[jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)

 

 



jira at apache

Aug 9, 2012, 10:10 AM

Post #76 of 94 (85 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431985#comment-13431985 ]

Michael McCandless commented on LUCENE-3892:
--------------------------------------------

bq. I had been doing some tests with the bulk version of PackedInts.get (which uses the same methods that we use for BlockPacked) while working on LUCENE-4098 and it seemed that the bottleneck was more memory bandwidth than CPU (for large arrays at least).

Ahh, interesting...

So I think we should test different acceptableOverheadRatios to find the best ... it could be it's 0!
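To make the trade-off concrete, here is a minimal, self-contained sketch of how an acceptable overhead ratio can turn into a bits-per-value choice. It is an illustration under stated assumptions, not Lucene's PackedInts code: the class name OverheadRatioSketch, the method chooseBitsPerValue, and the list of preferred widths (8, 16, 32, 64) are all hypothetical.

```java
// Illustrative sketch only; this is NOT the PackedInts implementation.
final class OverheadRatioSketch {

    /**
     * Picks the number of bits to spend per value.
     *
     * @param bitsRequired            minimum bits needed to represent every value (1..64)
     * @param acceptableOverheadRatio fraction of extra bits per value we accept wasting
     *                                (0 = never waste bits, 0.2 = up to 20% extra)
     */
    static int chooseBitsPerValue(int bitsRequired, float acceptableOverheadRatio) {
        if (bitsRequired < 1 || bitsRequired > 64) {
            throw new IllegalArgumentException("bitsRequired must be in [1, 64]");
        }
        // Budget: required bits plus the whole number of extra bits the ratio allows.
        int maxBits = bitsRequired + (int) (acceptableOverheadRatio * bitsRequired);
        // Assumption: aligned widths decode faster, so take one if it fits the budget.
        for (int aligned : new int[] {8, 16, 32, 64}) {
            if (bitsRequired <= aligned && aligned <= maxBits) {
                return aligned;
            }
        }
        // Otherwise stay fully compact.
        return bitsRequired;
    }
}
```

With a ratio of 0 the sketch never rounds up (7 bits stay 7 bits); with 0.2 it rounds 7 bits up to 8, spending more space in the hope of cheaper decoding. Whether that is actually faster is the empirical question raised above, since larger blocks also mean more memory traffic.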

> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-3892
> URL: https://issues.apache.org/jira/browse/LUCENE-3892
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Labels: gsoc2012, lucene-gsoc-12
> Fix For: 4.1
>
> Attachments: LUCENE-3892-BlockTermScorer.patch, LUCENE-3892-blockFor&hardcode(base).patch, LUCENE-3892-blockFor&packedecoder(comp).patch, LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints.patch, LUCENE-3892-blockpfor.patch, LUCENE-3892-bulkVInt.patch, LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch, LUCENE-3892-handle_open_files.patch, LUCENE-3892-non-specialized.patch, LUCENE-3892-pfor-compress-iterate-numbits.patch, LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch, LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch, LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch, LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene


jira at apache

Aug 9, 2012, 10:18 AM

Post #77 of 94 (85 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431991#comment-13431991 ]

Michael McCandless commented on LUCENE-3892:
--------------------------------------------

{quote}
bq. Do we really need to read/write the 32 format.getId(), numBits into the postings file header? I guess it's either that or ... store the float acceptableOverheadRatio (eg using Float.floatToIntBits I guess) and have some back-compat enforced in the logic in PackedInts.fastestFormatAndBits... hmm.

I hesitated between these two approaches but I think writing all cases to the header is less error-prone? Moreover it would allow us to change the logic of fastestFormatAndBits without having to bump the version number.
{quote}

Maybe for starters we should just hardwire acceptableOverheadRatio at
0 ... then we simplify this back-compat until/unless we really need to
make this configurable.
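For reference, the Float.floatToIntBits alternative mentioned in the quote amounts to the round trip below. This is a minimal sketch: the class RatioHeaderSketch and its two method names are hypothetical, and only Float.floatToIntBits and Float.intBitsToFloat are real JDK calls. How the resulting int would be written to the postings header is left out.

```java
// Illustrative sketch only; names are hypothetical, not taken from the branch.
final class RatioHeaderSketch {

    /** Turns the ratio into a single int that a header could store. */
    static int encodeRatio(float acceptableOverheadRatio) {
        return Float.floatToIntBits(acceptableOverheadRatio);
    }

    /** Recovers exactly the same float from the stored int. */
    static float decodeRatio(int storedBits) {
        return Float.intBitsToFloat(storedBits);
    }
}
```

The round trip is exact, so the header would hold one int instead of one (format id, numBits) pair per bit width. The cost, as noted in the quote, is that the mapping from ratio to format then has to stay backward compatible.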




jira at apache

Aug 9, 2012, 10:23 AM

Post #78 of 94 (86 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432001#comment-13432001 ]

Michael McCandless commented on LUCENE-3892:
--------------------------------------------

bq. The other problem is that we are also storing these unnecessary 19 values (but it is not easy to fix since PACKED_SINGLE_BLOCK writes values in the low-order long bits first (little endian)). Maybe we should make PACKED_SINGLE_BLOCK write values in the high-order bits first and split byte encoders and decoders from the long ones (so that they have a lower valueCount()).

OK, we can explore that later (another reason to simply always use Format.PACKED for now...).
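As a rough picture of the layout being discussed, the sketch below packs values into a single 64-bit block with the first value in the lowest-order bits. It is a simplified illustration, not the PACKED_SINGLE_BLOCK implementation: the class SingleBlockSketch and its methods are hypothetical, and it assumes non-negative int values and widths of at most 32 bits.

```java
// Illustrative sketch only; this is NOT Lucene's PACKED_SINGLE_BLOCK code.
final class SingleBlockSketch {

    /** How many values of the given width fit in one 64-bit block. */
    static int valuesPerBlock(int bitsPerValue) {
        checkWidth(bitsPerValue);
        return 64 / bitsPerValue;
    }

    /** Packs the values into one long, value 0 in the lowest-order bits. */
    static long pack(int[] values, int bitsPerValue) {
        if (values.length > valuesPerBlock(bitsPerValue)) {
            throw new IllegalArgumentException("too many values for one block");
        }
        long mask = (1L << bitsPerValue) - 1;
        long block = 0L;
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0 || (values[i] & mask) != values[i]) {
                throw new IllegalArgumentException("value does not fit: " + values[i]);
            }
            block |= ((long) values[i]) << (i * bitsPerValue);
        }
        return block;
    }

    /** Reads the value at the given index back out of the block. */
    static int get(long block, int bitsPerValue, int index) {
        checkWidth(bitsPerValue);
        long mask = (1L << bitsPerValue) - 1;
        return (int) ((block >>> (index * bitsPerValue)) & mask);
    }

    private static void checkWidth(int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 32) {
            throw new IllegalArgumentException("bitsPerValue must be in [1, 32]");
        }
    }
}
```

With 3 bits per value, 21 values fit in a block and 1 bit is left unused at the high-order end. Because the first values sit in the low-order bits, any unused slots of a partially filled block also end up at the high-order end of the long.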



jira at apache

Aug 9, 2012, 10:33 AM

Post #79 of 94 (85 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432009#comment-13432009 ]

Robert Muir commented on LUCENE-3892:
-------------------------------------

{quote}
OK indeed PFOR is slower for me too:
{quote}

I think for starters since you guys have gotten FOR pretty nice we should just focus on that one?

We could later see if PFOR could get additional wins as a second step: getting FOR working nice and fast
is awesome on its own!



jira at apache

Aug 9, 2012, 3:31 PM

Post #80 of 94 (85 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432235#comment-13432235 ]

Michael McCandless commented on LUCENE-3892:
--------------------------------------------

bq. I think for starters since you guys have gotten FOR pretty nice we should just focus on that one?

Yeah I think we should do that. I think the branch is nearly ready to land!

I just replaced Block with BlockPacked ...



jira at apache

Aug 10, 2012, 5:14 AM

Post #81 of 94 (84 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432716#comment-13432716 ]

Adrien Grand commented on LUCENE-3892:
--------------------------------------

I ran the comparison between acceptableOverheadRatio=PackedInts.COMPACT (0%) and PackedInts.DEFAULT (20%), and PackedInts.COMPACT seems to be faster: its mean QPS is higher on every task except PKLookup, by up to about 9% (IntNRQ):

{noformat}
base=COMPACT, challenger=DEFAULT
Task              QPS base  StdDev base  QPS def  StdDev def  Pct diff
IntNRQ               81.83         5.43    74.14        2.94  -18% - 0%
HighTerm            146.55        10.34   133.57        9.02  -20% - 4%
LowPhrase            93.91         1.63    86.90        1.67  -10% - -4%
MedTerm             824.58        43.48   766.35       38.78  -16% - 3%
LowSloppyPhrase      83.29         1.99    77.65        1.18  -10% - -3%
OrHighMed            94.15         5.28    88.34        4.54  -15% - 4%
OrHighHigh          100.63         5.42    94.57        4.20  -14% - 3%
OrHighLow           128.62         7.21   120.92        6.07  -15% - 4%
HighPhrase           13.05         0.45    12.29        0.39  -11% - 0%
Prefix3             217.06         6.82   205.05        4.62  -10% - 0%
MedPhrase            27.50         0.97    26.33        0.79  -10% - 2%
Wildcard            183.20         4.87   175.58        3.89  -8% - 0%
LowTerm            1763.31        43.24  1693.31       39.29  -8% - 0%
HighSloppyPhrase     10.05         0.48     9.67        0.40  -11% - 5%
AndHighHigh         111.59         1.15   107.45        1.66  -6% - -1%
LowSpanNear          56.16         1.32    54.25        1.01  -7% - 0%
AndHighMed          423.44         7.40   409.32        5.10  -6% - 0%
MedSpanNear          33.14         0.91    32.32        0.74  -7% - 2%
AndHighLow         2177.50        30.79  2134.05       28.64  -4% - 0%
Fuzzy1               95.34         2.41    93.66        2.32  -6% - 3%
HighSpanNear          5.28         0.17     5.21        0.11  -6% - 3%
MedSloppyPhrase      18.41         0.72    18.19        0.70  -8% - 6%
Fuzzy2               37.73         1.31    37.31        1.14  -7% - 5%
Respell             109.71         3.09   108.64        2.76  -6% - 4%
PKLookup            257.32         6.64   260.00        7.15  -4% - 6%
{noformat}
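A note on reading the Pct diff column: the ranges shown are consistent with a worst-case/best-case comparison that shifts each QPS by one standard deviation and truncates the result toward zero. That formula is inferred from the numbers in the table and is an assumption about the benchmark tool, not something stated in this thread. The class PctDiffSketch and the method pctDiffRange are hypothetical names.

```java
// Illustrative sketch only; the formula is inferred from the table, not from the tool.
final class PctDiffSketch {

    /**
     * Returns {worst, best} percentage difference of the challenger against the base,
     * truncated toward zero.
     *
     * worst: challenger one StdDev low  vs. base one StdDev high
     * best:  challenger one StdDev high vs. base one StdDev low
     */
    static int[] pctDiffRange(double qpsBase, double stdDevBase,
                              double qpsCmp, double stdDevCmp) {
        double baseHigh = qpsBase + stdDevBase;
        double baseLow = qpsBase - stdDevBase;
        double worst = ((qpsCmp - stdDevCmp) - baseHigh) / baseHigh;
        double best = ((qpsCmp + stdDevCmp) - baseLow) / baseLow;
        return new int[] {(int) (100.0 * worst), (int) (100.0 * best)};
    }
}
```

For the IntNRQ row, pctDiffRange(81.83, 5.43, 74.14, 2.94) gives -18 and 0, matching the table. Read this way, only rows whose range stays below zero at both ends (for example LowPhrase, -10% to -4%) show a difference larger than the run-to-run noise.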



jira at apache

Aug 13, 2012, 6:56 AM

Post #82 of 94 (84 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433143#comment-13433143 ]

Adrien Grand commented on LUCENE-3892:
--------------------------------------

bq. (From mailing-list) So I think if it's this ambiguous for wikipedia we should shoot for the most COMPACT form as a safe default.

+1 too. I just committed the change.



jira at apache

Aug 15, 2012, 5:51 AM

Post #83 of 94 (83 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435033#comment-13435033 ]

Michael McCandless commented on LUCENE-3892:
--------------------------------------------

Uwe just started builds for this branch (thanks!): http://jenkins.sd-datasolutions.de/job/pforcodec-3892-branch



jira at apache

Aug 20, 2012, 3:06 AM

Post #84 of 94 (68 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437770#comment-13437770 ]

Adrien Grand commented on LUCENE-3892:
--------------------------------------

From r1373332:
{quote}
- private static final int PACKED_INTS_VERSION = 0; // nocommit: encode in the stream?
+ private static final int PACKED_INTS_VERSION_START = 0;
+ private static final int PACKED_INTS_VERSION_CURRENT = PACKED_INTS_VERSION_START;
{quote}

Mike, is there any reason why you didn't use {{PackedInts.VERSION_START}} and {{PackedInts.VERSION_CURRENT}} instead?

> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-3892
> URL: https://issues.apache.org/jira/browse/LUCENE-3892
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Labels: gsoc2012, lucene-gsoc-12
> Fix For: 4.1
>
> Attachments: LUCENE-3892-blockFor&hardcode(base).patch, LUCENE-3892-blockFor&packedecoder(comp).patch, LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints.patch, LUCENE-3892-blockpfor.patch, LUCENE-3892-BlockTermScorer.patch, LUCENE-3892-bulkVInt.patch, LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892_for_byte[].patch, LUCENE-3892_for_int[].patch, LUCENE-3892-for&pfor-with-javadoc.patch, LUCENE-3892_for_unfold_method.patch, LUCENE-3892-handle_open_files.patch, LUCENE-3892-javadocs.patch, LUCENE-3892-non-specialized.patch, LUCENE-3892-pfor-compress-iterate-numbits.patch, LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch, LUCENE-3892_settings.patch, LUCENE-3892_settings.patch, LUCENE-3892-trunk.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.



jira at apache

Aug 20, 2012, 4:20 AM

Post #85 of 94 (67 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437789#comment-13437789 ]

Michael McCandless commented on LUCENE-3892:
--------------------------------------------

bq. Mike, is there any reason why you didn't use PackedInts.VERSION_START and PackedInts.VERSION_CURRENT instead?

Woops, no, I forgot we had version info in PackedInts! I'll switch it over.



jira at apache

Aug 20, 2012, 5:14 AM

Post #86 of 94 (66 views)

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437806#comment-13437806 ]

Michael McCandless commented on LUCENE-3892:
--------------------------------------------

OK I committed that, and also added version checking in getEncoder/Decoder, and I now loop over all versions when computing MAX_DATA_SIZE -- can you double check Adrien? Thanks!
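The two changes described here could look roughly like the sketch below. It is a hypothetical illustration, not the committed code: VersionSketch, checkVersion, encodedSize and maxDataSize are made-up names, the constants only mirror the VERSION_START/VERSION_CURRENT naming from the discussion, and the size formula (fully packed bits rounded up to whole bytes) is an assumption.

```java
// Illustrative sketch only; this is NOT the code committed on the branch.
final class VersionSketch {
    static final int VERSION_START = 0;
    static final int VERSION_CURRENT = VERSION_START;

    /** Rejects versions outside the supported range. */
    static void checkVersion(int version) {
        if (version < VERSION_START || version > VERSION_CURRENT) {
            throw new IllegalArgumentException(
                "unsupported version " + version
                    + ", expected " + VERSION_START + ".." + VERSION_CURRENT);
        }
    }

    /** Hypothetical: bytes needed to encode one block at the given bit width. */
    static int encodedSize(int version, int blockSize, int bitsPerValue) {
        checkVersion(version);
        return (blockSize * bitsPerValue + 7) / 8;
    }

    /** Largest encoded block over every supported version and every bit width. */
    static int maxDataSize(int blockSize) {
        int max = 0;
        for (int version = VERSION_START; version <= VERSION_CURRENT; version++) {
            for (int bitsPerValue = 1; bitsPerValue <= 32; bitsPerValue++) {
                max = Math.max(max, encodedSize(version, blockSize, bitsPerValue));
            }
        }
        return max;
    }
}
```

Looping over all versions means a buffer sized with maxDataSize stays large enough if a later version encodes some bit width less compactly, at the cost of a slightly longer static initialization.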



jira at apache

Aug 20, 2012, 5:32 AM

Post #87 of 94 (67 views)
Permalink
[jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.) [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437818#comment-13437818 ]

Adrien Grand commented on LUCENE-3892:
--------------------------------------

You beat me to it, I was just making exactly the same change! :-) The diff looks good to me.



jira at apache

Aug 20, 2012, 5:40 AM

Post #88 of 94 (66 views)
Permalink
[jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.) [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437825#comment-13437825 ]

Michael McCandless commented on LUCENE-3892:
--------------------------------------------

Woops sorry!



jira at apache

Aug 20, 2012, 5:58 AM

Post #89 of 94 (66 views)
Permalink
[jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.) [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437834#comment-13437834 ]

Robert Muir commented on LUCENE-3892:
-------------------------------------

{quote}
The branch builds look stable ... I think this is ready to land on trunk!

I think we should leave Lucene40 as the default PF for now, until BlockPF bakes on trunk for a while, but at some point (maybe for 4.1?) I think we should cut over to BlockPF as the default.
{quote}

+1



jira at apache

Aug 20, 2012, 7:19 AM

Post #90 of 94 (67 views)
Permalink
[jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.) [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437889#comment-13437889 ]

Adrien Grand commented on LUCENE-3892:
--------------------------------------

bq. Woops sorry!

NP, good to know we had planned the same changes.

bq. The branch builds look stable ... I think this is ready to land on trunk!

+1 too



jira at apache

Aug 21, 2012, 5:36 AM

Post #91 of 94 (67 views)
Permalink
[jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.) [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438653#comment-13438653 ]

Han Jiang commented on LUCENE-3892:
-----------------------------------

Thank you Mike! And thanks to all of you! I learned a great deal this summer!

> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-3892
> URL: https://issues.apache.org/jira/browse/LUCENE-3892
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Labels: gsoc2012, lucene-gsoc-12
> Fix For: 5.0, 4.0
>
> Attachments: LUCENE-3892-blockFor&hardcode(base).patch, LUCENE-3892-blockFor&packedecoder(comp).patch, LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints.patch, LUCENE-3892-blockpfor.patch, LUCENE-3892-BlockTermScorer.patch, LUCENE-3892-bulkVInt.patch, LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892_for_byte[].patch, LUCENE-3892_for_int[].patch, LUCENE-3892-for&pfor-with-javadoc.patch, LUCENE-3892_for_unfold_method.patch, LUCENE-3892-handle_open_files.patch, LUCENE-3892-javadocs.patch, LUCENE-3892-non-specialized.patch, LUCENE-3892-pfor-compress-iterate-numbits.patch, LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch, LUCENE-3892_settings.patch, LUCENE-3892_settings.patch, LUCENE-3892-trunk.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene


jira at apache

Aug 21, 2012, 6:43 AM

Post #92 of 94 (68 views)
Permalink
[jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.) [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438728#comment-13438728 ]

Robert Muir commented on LUCENE-3892:
-------------------------------------

Thanks Billy for all the hard work and endless benchmarking, so nice to have a block codec that is simple and clean and reuses our packed ints optimizations.




jira at apache

Aug 21, 2012, 6:45 AM

Post #93 of 94 (67 views)
Permalink
[jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.) [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438731#comment-13438731 ]

Adrien Grand commented on LUCENE-3892:
--------------------------------------

bq. Thanks Billy for all the hard work and endless benchmarking, so nice to have a block codec that is simple and clean and reuses our packed ints optimizations.

+1



jira at apache

Aug 24, 2012, 9:53 AM

Post #94 of 94 (65 views)
Permalink
[jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.) [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-3892 ]

Uwe Schindler commented on LUCENE-3892:
---------------------------------------

We should keep the size of methods small, as bigger methods work against HotSpot's code cache and, if Lucene is not used alone, may get de-optimized.
