
serera at gmail
Aug 11, 2012, 11:29 PM
Post #7 of 8
(140 views)
Permalink
|
|
Re: revisit payloads API in DocsAndPositionsEnum
[In reply to]
|
|
I agree Shai On Sun, Aug 12, 2012 at 9:18 AM, Robert Muir <rcmuir [at] gmail> wrote: > That would definitely be separate. I looked into this: the problem is > things like LUCENE-4219. > > So the current behavior for payload-using span queries (at least > span-near with payloads) is wrong, they score differently depending > upon whether you next() or advance() them (which is horrible!), so I > completely don't trust any of the existing tests. > > I think we need to either first fix the bugs in spans or do something > else with them before trying to simplify their APIs. > > On Sun, Aug 12, 2012 at 2:10 AM, Shai Erera <serera [at] gmail> wrote: > > Looks good. > > > > Perhaps separately, what do you think about doing the same to > > Spans.isPayloadAvailable/getPayload? > > > > Shai > > > > > > On Sun, Aug 12, 2012 at 8:56 AM, Robert Muir <rcmuir [at] gmail> wrote: > >> > >> Here's a patch: http://pastebin.com/d2DdWxJp > >> > >> On Sat, Aug 11, 2012 at 1:35 PM, Simon Willnauer > >> <simon.willnauer [at] gmail> wrote: > >> > +1 this makes lots of sense > >> > > >> > simon > >> > > >> > On Sat, Aug 11, 2012 at 7:28 PM, Michael McCandless > >> > <lucene [at] mikemccandless> wrote: > >> >> +1 > >> >> > >> >> Mike McCandless > >> >> > >> >> http://blog.mikemccandless.com > >> >> > >> >> > >> >> On Sat, Aug 11, 2012 at 10:08 AM, Robert Muir <rcmuir [at] gmail> > wrote: > >> >>> The payloads api is really confusing: > >> >>> > >> >>> /** Returns the payload at this position, or null if no > >> >>> * payload was indexed. Only call this once per > >> >>> * position. You should not modify anything (neither > >> >>> * members of the returned BytesRef nor bytes in the > >> >>> * byte[]). */ > >> >>> public abstract BytesRef getPayload() throws IOException; > >> >>> > >> >>> public abstract boolean hasPayload(); > >> >>> > >> >>> 1. is it ok for the consumer to call getPayload() [checking for > null], > >> >>> and never call hasPayload? It seems so, so why do we have > hasPayload? > >> >>> can we remove it? > >> >>> The current situation requires impls to handle 'hasPayload' > logic > >> >>> twice: in hasPayload itself and also in getPayload. > >> >>> > >> >>> 2. You should not modify anything (neither members of the returned > >> >>> BytesRef nor bytes in the byte[]). Then why do we have this in > >> >>> TestPayloads.java: > >> >>> // Just to ensure all codecs can > >> >>> // handle a caller that mucks with the > >> >>> // returned payload: > >> >>> if (rarely()) { > >> >>> br.bytes = new byte[random().nextInt(5)]; > >> >>> } > >> >>> br.length = 0; > >> >>> br.offset = 0; > >> >>> > >> >>> Testing for this totally disagrees with the javadocs. > >> >>> > >> >>> 3. 'Only call this once per position'. This is totally different > than > >> >>> any of our other 'attributes' on the enums (freq(), startOffset(), > >> >>> endOffset(), etc). I think we should > >> >>> remove this. > >> >>> > >> >>> So I want to propose we remove hasPayload(), remove 'only call once > >> >>> per position', and remove this test in TestPayloads.java. If we want > >> >>> to make sure none > >> >>> of lucene's code itself is mucking with the returned BytesRef, we > can > >> >>> add such a check in AssertingCodec. > >> >>> > >> >>> /** Returns the payload at this position, or null if no > >> >>> * payload was indexed. You should not modify anything (neither > >> >>> * members of the returned BytesRef nor bytes in the > >> >>> * byte[]). */ > >> >>> public abstract BytesRef getPayload() throws IOException; > >> >>> > >> >>> -- > >> >>> lucidimagination.com > >> >>> > >> >>> > --------------------------------------------------------------------- > >> >>> To unsubscribe, e-mail: dev-unsubscribe [at] lucene > >> >>> For additional commands, e-mail: dev-help [at] lucene > >> >>> > >> >> > >> >> --------------------------------------------------------------------- > >> >> To unsubscribe, e-mail: dev-unsubscribe [at] lucene > >> >> For additional commands, e-mail: dev-help [at] lucene > >> >> > >> > > >> > --------------------------------------------------------------------- > >> > To unsubscribe, e-mail: dev-unsubscribe [at] lucene > >> > For additional commands, e-mail: dev-help [at] lucene > >> > > >> > >> > >> > >> -- > >> lucidworks.com > >> > >> --------------------------------------------------------------------- > >> To unsubscribe, e-mail: dev-unsubscribe [at] lucene > >> For additional commands, e-mail: dev-help [at] lucene > >> > > > > > > -- > lucidworks.com > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscribe [at] lucene > For additional commands, e-mail: dev-help [at] lucene > >
|