
Mailing List Archive: Lucene: Java-User

NearSpansUnordered payloads

 

 



jason.rutherglen at gmail

Nov 20, 2009, 3:49 PM

Post #1 of 5 (659 views)
NearSpansUnordered payloads

I'm interested in getting the payload information from the
matching span, but it's unclear from the javadocs why
NearSpansUnordered differs from NearSpansOrdered in this
regard.

NearSpansUnordered returns payloads in a hash set that's
computed on each method call by iterating over the SpansCells as a
linked list, whereas NearSpansOrdered stores the payloads in a
list (which is ordered) only when collectPayloads is true.

At first glance I'm not sure how to correlate the payloads with
the span match using NearSpansUnordered, nor why the two differ.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


ctignor at thinkmap

Nov 24, 2009, 7:55 AM

Post #2 of 5 (572 views)
Re: NearSpansUnordered payloads

I am also having a hard time understanding the NearSpansUnordered
isPayloadAvailable() method.

For my test case, where 2 tokens are at the same position, the code below
seems to fail to traverse the 2 SpansCells. The first SpansCell it
retrieves has its next field set to null, so it cannot find the second one.
Is this normal behavior?

// TODO: Remove warning after API has been finalized
public boolean isPayloadAvailable() {
  SpansCell pointer = min();
  while (pointer != null) {
    if (pointer.isPayloadAvailable()) {
      return true;
    }
    pointer = pointer.next;
  }

  return false;
}

When the linked list of SpansCells is first created, the cells are linked
together normally, but their order is reversed when they are added to the
queue in listToQueue(), such that the last SpansCell, with its next field
set to null, is retrieved first.
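
A minimal sketch of the situation described above, for readers who want to reproduce it outside Lucene. It is not Lucene's code: Cell, toQueue, anyPayloadFromMin, and anyPayloadFromHead are hypothetical stand-ins for SpansCell, listToQueue(), and the traversal in isPayloadAvailable(), and the queue is assumed to order cells by start position only. It shows that following next from the queue's minimum can skip cells that precede it in the list, while walking from the list head visits every cell.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class CellTraversalSketch {
    // Hypothetical stand-in for SpansCell: a list node with a start position.
    static final class Cell {
        final String term;
        final int start;
        final boolean hasPayload;
        Cell next; // link to the following cell in creation order

        Cell(String term, int start, boolean hasPayload) {
            this.term = term;
            this.start = start;
            this.hasPayload = hasPayload;
        }
    }

    // Adds every cell of the list to a queue ordered by start position (assumed ordering).
    static PriorityQueue<Cell> toQueue(Cell first) {
        PriorityQueue<Cell> queue =
            new PriorityQueue<>(Comparator.comparingInt((Cell c) -> c.start));
        for (Cell c = first; c != null; c = c.next) {
            queue.add(c);
        }
        return queue;
    }

    // Mirrors the traversal quoted above: start at the queue minimum, follow next.
    static boolean anyPayloadFromMin(PriorityQueue<Cell> queue) {
        for (Cell c = queue.peek(); c != null; c = c.next) {
            if (c.hasPayload) {
                return true;
            }
        }
        return false;
    }

    // Alternative traversal: start at the list head so no cell is skipped.
    static boolean anyPayloadFromHead(Cell first) {
        for (Cell c = first; c != null; c = c.next) {
            if (c.hasPayload) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Cell a = new Cell("a", 5, true);   // list head, carries a payload
        Cell b = new Cell("b", 3, false);  // list tail, sorts first in the queue
        a.next = b;

        PriorityQueue<Cell> queue = toQueue(a);
        System.out.println(anyPayloadFromMin(queue)); // false: b.next is null, a is never seen
        System.out.println(anyPayloadFromHead(a));    // true
    }
}
```

In this sketch the tail cell sorts first because its start is smaller; with two tokens at the same position, which cell the queue returns first would depend on the tie-breaking rule, which is not modeled here.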




--
TH!NKMAP

Christopher Tignor | Senior Software Architect
155 Spring Street NY, NY 10012
p.212-285-8600 x385 f.212-285-8999


gsingers at apache

Nov 25, 2009, 5:14 AM

Post #3 of 5 (565 views)
Re: NearSpansUnordered payloads


I'll take a stab at this, but I am not 100% certain. I seem to recall that in the implementation (and then in subsequent fixes by Mark) we ultimately decided that, because of the way Unordered is implemented, it was too difficult to put the payloads in order, so we more or less punted and decided that it would perhaps be fine to deal with them in the aggregate anyway. Perhaps this needs to be revisited.


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search




markrmiller at gmail

Nov 25, 2009, 5:36 AM

Post #4 of 5 (571 views)
Re: NearSpansUnordered payloads

Right. We would have had to sort them, but not every case needed them
sorted, so it didn't make sense to always pay for that. We decided that
if a user needed the order, they could encode it in the payload and sort
the payloads themselves externally.
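
A minimal sketch of that workaround, under stated assumptions: each payload is assumed to begin with a 4-byte big-endian token position followed by the application data. That layout, and the helper names encode, position, data, and sortByPosition, are hypothetical and not part of Lucene. The unordered collection stands in for whatever the span's getPayload() call hands back.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;

public class PayloadOrderSketch {
    // Index time (assumed layout): prefix the data with the token position.
    static byte[] encode(int position, byte[] data) {
        return ByteBuffer.allocate(4 + data.length).putInt(position).put(data).array();
    }

    // Reads the 4-byte position prefix back out of a payload.
    static int position(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt();
    }

    // Returns the application data that follows the position prefix.
    static String data(byte[] payload) {
        return new String(payload, 4, payload.length - 4, StandardCharsets.UTF_8);
    }

    // Search time: copy the unordered payloads and sort them by encoded position.
    static List<byte[]> sortByPosition(Collection<byte[]> payloads) {
        List<byte[]> sorted = new ArrayList<>(payloads);
        sorted.sort(Comparator.comparingInt((byte[] p) -> position(p)));
        return sorted;
    }

    public static void main(String[] args) {
        // A hash set, as in the thread, gives no ordering guarantee.
        Collection<byte[]> unordered = new HashSet<>();
        unordered.add(encode(7, "second".getBytes(StandardCharsets.UTF_8)));
        unordered.add(encode(3, "first".getBytes(StandardCharsets.UTF_8)));

        for (byte[] payload : sortByPosition(unordered)) {
            System.out.println(position(payload) + " " + data(payload));
        }
        // prints "3 first" then "7 second"
    }
}
```

The cost is four extra bytes per payload at index time, in exchange for sorting only in the callers that actually need ordered payloads.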



jason.rutherglen at gmail

Nov 25, 2009, 11:29 AM

Post #5 of 5 (563 views)
Re: NearSpansUnordered payloads

I don't mind adding the "positions" of the payloads to them. However,
maybe we can be a little clearer in the javadocs about what's going on
underneath?

