Mailing List Archive: Lucene: Java-Dev

Class in Lucene that Perform Search

 

 



blazingwolf7 at gmail

Jul 2, 2008, 7:30 PM

Post #1 of 5 (289 views)
Class in Lucene that Perform Search

Hi,

I am currently using Lucene to build a search engine, and I am going through
its source code to understand it better. I have traced it from beginning to
end and managed to locate all the classes that calculate the score and
return the results.

What I am missing is the class that performs the actual comparison to
determine whether a query matches any term in a document. I have also failed
to locate the class that is responsible for retrieving the documents that
contain the specified term. Can anyone help me with this? Even just naming
the related classes would help. Thanks
--
View this message in context: http://www.nabble.com/Class-in-Lucene-that-Perform-Search-tp18250664p18250664.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe[at]lucene.apache.org
For additional commands, e-mail: java-dev-help[at]lucene.apache.org


yonik at apache

Jul 2, 2008, 8:02 PM

Post #2 of 5 (279 views)
Permalink
Re: Class in Lucene that Perform Search [In reply to]

On Wed, Jul 2, 2008 at 10:30 PM, blazingwolf7 <blazingwolf7[at]gmail.com> wrote:
> What I am missing is the class that performs the actual comparison to
> determine whether a query matches any term in a document.

You need to understand the inverted index format. The documents that
match a term are determined at index time, not at query time. The .frq
file lists all documents that match each term.

TermDocs iterates over all documents that match the term by reading
the .frq file.

-Yonik
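
To illustrate the point above, here is a minimal sketch of an inverted index
in plain Java. It is a toy model only, not Lucene's implementation or its
.frq file format: the class name ToyInvertedIndex and its methods are
made up for illustration, and the tokenizer is a simple split on non-word
characters. It shows that the list of matching documents for each term is
built when documents are added, so a query is just a lookup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory inverted index. Hypothetical illustration only;
// this is not how Lucene stores or reads its postings.
class ToyInvertedIndex {
    // term -> ascending list of ids of the documents that contain the term
    private final Map<String, List<Integer>> postings = new TreeMap<>();
    private int nextDocId = 0;

    // Index time: tokenize the text and record the doc id under each term.
    int addDocument(String text) {
        int docId = nextDocId++;
        for (String term : text.toLowerCase().split("\\W+")) {
            if (term.isEmpty()) {
                continue; // split() can yield a leading empty token
            }
            List<Integer> docs =
                postings.computeIfAbsent(term, t -> new ArrayList<>());
            // Doc ids only grow, so checking the last entry avoids duplicates.
            if (docs.isEmpty() || docs.get(docs.size() - 1) != docId) {
                docs.add(docId);
            }
        }
        return docId;
    }

    // Query time: a plain lookup; no document text is compared here.
    List<Integer> docsFor(String term) {
        return postings.getOrDefault(term.toLowerCase(),
                                     Collections.emptyList());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("Lucene is a search library");                // doc 0
        index.addDocument("An inverted index maps terms to documents"); // doc 1
        index.addDocument("Lucene builds an inverted index");           // doc 2
        System.out.println(index.docsFor("lucene"));   // prints [0, 2]
        System.out.println(index.docsFor("inverted")); // prints [1, 2]
        System.out.println(index.docsFor("missing"));  // prints []
    }
}
```

In this sketch docsFor plays the role that iterating over a term's documents
plays in the description above: it walks a precomputed list rather than
testing each document against the query.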



blazingwolf7 at gmail

Jul 3, 2008, 1:03 AM

Post #3 of 5 (268 views)
Re: Class in Lucene that Perform Search [In reply to]

Ah, thanks! That is clear now. I will have to change tactics to achieve what
I need. Which class creates the .frq file at indexing time?

If possible, I want to add an extra value to it so that I can retrieve that
information during the search process. Thanks




yonik at apache

Jul 3, 2008, 9:05 AM

Post #4 of 5 (257 views)
Re: Class in Lucene that Perform Search [In reply to]

On Thu, Jul 3, 2008 at 4:03 AM, blazingwolf7 <blazingwolf7[at]gmail.com> wrote:
> Ah, thanks! That is clear now. I will have to change tactics to achieve what
> I need. Which class creates the .frq file at indexing time?

DocumentsWriter (called from IndexWriter).

> If possible, I want to add an extra value to it so that I can retrieve that
> information during the search process. Thanks

Look at payloads first.
What problem are you trying to solve? Someone may have an easier
approach for you if payloads don't work.

-Yonik





blazingwolf7 at gmail

Jul 3, 2008, 7:37 PM

Post #5 of 5 (250 views)
Re: Class in Lucene that Perform Search [In reply to]

I am trying to retrieve the contentLength and the URL of each document from
the index without repeatedly calling the IndexReader, e.g.:
reader.document(docId).get("ur");

I am trying to find a way to retrieve all these values and store them in an
array by using the IndexReader only once or twice. I thought that if I could
store some extra values in the .frq file, I would no longer need to keep
using the reader. Can anyone suggest another approach? Thanks
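
The idea of reading each value once and keeping it in an array can be
sketched as follows. This is only an illustration under stated assumptions:
StoredFieldSource is a hypothetical stand-in for whatever reader supplies
the stored fields, the field names "url" and "contentLength" are assumed,
and -1 is an arbitrary marker for a missing length. Lucene also ships a
FieldCache class aimed at a similar per-document lookup problem; the sketch
below does not use it.

```java
// Sketch: load two per-document values into arrays in a single pass,
// then answer lookups from memory. All names here are hypothetical.
class DocValueCache {
    // Stand-in for a real index reader's stored-field access.
    interface StoredFieldSource {
        int maxDoc();                        // number of documents
        String get(int docId, String field); // stored value, or null
    }

    final String[] urls;
    final int[] contentLengths;

    // One pass over the source; after this the source is not needed.
    DocValueCache(StoredFieldSource source) {
        int n = source.maxDoc();
        urls = new String[n];
        contentLengths = new int[n];
        for (int docId = 0; docId < n; docId++) {
            urls[docId] = source.get(docId, "url");
            String len = source.get(docId, "contentLength");
            // -1 marks a document with no stored length (assumption).
            contentLengths[docId] = (len == null) ? -1 : Integer.parseInt(len);
        }
    }

    // Small in-memory source for demonstration: each row is {url, length}.
    static StoredFieldSource inMemory(final String[][] rows) {
        return new StoredFieldSource() {
            public int maxDoc() {
                return rows.length;
            }
            public String get(int docId, String field) {
                if (field.equals("url")) return rows[docId][0];
                if (field.equals("contentLength")) return rows[docId][1];
                return null;
            }
        };
    }

    public static void main(String[] args) {
        DocValueCache cache = new DocValueCache(inMemory(new String[][] {
            {"http://a.example/", "1200"},
            {"http://b.example/", "87"},
        }));
        // Lookups by doc id are now plain array reads.
        System.out.println(cache.urls[1] + " " + cache.contentLengths[1]);
    }
}
```

The trade-off is memory: the arrays hold one entry per document, so this
suits values that are small and needed for most hits.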

