
Mailing List Archive: Lucene: Java-User

regarding FieldSelector

 

 



mnrz57 at gmail

Sep 12, 2007, 2:13 AM

Post #1 of 12 (3026 views)
Permalink
regarding FieldSelector

Hi all,

Can anyone explain what FieldSelector is, and what the usage or benefits of
this structure are? I read the javadocs, but I can't work out what purpose
it serves in Lucene.

Thanks in advance

--
Regards,
Mohammad
--------------------------
see my blog: http://brainable.blogspot.com/
another in Persian: http://fekre-motefavet.blogspot.com/


gsingers at apache

Sep 12, 2007, 3:49 AM

Post #2 of 12 (2931 views)
Permalink
Re: regarding FieldSelector [In reply to]

Hi Mohammad,

The typical use cases are:
1. You have several small fields used in a results display and one or
two large fields (e.g. the original document) and you don't want to
pay the cost of loading the large fields for results display because
most of them won't be chosen. When a result is chosen, the lazily
loaded field will be retrieved.

2. You only want to load certain fields, or the first field, or you
just want to know the size of a field.

Basically, it gives you control over how fields are loaded from disk
in Lucene.
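
The use cases above can be sketched in plain Java. This is a stand-in
model only, not Lucene's API: FakeStore, FieldValue, Decision and
displayPolicy are hypothetical names invented for illustration. It assumes
a selector that answers, per field name, whether to read the value now,
read it on first access, or skip it, and it counts simulated disk reads so
the effect of each choice is visible.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Stand-in model only: none of these types are Lucene classes.
class LazyFieldSketch {

    // Hypothetical per-field decisions: read now, read on first access, or skip.
    enum Decision { LOAD, LAZY_LOAD, NO_LOAD }

    // Pretend on-disk store that counts how many field values were actually read.
    static final class FakeStore {
        final Map<String, String> stored = new LinkedHashMap<>();
        int reads = 0;

        String read(String name) {
            reads++;
            return stored.get(name);
        }
    }

    // A field whose value is fetched from the store at most once.
    static final class FieldValue {
        private final FakeStore store;
        private final String name;
        private String value;
        private boolean loaded;

        FieldValue(FakeStore store, String name, boolean eager) {
            this.store = store;
            this.name = name;
            if (eager) {
                get();
            }
        }

        String get() {
            if (!loaded) {
                value = store.read(name);
                loaded = true;
            }
            return value;
        }
    }

    // Builds an in-memory document, asking the selector what to do with each field.
    static Map<String, FieldValue> loadDocument(FakeStore store,
                                                Function<String, Decision> selector) {
        Map<String, FieldValue> doc = new LinkedHashMap<>();
        for (String name : store.stored.keySet()) {
            Decision decision = selector.apply(name);
            if (decision == Decision.NO_LOAD) {
                continue; // never read, and absent from the resulting document
            }
            doc.put(name, new FieldValue(store, name, decision == Decision.LOAD));
        }
        return doc;
    }

    // Example policy: small display fields eagerly, the big body lazily, the rest not at all.
    static Decision displayPolicy(String name) {
        if (name.equals("id") || name.equals("title")) {
            return Decision.LOAD;
        }
        return name.equals("body") ? Decision.LAZY_LOAD : Decision.NO_LOAD;
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        store.stored.put("id", "42");
        store.stored.put("title", "Hello");
        store.stored.put("body", "a very large original document ...");
        store.stored.put("internal", "not needed for display");

        Map<String, FieldValue> doc = loadDocument(store, LazyFieldSketch::displayPolicy);
        System.out.println("reads after loading the hit: " + store.reads);
        doc.get("body").get(); // the user picked this result
        System.out.println("reads after opening the body: " + store.reads);
    }
}
```

In this model the large field costs nothing until someone asks for its
value, which is the behaviour described in use case 1.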

See my ApacheCon Europe presentation
http://cnlp.org/presentations/slides/AdvancedLuceneEU.pdf for a few slides
(towards the end) on FieldSelector.


--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


mnrz57 at gmail

Sep 12, 2007, 4:40 AM

Post #3 of 12 (2911 views)
Permalink
Re: regarding FieldSelector [In reply to]

Hi Grant,
Thank you very much for your presentation on advanced Lucene; it was very
useful to me.

As I understand it, we can mark some large fields to be loaded lazily. My
question is: when will such a field actually be loaded? It would make sense
for it to be loaded from the index when we call doc.get("field_name").
Am I right?

In my application, I've provided a result-set structure for navigating
between results and documents. It offers a get(String fieldname) method,
much like java.sql.ResultSet.getString(), and it also implements
HitCollector in order to collect my own ID rather than Lucene's document ID.
So I think I can set my ID field to always be loaded and the other fields to
be loaded lazily. Does this improve the search process?

Again, thank you very much indeed.






erickerickson at gmail

Sep 12, 2007, 6:53 AM

Post #4 of 12 (2911 views)
Permalink
Re: regarding FieldSelector [In reply to]

Well, it depends on what "improve the search process" means
in your context <G>..

But I had a case similar to yours that I wrote up in the Wiki where
my search times improved about 10X by using lazy loading. You
might want to read that entry here...

http://wiki.apache.org/lucene-java/FieldSelectorPerformance

Note the peculiar characteristics of my data set; I really suspect
that a 10x improvement in retrieval speed is atypical...

As for when lazily-loaded fields actually get loaded, I didn't really
have to explore it very fully, but a short experiment should do it
for you.....

Best
Erick



mnrz57 at gmail

Sep 13, 2007, 1:50 AM

Post #5 of 12 (2919 views)
Permalink
Re: regarding FieldSelector [In reply to]

Thanks.
From what I saw in the documentation, we can only use this great field
selector in the IndexReader.document() method. The problem is that I have a
Searcher in my result-set structure, and when the client calls
getString("a_field_name") I invoke
searcher.doc(current_doc_id).get("a_field_name") at that point; I have
already collected the result IDs. So in my case, I can't use FieldSelector.

Do I have to revise the way I retrieve documents in my code?








erickerickson at gmail

Sep 13, 2007, 7:01 AM

Post #6 of 12 (2908 views)
Permalink
Re: regarding FieldSelector [In reply to]

Do you have any evidence that you're having a performance issue? If
not, I'd just do the simple thing and ignore the rest. The performance
issues I found were because I was spinning through many, many
documents. If you're only worrying about one document at a time,
it may not be an issue.

If you *are* having performance issues, I'd *strongly* recommend
that you measure to find out where the problem is before trying
a solution. Otherwise you'll optimize code that isn't the problem.
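
A minimal sketch of that kind of measurement, using only System.nanoTime
from the JDK. TimingSketch, averageNanos and busyWork are hypothetical
names, and the two workloads in main are placeholders standing in for the
real search and the real document retrieval; the point is only to time
each stage separately before deciding which one to optimize.

```java
// Minimal timing harness; the workloads passed in are placeholders.
class TimingSketch {

    // Runs the task a few times untimed, then returns the mean nanoseconds per timed run.
    static long averageNanos(Runnable task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }

    // Stand-in for real work so the example runs on its own.
    static long busyWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Placeholder workloads; swap in the real search and the real document fetch.
        Runnable search = () -> busyWork(200_000);
        Runnable fetch = () -> busyWork(50_000);
        System.out.println("search ns/run: " + averageNanos(search, 5, 20));
        System.out.println("fetch  ns/run: " + averageNanos(fetch, 5, 20));
    }
}
```

The warm-up runs are there so that one-off costs such as class loading do
not dominate the numbers; a profiler would give a finer-grained picture.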

Best
Erick



mnrz57 at gmail

Sep 13, 2007, 7:14 AM

Post #7 of 12 (2916 views)
Permalink
Re: regarding FieldSelector [In reply to]

Well, actually, I have 5 index directories, and that number will increase in
the future. Each document has about 20 fields on average. Considering that
many users may connect to the system (we anticipate 500 users at this time),
I want to know whether this will cause a performance issue or not.

We provide a feature that lets users select which fields they want
displayed, so I know that only 5 or 6 fields are important to my users. I
don't know whether the approach I described in my last email,
searcher.doc(doc_id).get("field_name"), makes Lucene load all fields of the
document or only the named one. If all the fields are loaded, I think it's
better to make them lazy.

What do you suggest?

Thanks







erickerickson at gmail

Sep 13, 2007, 11:45 AM

Post #8 of 12 (2904 views)
Permalink
Re: regarding FieldSelector [In reply to]

I'm not entirely sure. So what I'd do if I were you is write a
little test program and step through it in the debugger and
see <G>....

But, if you're only allowing the user to fetch a single document
at a time, I don't think it matters enough to worry about. If, on the
other hand, you're allowing the user to display some combination
of, say, 5 fields for a *list* of documents, I'd make them all lazy
and then you can write a HitCollector to get the list "lazily".
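
One way to picture the list case is a plain-Java model. It is not Lucene
code: PageSketch, pageOf and fetchPage are hypothetical names, and the
loader passed to fetchPage stands in for whatever actually retrieves a
document's display fields. The sketch assumes the hit IDs have already
been collected, and loads rows only for the IDs on the page being shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Stand-in model only: the loader represents the real per-document retrieval.
class PageSketch {

    // Returns the ids that fall on the given zero-based page; empty past the end.
    static List<Integer> pageOf(List<Integer> hitIds, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        long start = (long) page * pageSize; // long avoids int overflow on large pages
        int from = (int) Math.min(start, hitIds.size());
        int to = (int) Math.min(start + pageSize, hitIds.size());
        return new ArrayList<>(hitIds.subList(from, to));
    }

    // Loads rows only for the ids on the requested page, via a caller-supplied loader.
    static <T> List<T> fetchPage(List<Integer> hitIds, int page, int pageSize,
                                 IntFunction<T> loader) {
        List<T> rows = new ArrayList<>();
        for (int id : pageOf(hitIds, page, pageSize)) {
            rows.add(loader.apply(id));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            hits.add(i);
        }
        // Only the five ids on the last page are passed to the loader.
        List<String> rows = fetchPage(hits, 2, 10, id -> "row for doc " + id);
        System.out.println(rows);
    }
}
```

Collecting IDs is cheap compared with reading stored fields, so deferring
the field reads to the visible page keeps the per-request cost bounded by
the page size rather than by the total number of hits.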

Best
Erick



mnrz57 at gmail

Sep 14, 2007, 12:48 AM

Post #9 of 12 (2912 views)
Re: regarding FieldSelector [In reply to]

Actually, I show the results with pagination support, and users have the
option to choose the number of records per page. And yes, I should write a
test program. As for the HitCollector, I have already created one; it
collects all of Lucene's document IDs along with my own ID, which is stored
in the index.

>> you can write a HitCollector to get the list "lazily".

Do you mean that by writing a HitCollector, the whole list will be loaded
lazily and there is no need to use FieldSelector?


Thanks
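
For readers following along, here is a minimal, Lucene-free sketch of the pattern being discussed: collect only the cheap document IDs during the search, then read stored field values for the one page that is actually displayed. The class PagedResults, its methods, and the fieldLoader callback are hypothetical stand-ins for illustration; in a real application the callback would be whatever stored-field lookup your Lucene version offers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical sketch (not a Lucene class): IDs first, field values per page. */
final class PagedResults {
    private final List<Integer> docIds = new ArrayList<>(); // filled while searching
    private final IntFunction<String> fieldLoader;           // stand-in for the costly stored-field read
    private int loads = 0;                                   // counts how many reads actually happened

    PagedResults(IntFunction<String> fieldLoader) {
        this.fieldLoader = fieldLoader;
    }

    /** Called once per hit, like a collector callback; stores only the ID. */
    void collect(int docId) {
        docIds.add(docId);
    }

    int size() {
        return docIds.size();
    }

    int loadCount() {
        return loads;
    }

    /** Reads field values only for the requested page (pageIndex is 0-based). */
    List<String> page(int pageIndex, int pageSize) {
        int from = Math.min(Math.max(0, pageIndex * pageSize), docIds.size());
        int to = Math.min(from + Math.max(0, pageSize), docIds.size());
        List<String> values = new ArrayList<>();
        for (int i = from; i < to; i++) {
            loads++;
            values.add(fieldLoader.apply(docIds.get(i)));
        }
        return values;
    }

    public static void main(String[] args) {
        PagedResults results = new PagedResults(id -> "title-" + id);
        for (int id = 0; id < 25; id++) {
            results.collect(id);
        }
        // Showing the third page of 10 touches 5 of the 25 collected hits.
        System.out.println(results.page(2, 10));
        System.out.println("field reads: " + results.loadCount());
    }
}
```

Collecting IDs and choosing which stored fields to read are separate concerns, so a collector on its own does not make field access lazy; it only decides which documents you will later ask for.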

On 9/13/07, Erick Erickson <erickerickson [at] gmail> wrote:
>
> I'm not entirely sure. So what I'd do if I were you is write a
> little test program and step through it in the debugger and
> see <G>....
>
> But, if you're only allowing the user to fetch a single document
> at a time, I don't think it matters enough to worry about. If, on the
> other hand, you're allowing the user to display some combination
> of, say, 5 fields for a *list* of documents, I'd make them all lazy
> and then you can write a HitCollector to get the list "lazily".
>
> Best
> Erick
>
>
--
Regards,
Mohammad
--------------------------
see my blog: http://brainable.blogspot.com/
another in Persian: http://fekre-motefavet.blogspot.com/


grant.ingersoll at gmail

Sep 14, 2007, 6:39 AM

Post #10 of 12 (2903 views)
Re: regarding FieldSelector [In reply to]

Searcher is a Searchable, and Searchable defines the doc() method that
takes a FieldSelector. I suppose we could add an abstract declaration of
it to Searcher, since it has to be implemented by all derived classes
anyway because it is on the Searchable interface.

So, you can either cast to a known Searcher or I suppose you can
figure out a way to get the IndexReader. What kind of Searcher are
you using?

-Grant
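
A small, Lucene-free sketch of the general Java rule behind this point: a method declared on an interface is already a member of any abstract class that implements it, so it can be called through a variable of the abstract type without a cast. The names DocSource, BaseSearcher, and SimpleSearcher are hypothetical stand-ins, not Lucene classes, and the String selector is a placeholder for a real selector type. Whether the overload exists at all still depends on the Lucene release on your classpath.

```java
/** Hypothetical stand-in for an interface that declares both overloads. */
interface DocSource {
    String doc(int id);                   // plain lookup
    String doc(int id, String selector);  // overload that also takes a selector
}

/** Abstract base that implements the interface but redeclares nothing. */
abstract class BaseSearcher implements DocSource {
    // doc(int, String) is inherited from DocSource as an abstract member,
    // so every concrete subclass has to implement it.
}

final class SimpleSearcher extends BaseSearcher {
    @Override
    public String doc(int id) {
        return "doc" + id + "[all fields]";
    }

    @Override
    public String doc(int id, String selector) {
        return "doc" + id + "[" + selector + "]";
    }
}

final class InheritedMethodDemo {
    public static void main(String[] args) {
        BaseSearcher searcher = new SimpleSearcher();
        // Compiles without a cast: the interface method is visible on BaseSearcher.
        System.out.println(searcher.doc(7, "title")); // prints doc7[title]
    }
}
```

If the call does not compile against a variable of the abstract type, the more likely cause is that the interface in the version you are building against does not declare that overload.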

On Sep 13, 2007, at 4:50 AM, Mohammad Norouzi wrote:

> Thanks.
> As far as I can see from the documentation, this great field selector
> can only be used in the IndexReader.document() method. The problem is
> that I have a Searcher in my result set structure: when the client
> calls getString("a_field_name"), I invoke
> searcher.doc(current_doc_id).get("a_field_name"), having already
> collected the result IDs. So in my case, I can't use FieldSelector.
>
> Do I have to revise the way of retrieving documents in my code?

------------------------------------------------------
Grant Ingersoll
http://www.grantingersoll.com/
http://lucene.grantingersoll.com
http://www.paperoftheweek.com/





mnrz57 at gmail

Sep 14, 2007, 9:53 PM

Post #11 of 12 (2899 views)
Re: regarding FieldSelector [In reply to]

Well, I can't see any doc() method that takes a FieldSelector argument.
Perhaps it is only provided in the nightly builds of Lucene; I am currently
using Lucene v2.1.0. My variable is declared as
org.apache.lucene.search.Searcher and instantiated with
new IndexSearcher(a_directory).


On 9/14/07, Grant Ingersoll <grant.ingersoll [at] gmail> wrote:
>
> Searcher is a Searchable and Searchable defines the doc() method with
> FieldSelector, but I suppose we could add an abstract declaration of
> it to Searcher, since it has to be implemented on all derived
> classes anyway due to it being on the Searchable interface.
>
> So, you can either cast to a known Searcher or I suppose you can
> figure out a way to get the IndexReader. What kind of Searcher are
> you using?
>
> -Grant


--
Regards,
Mohammad
--------------------------
see my blog: http://brainable.blogspot.com/
another in Persian: http://fekre-motefavet.blogspot.com/


hossman_lucene at fucit

Sep 14, 2007, 10:27 PM

Post #12 of 12 (2907 views)
Re: regarding FieldSelector [In reply to]

: well, I can't see any doc() method with FieldSelector argument, perhaps this
: is provided in nightly builds of Lucene, currently I am using Lucene v2.1.0

2.2 was released in June; in it, the Searchable interface defines a doc
method which takes a FieldSelector:

http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/search/Searchable.html#doc(int,%20org.apache.lucene.document.FieldSelector)


-Hoss


