
Mailing List Archive: Lucene: Java-User

caching fields for query performance

 

 



Robert.Stewart at INFONGEN

Jul 25, 2008, 4:36 AM

caching fields for query performance

If I have a frequently queried field that has a single value per document (such as language), how can I pre-cache all of its values so that the underlying query processing always uses an in-memory cache (never disk I/O) for that field? I don't know whether this is possible without custom query processing, which may be difficult in my case.

We have incoming ad-hoc Boolean queries, for instance:

+topic:m&a +topic:earn +company:MSFT +language:(ENG FRA RUS)


We have translated content per language, so the language clause frequently contains up to 30 language field values. The performance difference between

+topic:m&a +topic:earn +company:MSFT +language:(ENG FRA RUS)

and

+topic:m&a +topic:earn +company:MSFT

is very significant (mostly because the language clause is an OR clause).

So is there some way to optimize query processing on a per-field basis, such that language clauses are processed more efficiently?

Thanks
Bob


yonik at apache

Jul 25, 2008, 7:46 AM

Re: caching fields for query performance

Yes, pull out language:ENG language:FRA language:RUS into filters,
cache them, and re-use them across all queries.

In Lucene, see the CachingWrapperFilter class.
In Solr, use a separate fq (filter query) parameter, for example:
q=+topic:m&a +topic:earn +company:MSFT&fq=language:ENG

-Yonik
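
To illustrate the idea behind the reply above (build one filter per language value once, keep it in memory, and intersect it with each query's matches), here is a minimal sketch in plain Java. It is a toy in-memory model, not Lucene or Solr code: the class name LanguageFilterCache, its methods, and the sample data are all hypothetical, and a real filter cache such as Lucene's CachingWrapperFilter also ties cached bits to a specific index reader, which this sketch ignores.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model: one cached bit set per language value.
public class LanguageFilterCache {
    // docLanguages[docId] holds the single language value of that document.
    private final String[] docLanguages;
    // language code -> bit set of the document ids that carry it
    private final Map<String, BitSet> cache = new HashMap<>();
    // counts how many bit sets were actually built (for demonstration only)
    private int builds = 0;

    public LanguageFilterCache(String[] docLanguages) {
        this.docLanguages = docLanguages;
    }

    // Returns the cached bit set for a language, scanning the
    // documents only the first time that language is requested.
    public BitSet filterFor(String language) {
        return cache.computeIfAbsent(language, lang -> {
            builds++;
            BitSet bits = new BitSet(docLanguages.length);
            for (int doc = 0; doc < docLanguages.length; doc++) {
                if (lang.equals(docLanguages[doc])) {
                    bits.set(doc);
                }
            }
            return bits;
        });
    }

    // ORs the cached per-language sets together, then ANDs the union
    // with the matches of the main query (topic, company, ...).
    public BitSet apply(BitSet queryMatches, List<String> languages) {
        BitSet allowed = new BitSet(docLanguages.length);
        for (String lang : languages) {
            allowed.or(filterFor(lang));
        }
        BitSet result = (BitSet) queryMatches.clone();
        result.and(allowed);
        return result;
    }

    public int buildCount() {
        return builds;
    }

    public static void main(String[] args) {
        // Made-up sample data: six documents, one language each.
        String[] langs = {"ENG", "FRA", "RUS", "DEU", "ENG", "FRA"};
        LanguageFilterCache filters = new LanguageFilterCache(langs);

        // Pretend the main query matched documents 0, 3, 4 and 5.
        BitSet matches = new BitSet();
        matches.set(0);
        matches.set(3);
        matches.set(4);
        matches.set(5);

        BitSet hits = filters.apply(matches, Arrays.asList("ENG", "FRA", "RUS"));
        System.out.println(hits);                 // prints {0, 4, 5}

        // A later query reuses the cached sets instead of rebuilding them.
        filters.apply(matches, Arrays.asList("ENG", "FRA"));
        System.out.println(filters.buildCount()); // prints 3
    }
}
```

The point of the design is that the cost of a 30-value language clause becomes a handful of in-memory bit set operations per query, while the per-language sets are computed once and shared across all queries.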
