
Mailing List Archive: Lucene: Java-User

Why Lucene takes longer time for the first query and less for subsequent ones

 

 



pcdinh at gmail

Nov 17, 2009, 9:39 AM

Post #1 of 5 (645 views)

Hi all,

I made a list of 4 simple, single-term queries and ran 4 searches via Lucene.
I found that the first time a term is used in a search, Lucene takes quite a
long time to handle it.

- Query A
00:27:28,781 INFO LuceneSearchService:151 - Internal search took 328.21463ms
00:27:28,781 INFO SearchController:86 - Page rendered in 338.29553ms

- Query B
00:27:39,171 INFO LuceneSearchService:151 - Internal search took 480.30908ms
00:27:39,187 INFO SearchController:86 - Page rendered in 493.07327ms

- Query C
00:27:46,765 INFO LuceneSearchService:151 - Internal search took 189.33635ms
00:27:46,765 INFO SearchController:86 - Page rendered in 195.43823ms

- Query D
00:28:00,312 INFO LuceneSearchService:151 - Internal search took 330.3596ms
00:28:00,328 INFO SearchController:86 - Page rendered in 347.34747ms


This does not look good at first glance, because I have only 500 000 indexed
documents. However, when I ran the same searches again, I found that Lucene
ran much faster.

- Query A
00:28:04,046 INFO LuceneSearchService:151 - Internal search took 3.90301ms
00:28:04,062 INFO SearchController:86 - Page rendered in 15.694173ms

- Query C
00:28:15,390 INFO LuceneSearchService:151 - Internal search took 1.425879ms
00:28:15,390 INFO SearchController:86 - Page rendered in 7.946541ms

- Query D
00:28:26,031 INFO LuceneSearchService:151 - Internal search took 1.849956ms
00:28:26,046 INFO SearchController:86 - Page rendered in 12.023037ms

- Query B
00:28:31,609 INFO LuceneSearchService:151 - Internal search took 1.668648ms
00:28:31,625 INFO SearchController:86 - Page rendered in 15.57237ms

Why does this happen? Does it mean that Lucene has an internal cache,
like the MySQL query result cache or the Oracle query execution plan cache?

Thanks

Dinh


otis_gospodnetic at yahoo

Nov 17, 2009, 11:37 AM

Post #2 of 5 (612 views)

Hello,

Most likely due to the operating system caching the relevant portions of the index after the first set of queries.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR





scott_ribe at killerbytes

Nov 17, 2009, 11:43 AM

Post #3 of 5 (616 views)

> Most likely due to the operating system caching the relevant portions of the
> index after the first set of queries.

I have enough RAM to keep the Lucene indexes in memory all the time, so I
"dd ... > /dev/null" the files at boot. I also perform a single query to
force JIT compilation of the query code. After that, first queries are fast.
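
For readers who would rather do the file pre-read from inside the JVM than
with dd, here is a minimal sketch of the same idea. It is an illustration
only: the class name IndexFilePreloader and its preload method are
hypothetical, not part of Lucene, and it assumes the index is a flat
directory of regular files. It reads every file once and discards the bytes,
which on most operating systems leaves the data in the OS file cache,
provided there is enough free RAM.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not a Lucene class): reads index files once to warm the OS file cache. */
final class IndexFilePreloader {
    private IndexFilePreloader() {}

    /**
     * Reads every regular file directly inside indexDir and discards the bytes.
     * Returns the total number of bytes read.
     */
    static long preload(Path indexDir) throws IOException {
        byte[] buffer = new byte[1 << 20]; // 1 MiB read buffer; the size is an arbitrary choice
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and special files
                }
                try (InputStream in = Files.newInputStream(file)) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        total += n; // bytes are thrown away; only the read itself matters
                    }
                }
            }
        }
        return total;
    }
}
```

Calling IndexFilePreloader.preload(path) once at startup, before the first
search, plays the same role as the dd command above. Whether the data stays
cached afterwards is up to the operating system.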

--
Scott Ribe
scott_ribe [at] killerbytes
http://www.killerbytes.com/
(303) 722-0567 voice





erickerickson at gmail

Nov 17, 2009, 12:28 PM

Post #4 of 5 (610 views)

The "usual" recommendation is just to fire up a series of warmup
queries at startup if you really require the first queries to be fast.
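
A minimal sketch of such a warm-up pass is shown here, under stated
assumptions. WarmupRunner and warmUp are hypothetical names, and the search
function is passed in as a plain Java function so the sketch does not depend
on any particular Lucene version. In a real application that function would
wrap whatever search call the application already makes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical helper (not a Lucene class): runs warm-up queries once and times them. */
final class WarmupRunner {
    private WarmupRunner() {}

    /**
     * Runs each query once through the supplied search function, in order,
     * and returns the elapsed time of each call in milliseconds.
     * The search results themselves are discarded.
     */
    static List<Double> warmUp(List<String> queries, ToIntFunction<String> search) {
        List<Double> timingsMs = new ArrayList<>();
        for (String query : queries) {
            long start = System.nanoTime();
            search.applyAsInt(query); // only the side effects (caching, JIT) matter here
            timingsMs.add((System.nanoTime() - start) / 1_000_000.0);
        }
        return timingsMs;
    }
}
```

Running the same query list through warmUp twice and comparing the two lists
of timings is a simple way to see the first-query versus repeat-query gap
reported at the top of this thread. The queries should resemble real traffic,
since only the index data they touch is likely to be cached.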

Best
Erick



pcdinh at gmail

Nov 17, 2009, 8:31 PM

Post #5 of 5 (600 views)

Hi,

Thanks for your feedback. I have checked again and found that this
behavior is quite consistent. So the OS cache and Lucene warm-up probably
do have a big impact.

Regards,

Dinh
