
Mailing List Archive: Lucene: General

aggregation in lucene

 

 



rvasilou at gmail

Dec 16, 2008, 12:53 PM

Post #1 of 6
aggregation in lucene

Does anyone know how to retrieve aggregated results from Lucene? Is it
possible to do something similar to the SQL statement below, which returns
the number of books for each author for books published in 2007-2008?

select count(*), author_name
from book
where published_date >= '2007-01-01'
and published_date <= '2008-12-31'
group by author_name

Note: I do NOT want to loop through every document to get these totals.
Also note the "group by", and that I am using two fields (author_name and
published_date), so a simple TermEnum does not cut it.

Thanks.
--
View this message in context: http://www.nabble.com/aggregation-in-lucene-tp21041575p21041575.html
Sent from the Lucene - General mailing list archive at Nabble.com.


gsingers at apache

Dec 16, 2008, 7:01 PM

Post #2 of 6
Re: aggregation in lucene [In reply to]

See Solr's faceting capabilities at http://lucene.apache.org/solr and
the wiki linked from there. Faceting is built right in.
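
As a rough illustration of what such a faceting request could look like for the
SQL in the original question, here is a minimal sketch that only builds the
request URL. The endpoint (localhost:8983, /solr/select) is an assumption, as is
the schema: it assumes published_date is indexed as a Solr date field and
author_name as an untokenized string field. The parameters themselves (fq,
rows, facet, facet.field, facet.mincount) are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port, and path to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def author_facet_params(start="2007-01-01T00:00:00Z", end="2008-12-31T23:59:59Z"):
    """Request parameters asking Solr for per-author counts within a date range."""
    return [
        ("q", "*:*"),                                   # match everything...
        ("fq", f"published_date:[{start} TO {end}]"),   # ...then filter by date (the WHERE clause)
        ("rows", "0"),                                  # no documents needed, only the counts
        ("facet", "true"),
        ("facet.field", "author_name"),                 # the GROUP BY field
        ("facet.mincount", "1"),                        # skip authors with no books in range
    ]


def author_facet_url():
    """Full request URL with the parameters URL-encoded."""
    return SOLR_SELECT_URL + "?" + urlencode(author_facet_params())


if __name__ == "__main__":
    print(author_facet_url())
```

The response would carry the per-author counts in its facet section, so the
client never iterates over the matching documents itself.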



--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ


rvasilou at gmail

Dec 17, 2008, 7:43 AM

Post #3 of 6
Re: aggregation in lucene [In reply to]

Can I obtain counts for terms and multi-word phrases within a set of
documents? E.g., can I use Solr to retrieve the number of times the phrase
"Wall Street Journal" is mentioned in documents where fieldA = 'abc' and
fieldB = 'xyz', etc.?




gsingers at apache

Dec 17, 2008, 8:35 AM

Post #4 of 6
Re: aggregation in lucene [In reply to]


I believe so. Isn't that just the number of docs returned for that
query? You can pass arbitrary queries to Solr to facet on, too. I'd
have a look at http://wiki.apache.org/solr/SimpleFacetParameters
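
A sketch of what faceting on an arbitrary query might look like, using the
facet.query parameter described on that wiki page. The field name "text" is a
hypothetical name for the field holding the document body; fieldA and fieldB
come from the question. Note that this returns the number of matching
documents, which is the reading of the question assumed in this reply.

```python
from urllib.parse import urlencode


def phrase_facet_params(phrase, field="text", filters=None):
    """Parameters asking Solr how many filtered documents match a phrase query."""
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    # Each filter becomes its own fq parameter, e.g. fq=fieldA:abc
    for name, value in (filters or {}).items():
        params.append(("fq", f"{name}:{value}"))
    # facet.query counts the documents (within the filtered set) matching this query
    params.append(("facet.query", f'{field}:"{phrase}"'))
    return params


if __name__ == "__main__":
    params = phrase_facet_params(
        "Wall Street Journal", filters={"fieldA": "abc", "fieldB": "xyz"}
    )
    print(urlencode(params))
```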





rvasilou at gmail

Dec 17, 2008, 9:18 AM

Post #5 of 6
Re: aggregation in lucene [In reply to]

No, this is not the number of docs returned by the query. The phrase "Wall
Street Journal" might occur 25 times across 10 documents, so the count I want
would be 25 in this example, even though only 10 documents match.




gsingers at apache

Dec 17, 2008, 9:46 AM

Post #6 of 6
Re: aggregation in lucene [In reply to]

That functionality exists in Lucene (SpanQuery), but is not currently
exposed in Solr. It would be a welcome addition to Solr, though.

-Grant
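
To make the distinction in this exchange concrete, here is a toy sketch of
counting phrase occurrences versus matching documents. It is plain Python over
in-memory dictionaries, not the Lucene SpanQuery API: a span-based approach
would read term positions from the index rather than re-tokenizing text. The
"body" field name, the whitespace tokenization, and the sample documents are
all assumptions made for illustration.

```python
def phrase_occurrences(tokens, phrase):
    """Count the positions at which the phrase tokens appear consecutively."""
    n = len(phrase)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase)


def count_phrase(docs, phrase_text, filters):
    """Return (matching_documents, total_occurrences) for docs passing the filters."""
    phrase = phrase_text.lower().split()
    matching_docs = 0
    total = 0
    for doc in docs:
        # Skip documents that fail any field = value restriction
        if any(doc.get(name) != value for name, value in filters.items()):
            continue
        hits = phrase_occurrences(doc["body"].lower().split(), phrase)
        if hits:
            matching_docs += 1
            total += hits
    return matching_docs, total


if __name__ == "__main__":
    docs = [
        {"fieldA": "abc", "fieldB": "xyz",
         "body": "The Wall Street Journal quoted the Wall Street Journal"},
        {"fieldA": "abc", "fieldB": "xyz", "body": "Wall Street Journal reporters"},
        {"fieldA": "abc", "fieldB": "other", "body": "Wall Street Journal again"},
    ]
    # Two documents pass the filters and contain the phrase; it occurs three times.
    print(count_phrase(docs, "Wall Street Journal", {"fieldA": "abc", "fieldB": "xyz"}))
```

A facet count corresponds to the first number in the returned pair; the count
asked for in the previous post is the second.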

