
Mailing List Archive: Lucene: Java-User

Bizarre Search order request



ssmith at mainstreamdata

May 25, 2012, 10:00 AM

Post #1 of 3
Bizarre Search order request

I really need this on Solr, but thought I would start here as I suspect that, if it's possible, it's some kind of custom relevancy ranking that would need to be done in Lucene and then used in Solr. I will simplify the actual problem somewhat, but I think it will have the gist of what I want to do.

Assume I have a bunch of documents which have a text field and a keyword field. The text field is just ordinary text. The keyword field has a limited number of values (may be in the hundreds) and there is only a single keyword per document. For discussion purposes, let's assume that the keyword is a document type like "mail", "blog", "website", etc.

So someone searches for "dog". I want to display a list of documents with "dog" in the text field. That's easy of course. But, I want to limit it so that for each page of results displayed, if there are multiple document types for the search result, then only a certain number of documents of each type get displayed.

For example, if I display 20 results, I might want to limit it to a maximum of 10 "mail", 10 "blog" and 10 "website" documents. Which ones get displayed and how they are ordered would depend on the normal relevancy ranking, but, for example, once I had 10 "mail" objects to display on the page, the effect would be that the relevancy of other "mail" objects would drop below "blog" and "website". If there aren't 10 of one of these, then I'm allowed to exceed the maximum of 10 so that I get 20 results. What I don't want is 20 "mail" documents if there are "blog" and/or "website" documents to display.
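To make the rule above concrete, here is a minimal sketch in plain Java of how one page could be picked from a list of hits that is already sorted by relevancy. It is only an illustration of the behaviour described, not Lucene or Solr code: the `Hit` class and the `selectPage` method are hypothetical names, and the sketch assumes the caller has fetched enough hits for the cap and the fill-up to matter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CappedPage {

    /** Hypothetical holder for one search hit; not a Lucene class. */
    static final class Hit {
        final String id;
        final String type;

        Hit(String id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    /**
     * Picks one page from hits sorted by descending relevancy.
     * Hits beyond maxPerType for their type are pushed behind the others
     * and are only used if the page would otherwise stay short.
     */
    static List<Hit> selectPage(List<Hit> ranked, int pageSize, int maxPerType) {
        List<Hit> page = new ArrayList<>();
        List<Hit> overflow = new ArrayList<>();
        Map<String, Integer> perType = new HashMap<>();

        for (Hit hit : ranked) {
            if (page.size() == pageSize) {
                break;
            }
            int seen = perType.getOrDefault(hit.type, 0);
            if (seen < maxPerType) {
                page.add(hit);
                perType.put(hit.type, seen + 1);
            } else {
                overflow.add(hit); // capped type: demote below the other types
            }
        }

        // Not enough hits of the other types: exceed the cap to fill the page.
        for (Hit hit : overflow) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(hit);
        }
        return page;
    }
}
```

With a page size of 20 and a cap of 10, a result list dominated by "mail" would yield 10 "mail" hits plus whatever "blog" and "website" hits exist, and further "mail" hits would appear only if fewer than 20 hits had been collected.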

Is something like this even possible? Any thoughts would be appreciated.

Scott


chris.lu at gmail

May 25, 2012, 10:11 AM

Post #2 of 3
Re: Bizarre Search order request

Nothing like this exists yet. But you don't need to do everything in one
search request.

You can send one search request to learn the match distribution across
document types, and then send one request per document type (3 requests
for 3 document types).
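A rough sketch of that multi-request flow in plain Java follows. Everything named here is hypothetical: `SearchClient` stands in for whatever code actually runs the searches, `ScoredHit` is not a Lucene class, and handing spare slots to types in map order is just one assumption about how to fill a short page. It only covers the first page of results.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PerTypeRequests {

    /** Hypothetical scored hit; not a Lucene or Solr class. */
    static final class ScoredHit {
        final String id;
        final String type;
        final float score;

        ScoredHit(String id, String type, float score) {
            this.id = id;
            this.type = type;
            this.score = score;
        }
    }

    /** Hypothetical facade over whatever actually executes the searches. */
    interface SearchClient {
        /** First request: number of matches per document type. */
        Map<String, Integer> countsByType(String query);

        /** One request per type: the top n hits restricted to that type. */
        List<ScoredHit> topHits(String query, String type, int n);
    }

    /** Decides how many hits to request for each type. */
    static Map<String, Integer> quotas(Map<String, Integer> counts, int pageSize, int maxPerType) {
        Map<String, Integer> quota = new LinkedHashMap<>();
        int total = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int q = Math.min(e.getValue(), maxPerType);
            quota.put(e.getKey(), q);
            total += q;
        }
        // If the capped quotas cannot fill the page, give the spare slots to
        // types that still have matches (assumption: in map iteration order).
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (total >= pageSize) {
                break;
            }
            int current = quota.get(e.getKey());
            int extra = Math.min(e.getValue() - current, pageSize - total);
            quota.put(e.getKey(), current + extra);
            total += extra;
        }
        return quota;
    }

    /** Runs the per-type requests and merges them by descending score. */
    static List<ScoredHit> firstPage(SearchClient client, String query, int pageSize, int maxPerType) {
        Map<String, Integer> quota = quotas(client.countsByType(query), pageSize, maxPerType);
        List<ScoredHit> merged = new ArrayList<>();
        for (Map.Entry<String, Integer> e : quota.entrySet()) {
            if (e.getValue() > 0) {
                merged.addAll(client.topHits(query, e.getKey(), e.getValue()));
            }
        }
        merged.sort((a, b) -> Float.compare(b.score, a.score));
        return new ArrayList<>(merged.subList(0, Math.min(pageSize, merged.size())));
    }
}
```

The cost is one extra round trip per document type that has matches, which may matter if the keyword field really has hundreds of values.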

--
Chris Lu



hossman_lucene at fucit

May 25, 2012, 10:40 AM

Post #3 of 3
Re: Bizarre Search order request

: For example, if I display 20 results, I might want to limit it to a
: maximum of 10 "mail", 10 "blog" and 10 "website" documents. Which ones
: get displayed and how they are ordered would depend on the normal
: relevancy ranking, but, for example, once I had 10 "mail" objects to
: display on the page, the effect would be that the relevancy of other
: "mail" objects would drop below "blog" and "website". If there aren't 10
: of one of these, then I'm allowed to exceed the maximum of 10 so that I
: get 20 results. What I don't want is 20 "mail" documents if there are
: "blog" and/or "website" documents to display.

Most of what you're asking about is a straightforward use of
Result Grouping...

http://wiki.apache.org/solr/FieldCollapsing

...the nit is your statement 'the effect would be that other "mail"
objects relevancy would drop below "blog" and "website"' ... grouping
doesn't change the relevancy scores, it just limits the number of results
per field value.

The canonical UI is to show users the top N per group on the main page
(you can interleave them if you want, or leave them grouped), but give
them links to see all (i.e., redo the search with no grouping and allow
pagination) or see all of a particular type (i.e., redo the search with no
grouping and an fq; allow pagination).
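As a rough illustration of those two kinds of request, the sketch below builds the Solr query strings in plain Java. The `group`, `group.field`, `group.limit`, `fq`, `start` and `rows` parameters are standard Solr request parameters; the field names `text` and `doctype` and the helper method names are assumptions made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class GroupingQueries {

    /** Main page: top perGroup hits for each value of typeField. */
    static String groupedQuery(String userQuery, String typeField, int perGroup) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", userQuery);
        params.put("group", "true");
        params.put("group.field", typeField);
        params.put("group.limit", String.valueOf(perGroup));
        return encode(params);
    }

    /** "See all of one type": no grouping, a filter query, normal paging. */
    static String singleTypeQuery(String userQuery, String typeField, String type,
                                  int start, int rows) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", userQuery);
        params.put("fq", typeField + ":" + type);
        params.put("start", String.valueOf(start));
        params.put("rows", String.valueOf(rows));
        return encode(params);
    }

    /** Joins the parameters into a URL-encoded query string. */
    private static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
        return sb.toString();
    }
}
```

The resulting string would be appended to the select handler URL of your own Solr core. With grouping on, `rows` limits the number of groups returned rather than the number of documents, so `group.limit` is what caps the hits per type.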

-Hoss
