
Mailing List Archive: Lucene: Java-User

Nested BlockJoinQuery

 

 



hasghari at gmail

Feb 10, 2012, 2:31 PM

Post #1 of 2 (240 views)
Nested BlockJoinQuery

I'm trying to learn more about using BlockJoinQuery in our search application
and I came across this blog post by Mike McCandless:
http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html

The blog post mentions that it is possible to do joins that can be nested
(parent to child to grandchild) but does not elaborate further.

Could someone please explain how to formulate such a query for the following
use case?

Let's say we want to create a music search application where the Lucene
index documents are nested as follows:

music genre -> band -> band members

Some sample data:

Rock -> Pink Floyd -> Roger Waters, David Gilmour, Richard Wright, Nick
Mason

Pop -> Michael Jackson -> Michael Jackson

Alternative/Indie -> Waters -> Van Pierszalowski

We would like to search for the term "waters" and find out which genre and
band each match belongs to. With the sample data above, we would expect
the result set to include 'Rock/Pink Floyd' because of Roger Waters and
'Alternative/Indie/Waters' because of the Waters band name.

It seems like this would be a good candidate for nested BlockJoinQuery
instances.

Thanks,
Hamed

--
View this message in context: http://lucene.472066.n3.nabble.com/Nested-BlockJoinQuery-tp3733885p3733885.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


markharw00d at yahoo

Feb 11, 2012, 6:45 AM

Post #2 of 2 (223 views)
Re: Nested BlockJoinQuery [In reply to]

Your requirement does not sound like a good fit for nested documents; it is probably better suited to conventional faceting.

I would characterise the uses for Nested as follows:

1) The parent of a nested block is typically the "item of interest" that is returned, i.e. the search results are a list of parent items.
2) The children (and grandchildren) of a parent must all fit comfortably into RAM (an index-time restriction).
3) There is typically more than one child doc of each child type (otherwise we could happily accommodate the single child's fields on the parent document).
4) The set of children for a parent is typically not updated frequently, as any change to the membership of the set requires rewriting the whole block of parent plus children.

Examples of things that fit this model are:
a) Resumes of people with many sections on work and education
b) Books with many chapters
c) Products with many components
d) XML documents
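
To make the block model above a little more concrete, here is a rough sketch in plain Java. It is not the Lucene API: the class BlockJoinSketch, the Doc type, and every method name are made up for illustration. It only models the idea that children are indexed immediately before their parent, so a hit on a child can be joined to the next parent document in the block, and that the same step can be applied twice for a parent/child/grandchild layout.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Plain-Java model of a block index; none of these names come from Lucene.
final class BlockJoinSketch {
    static final class Doc {
        final String type; // "member", "band" or "genre"
        final String text;
        Doc(String type, String text) { this.type = type; this.text = text; }
    }

    // Children are added before their parent, so every block ends with its parent.
    static List<Doc> sampleIndex() {
        List<Doc> index = new ArrayList<>();
        index.add(new Doc("member", "Roger Waters"));      // 0
        index.add(new Doc("member", "David Gilmour"));     // 1
        index.add(new Doc("member", "Richard Wright"));    // 2
        index.add(new Doc("member", "Nick Mason"));        // 3
        index.add(new Doc("band", "Pink Floyd"));          // 4
        index.add(new Doc("genre", "Rock"));               // 5
        index.add(new Doc("member", "Michael Jackson"));   // 6
        index.add(new Doc("band", "Michael Jackson"));     // 7
        index.add(new Doc("genre", "Pop"));                // 8
        index.add(new Doc("member", "Van Pierszalowski")); // 9
        index.add(new Doc("band", "Waters"));              // 10
        index.add(new Doc("genre", "Alternative/Indie"));  // 11
        return index;
    }

    // IDs of docs of the given type whose text contains the term (case-insensitive).
    static List<Integer> match(List<Doc> index, String type, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < index.size(); id++) {
            Doc d = index.get(id);
            if (d.type.equals(type) && d.text.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(id);
            }
        }
        return hits;
    }

    // Join step: walk forward from each hit to the next doc of parentType.
    // This only works because each block is contiguous and ends with its parent.
    static List<Integer> joinToParent(List<Doc> index, List<Integer> hits, String parentType) {
        List<Integer> parents = new ArrayList<>();
        for (int hit : hits) {
            int id = hit;
            while (id < index.size() && !index.get(id).type.equals(parentType)) {
                id++;
            }
            if (id < index.size() && !parents.contains(id)) {
                parents.add(id);
            }
        }
        return parents;
    }

    // Inner join: member hits -> bands, combined with bands that match directly.
    static List<Integer> matchingBands(List<Doc> index, String term) {
        List<Integer> bands = joinToParent(index, match(index, "member", term), "band");
        for (int id : match(index, "band", term)) {
            if (!bands.contains(id)) {
                bands.add(id);
            }
        }
        return bands;
    }

    // Outer join: matching bands -> genres.
    static List<Integer> matchingGenres(List<Doc> index, String term) {
        return joinToParent(index, matchingBands(index, term), "genre");
    }

    public static void main(String[] args) {
        List<Doc> index = sampleIndex();
        for (int band : matchingBands(index, "waters")) {
            int genre = joinToParent(index, Collections.singletonList(band), "genre").get(0);
            System.out.println(index.get(genre).text + "/" + index.get(band).text);
        }
    }
}
```

Because the join relies purely on document order, adding one band to "Rock" in this model means rebuilding the whole Rock block, which is the update cost described in point 4.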

Your example is not a good fit because it breaks several of the characteristics I outlined. A "genre" is an expansive item, so its block would not fit in RAM, and it undergoes constant change as new "children" are added to the set. Check out Solr faceting for your requirement.
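
As a rough illustration of the faceting approach, the sketch below uses plain Java rather than Solr, and all names in it (FacetSketch, Row, facetByGenreAndBand) are hypothetical. It assumes a flat index with one document per band member that carries the genre and band as ordinary fields, searches the band and member fields for a term, and counts the hits per genre/band value, which is roughly what a facet on such a field would report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of faceting over flat documents; this is not the Solr API.
final class FacetSketch {
    // One flat document per band member; genre and band are denormalised onto it.
    static final class Row {
        final String genre;
        final String band;
        final String member;
        Row(String genre, String band, String member) {
            this.genre = genre;
            this.band = band;
            this.member = member;
        }
    }

    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("Rock", "Pink Floyd", "Roger Waters"));
        rows.add(new Row("Rock", "Pink Floyd", "David Gilmour"));
        rows.add(new Row("Rock", "Pink Floyd", "Richard Wright"));
        rows.add(new Row("Rock", "Pink Floyd", "Nick Mason"));
        rows.add(new Row("Pop", "Michael Jackson", "Michael Jackson"));
        rows.add(new Row("Alternative/Indie", "Waters", "Van Pierszalowski"));
        return rows;
    }

    // Search the band and member fields for the term, then count the matching
    // rows per "genre/band" value (the facet). TreeMap keeps the keys sorted.
    static Map<String, Integer> facetByGenreAndBand(List<Row> rows, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        Map<String, Integer> counts = new TreeMap<>();
        for (Row r : rows) {
            boolean hit = r.band.toLowerCase(Locale.ROOT).contains(needle)
                    || r.member.toLowerCase(Locale.ROOT).contains(needle);
            if (hit) {
                counts.merge(r.genre + "/" + r.band, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> facets = facetByGenreAndBand(sampleRows(), "waters");
        for (Map.Entry<String, Integer> e : facets.entrySet()) {
            System.out.println(e.getKey() + " (" + e.getValue() + ")");
        }
    }
}
```

With flat documents like these, adding a new band or member is a single document add, so the genre never has to be rewritten as one large block.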

Cheers,
Mark





