
Mailing List Archive: Lucene: Java-User

Wikia search goes live today

 

 



lukas.vlcek at gmail

Jan 7, 2008, 4:48 AM

Post #1 of 17 (4136 views)
Permalink
Wikia search goes live today

Hi,

I noticed that Wikia search goes live today (see
http://www.devxnews.com/article.php/3719906).
Does anybody know where I could find more technical information about their
solution? Are they going to contribute their enhancements back to
Lucene/Nutch/Hadoop code? My understanding is that as long as they claim
they want to build their solution on top of open source technology they
should be contributing back.

http://re.search.wikia.com/search#+wikia%20+lucene%20+nutch%20+hadoop does
not return anything :-)

Regards,
Lukas
--
http://blog.lukas-vlcek.com/


gsingers at apache

Jan 7, 2008, 5:13 AM

Post #2 of 17 (4064 views)
Permalink
Re: Wikia search goes live today [In reply to]

On Jan 7, 2008, at 7:48 AM, Lukas Vlcek wrote:

> Hi,
>
> I noticed that Wikia search goes live today (see
> http://www.devxnews.com/article.php/3719906).
> Does anybody know where I could find more technical information
> about their
> solution? Are they going to contribute their enhancements back to
> Lucene/Nutch/Hadoop code? My understanding is that as long as they
> claim
> they want to build their solution on top of open source technology
> they
> should be contributing back.

Not sure what they have done, but nothing in the Apache license
requires contribution back, even if it would be appreciated.

Cheers,
Grant

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com
http://www.lucenebootcamp.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


gsingers at apache

Jan 7, 2008, 8:21 AM

Post #3 of 17 (4053 views)
Permalink
Re: Wikia search goes live today [In reply to]

One other thing to note: you can definitely see Lucene in action (or
Nutch, that is) by clicking on the score returned for a given document
(try searching for Lucene), and you see, in all its glory, the Lucene
explain results... It even displays the Nutch logo, which makes me
wonder if they are misusing an ASF trademark (but, IANAL, so I don't
know), since they don't state that Nutch is a trademark of the ASF.
But that is a discussion for somewhere else...




otis_gospodnetic at yahoo

Jan 7, 2008, 2:14 PM

Post #4 of 17 (4055 views)
Permalink
Re: Wikia search goes live today [In reply to]

See my comment (around #45-50) on TechCrunch about that from late last night. There is actually one Wikia guy helping Nutch: Dennis Kubes. He must have been hitting reload on that TC post, because he IMed me quickly after I posted my comment and clarified that he is the Wikia developer I was referring to in my comment... so I'm looking forward to more contributions from Dennis and his coworkers! :)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



lukas.vlcek at gmail

Jan 7, 2008, 11:49 PM

Post #5 of 17 (4044 views)
Permalink
Re: Wikia search goes live today [In reply to]

This would be great!

I am particularly interested in how they are going about customized search (if
they plan to do it at all). I mean whether they can reorder raw search results
based on some kind of collective knowledge (which is probably kept outside
of the Lucene index; at least that is what I can see from the Nutch score
explanations).

Regards,
Lukas

On Jan 7, 2008 11:14 PM, Otis Gospodnetic <otis_gospodnetic [at] yahoo>
wrote:

> See my comment (around #45-50) on Techcrunch about that from late last
> night. There is actually one Wikia guy helping Nutch - Dennis Kubes. He
> must have been hitting reload on that TC post, because he IMed me quickly
> after I posted my comment and clarified that he is that Wikia developer I
> was referring to in my comment.... so I'm looking forward to more
> contributions from Dennis and his coworkers! :)
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


--
http://blog.lukas-vlcek.com/


lukas.vlcek at gmail

Jan 7, 2008, 11:55 PM

Post #6 of 17 (4046 views)
Permalink
Re: Wikia search goes live today [In reply to]

BTW:
1) If they have made any improvements/changes to Nutch (or Lucene/Hadoop)
code and they keep them closed, then how can they claim they are using open
source algorithms?
2) Wouldn't it be too expensive for them to keep their changes closed going
forward? What if Nutch changes significantly in the future?

Obviously I don't see the big picture, but I think they don't have any other
option than contributing back to the community if they are serious about it.

On Jan 8, 2008 8:49 AM, Lukas Vlcek <lukas.vlcek [at] gmail> wrote:

> This would be great!
>
> I am particularly interested how they are going about customized search
> (if they have a plan to do it). I mean if they can reorder raw search
> results based on some kind of collective knowledge (which is probably kept
> outside of Lucene index - at least that is what I can see from Nutch score
> explanations).
>
> Regards,
> Lukas



--
http://blog.lukas-vlcek.com/


gsingers at apache

Jan 8, 2008, 4:47 AM

Post #7 of 17 (4051 views)
Permalink
Re: Wikia search goes live today [In reply to]

On Jan 8, 2008, at 2:55 AM, Lukas Vlcek wrote:

> BTW:
> 1) If they have made any improvements/changes to Nutch (or Lucene/
> Hadoop)
> code and they keep it closed then how they can claim they are using
> open
> sourced algorithms?

They are "using" it, they just aren't sharing it. Many companies out
there do that, they use Lucene but you would never know it unless you
dug in deep. It's probably one of the big diffs between the Apache
license and some of the "other" licenses. But, having read more about
it, I don't think Wikia is keeping it closed, but that doesn't mean
they are contributing back to us, either. They certainly could host
the repository themselves and make it available. Some people host
Lucene improvements elsewhere b/c they want different licenses. As
long as they include the ASF license and give proper credit to it, I
believe they are fine.

>
> 2) Wouldn't it be too expensive for them to keep their changes
> closed going
> forward? How about if Nutch changes significantly in the future.

I would think so, but there are no rules against doing stupid things,
right?


-Grant



mike.klaas at gmail

Jan 8, 2008, 11:59 AM

Post #8 of 17 (4032 views)
Permalink
Re: Wikia search goes live today [In reply to]

On 7-Jan-08, at 11:49 PM, Lukas Vlcek wrote:

> This would be great!
>
> I am particularly interested how they are going about customized
> search (if
> they have a plan to do it). I mean if they can reorder raw search
> results
> based on some kind of collective knowledge (which is probably kept
> outside
> of Lucene index - at least that is what I can see from Nutch score
> explanations).

I don't think that there is anything like that yet. It looks to me
like a standard disjunction over title/content/host/url + a global
document boost based on pagerank-y link analysis (or simply the number
of inlinks). If they are incorporating the "star" ratings already, it
is probably folded into the global doc boost.

-Mike
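
For readers who want something concrete, here is a rough sketch of the scoring shape described above: a disjunction (OR) over the title/content/host/url fields, multiplied by one query-independent document boost. Everything in it is an assumption made for illustration. The field weights, the log-damped inlink boost, and the names `FIELD_WEIGHTS`, `doc_boost` and `score` are hypothetical; this is neither Lucene's actual similarity formula nor Wikia's code.

```python
import math

# Hypothetical per-field weights; the real values are not given in the thread.
FIELD_WEIGHTS = {"title": 3.0, "content": 1.0, "host": 2.0, "url": 2.0}


def doc_boost(inlinks):
    """Query-independent boost from the inlink count (assumed log damping)."""
    return 1.0 + math.log1p(inlinks)


def score(doc, query_terms):
    """Disjunction over fields: each field containing a term adds its weight."""
    text_score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = doc.get(field, "").lower().split()
        for term in query_terms:
            if term.lower() in tokens:
                text_score += weight
    # The global boost scales the text score; it never creates a match by itself.
    return text_score * doc_boost(doc.get("inlinks", 0))


docs = [
    {"title": "Apache Lucene", "content": "Lucene is a search library",
     "host": "lucene.apache.org", "url": "http://lucene.apache.org/",
     "inlinks": 900},
    {"title": "Search notes", "content": "notes about lucene and nutch",
     "host": "example.org", "url": "http://example.org/notes",
     "inlinks": 3},
    {"title": "Cooking", "content": "pasta recipes",
     "host": "example.org", "url": "http://example.org/pasta",
     "inlinks": 5000},
]

# Rank the toy documents for the single-term query "lucene".
ranked = sorted(docs, key=lambda d: score(d, ["lucene"]), reverse=True)
```

Note that the heavily linked "Cooking" page still scores zero for this query: in this sketch the global boost only rescales documents that already match at least one field.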




kubes at apache

Jan 8, 2008, 12:09 PM

Post #9 of 17 (4037 views)
Permalink
Re: Wikia search goes live today [In reply to]

Star ratings are being stored but not accounted for in the score as of
yet. The plan is to include them in future indexing scores. :)

Dennis

Mike Klaas wrote:
> On 7-Jan-08, at 11:49 PM, Lukas Vlcek wrote:
>
>> This would be great!
>>
>> I am particularly interested how they are going about customized
>> search (if
>> they have a plan to do it). I mean if they can reorder raw search results
>> based on some kind of collective knowledge (which is probably kept
>> outside
>> of Lucene index - at least that is what I can see from Nutch score
>> explanations).
>
> I don't think that there is anything like that yet. It looks to me like
> a standard disjunction over title/content/host/url + a global document
> boost based on pagerank-y link analysis (or simply # inlinks). If they
> are incorporating the "star" ratings yet, it is probably folded in to
> the global doc boost.
>
> -Mike


stopman at gmail

Jan 8, 2008, 12:12 PM

Post #10 of 17 (4028 views)
Permalink
Re: Wikia search goes live today [In reply to]

I'm surprised they aren't keeping *any* logs, or so they claim. That seems
foolish to me from a data-mining perspective.

"A Wikia employee told me today that people were already asking what the
most popular search terms were. He said there was no way of finding out as
no logs are kept." [1]

[1] http://radar.oreilly.com/archives/2008/01/why_wikia_will_change_search.html

-M

On Jan 8, 2008 12:09 PM, Dennis Kubes <kubes [at] apache> wrote:

> Star ratings are being stored but not accounted for in the score as of
> yet. The plan is to include them in future indexing scores. :)
>
> Dennis


lukas.vlcek at gmail

Jan 8, 2008, 12:15 PM

Post #11 of 17 (4037 views)
Permalink
Re: Wikia search goes live today [In reply to]

So star ratings will be taken into account only during the indexing phase. Does
that mean the value will be fairly static rather than a dynamically changing
variable? In other words, if I add my stars to some document, it won't affect
the scoring immediately but only after the next indexing cycle. Correct?
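
The distinction being asked about can be sketched as follows, assuming (and it is only an assumption) that star ratings would be folded into a per-document boost computed when the index is built. `ToyIndex`, `add_star` and `reindex` are hypothetical names for illustration, not Nutch or Lucene APIs.

```python
class ToyIndex:
    """Toy model: boosts are frozen at index time, stars arrive continuously."""

    def __init__(self):
        self.stars = {}           # live star counts, updated by users
        self.indexed_boost = {}   # boost snapshot taken by the last indexing run

    def add_star(self, doc_id):
        # Recording a star is immediate, but it does not touch the index.
        self.stars[doc_id] = self.stars.get(doc_id, 0) + 1

    def reindex(self):
        # The indexing cycle copies the current star counts into the boosts.
        self.indexed_boost = {d: 1.0 + n for d, n in self.stars.items()}

    def score(self, doc_id, text_score):
        # Search only sees the snapshot; unknown documents get a neutral boost.
        return text_score * self.indexed_boost.get(doc_id, 1.0)


index = ToyIndex()
index.reindex()
before = index.score("doc-1", 2.0)         # no stars indexed yet

index.add_star("doc-1")
after_star = index.score("doc-1", 2.0)     # star stored, but not yet indexed

index.reindex()
after_reindex = index.score("doc-1", 2.0)  # boost now reflects the star
```

Under this assumption a new star changes the ranking only after the next indexing cycle; a design that applied ratings at query time would behave differently.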

On 1/8/08, Dennis Kubes <kubes [at] apache> wrote:
>
> Star ratings are being stored but not accounted for in the score as of
> yet. The plan is to include them in future indexing scores. :)
>
> Dennis


--
http://blog.lukas-vlcek.com/


ab at getopt

Jan 8, 2008, 12:24 PM

Post #12 of 17 (4045 views)
Permalink
Re: Wikia search goes live today [In reply to]

Lukas Vlcek wrote:
> So starring will be taken into account only during the indexing phase. Does that
> mean it will be a fairly static value rather than a dynamically changing variable?
> In other words, if I add my stars to some document, it won't affect the
> scoring immediately, but only after the next indexing cycle. Correct?

(I'm not involved in Wikia development.) There are ways to go about
this even in pure Lucene-land, so that the updates are fast and don't
require reindexing the main content. Hint: ParallelReader.


--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com




ryantxu at gmail

Jan 8, 2008, 12:31 PM

Post #13 of 17 (4047 views)
Permalink
Re: Wikia search goes live today [In reply to]

Andrzej Bialecki wrote:
> Lukas Vlcek wrote:
>> So starring will be taken into account only during the indexing phase.
>> Does that mean it will be a fairly static value rather than a
>> dynamically changing variable?
>> In other words, if I add my stars to some document, it won't affect the
>> scoring immediately, but only after the next indexing cycle. Correct?
>
> (I'm not involved in Wikia development). There are some ways to go about
> it even in the pure Lucene-land, so that the updates are fast without
> reindexing the main content. Hint: ParallelReader.
>

In Solr (1.3-dev) you can use an external value source with a function
query...




lukas.vlcek at gmail

Jan 8, 2008, 12:36 PM

Post #14 of 17 (4043 views)
Permalink
Re: Wikia search goes live today [In reply to]

After checking the Lucene API for ParallelReader, it seems that the star score
could be stored in a separate index that shares the same document identifiers.
Such an index could be small (partitioned into many small indices?), so
the updates can be fast. Is that what you meant, Andrzej? ;-)
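
A minimal sketch of that idea in plain Python, not the Lucene API. The names
main_index, stars, rate and search are hypothetical, and the scoring formula
(term count times 1 + stars) is an assumption chosen only for illustration.
ParallelReader itself expects the parallel indexes to hold the same documents
in the same order, which the sketch mimics by keeping one star value per
position in the main list.

```python
# Hypothetical names throughout; this is NOT the Lucene ParallelReader API.

# "Main index": expensive to rebuild because it holds the page content.
main_index = [
    {"url": "http://a.example/", "text": "lucene search library"},
    {"url": "http://b.example/", "text": "lucene nutch crawler"},
    {"url": "http://c.example/", "text": "hadoop cluster"},
]

# Small side "index": one star value per document, aligned by position
# (document id) with main_index.
stars = [0.0] * len(main_index)


def rate(doc_id, value):
    """Update a star rating without touching main_index."""
    stars[doc_id] = value


def search(term):
    """Return (doc_id, score) pairs, best first.

    Assumed toy scoring: term count in the text, boosted by (1 + stars).
    """
    hits = []
    for doc_id, doc in enumerate(main_index):
        tf = doc["text"].split().count(term)
        if tf:
            hits.append((doc_id, tf * (1.0 + stars[doc_id])))
    # sorted() is stable, so ties keep their document order.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Calling rate(1, 4.0) changes only the small stars list, and the next
search("lucene") ranks document 1 first without the main content being rebuilt.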

Anyway, I remember a different technique, which I once mentioned on the Lucene
mailing list, inspired by the book Programming Collective
Intelligence<http://www.oreilly.com/catalog/9780596529321/>.
The idea is to store the score (maybe I should call it user preference)
not in the index but in a neural net. One useful side effect is that this
technique could give a reasonable score even to documents without any stars
(meaning that documents "similar" to highly starred documents could score
better even if they haven't been starred by any user yet).

Regards,
Lukas



--
http://blog.lukas-vlcek.com/


lukas.vlcek at gmail

Jan 8, 2008, 12:38 PM

Post #15 of 17 (4043 views)
Permalink
Re: Wikia search goes live today [In reply to]

I should note that this technique (the neural net) is probably not easily
applicable to the current Lucene scoring mechanism without additional
development.




--
http://blog.lukas-vlcek.com/


ab at getopt

Jan 8, 2008, 2:24 PM

Post #16 of 17 (4039 views)
Permalink
Re: Wikia search goes live today [In reply to]

Ryan McKinley wrote:
> Andrzej Bialecki wrote:
>> Lukas Vlcek wrote:
>>> So starring will be taken into account only during the indexing phase.
>>> Does that mean it will be a fairly static value rather than a
>>> dynamically changing variable?
>>> In other words, if I add my stars to some document, it won't affect the
>>> scoring immediately, but only after the next indexing cycle. Correct?
>>
>> (I'm not involved in Wikia development). There are some ways to go
>> about it even in the pure Lucene-land, so that the updates are fast
>> without reindexing the main content. Hint: ParallelReader.
>>
>
> in solr (1.3-dev) you can have an external value source with a function
> query...

True, although a function query tends to bring more overhead ...

While we're on the subject of complex scoring - I read an interesting
paper (I don't have a link now), which discussed so-called bucketed
scoring. The idea is that if your basic scoring is good enough to ensure
that the top-N results are highly relevant, then you can split these results
into buckets of k documents (let's say 10 ;) ), and within each bucket
apply an arbitrary re-ranking function, which is then very inexpensive to
perform because of the limited number of documents.

Example: you have a large corpus of web pages, and you want home pages
to appear first, even if they score somewhat lower - and it doesn't pay
off to modify the base scoring, because of overfitting, i.e. the scoring
would be good for home pages but poor for other relevant documents.
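
The bucketed re-ranking described above could look roughly like this. It is a
sketch in plain Python under stated assumptions: bucketed_rerank and
is_home_page are hypothetical names, the home-page test is a crude
placeholder, and the hits are made-up (url, base_score) pairs, not data from
the paper or from any real index.

```python
def bucketed_rerank(hits, k, rerank_key):
    """Re-rank hits inside consecutive buckets of k documents.

    hits must already be sorted by the base score, best first. Each bucket
    is re-sorted by rerank_key (highest first); a hit never leaves its
    bucket, so the base ranking is preserved at bucket granularity.
    """
    reranked = []
    for start in range(0, len(hits), k):
        bucket = hits[start:start + k]
        # sorted() is stable: hits with equal keys keep their base order.
        reranked.extend(sorted(bucket, key=rerank_key, reverse=True))
    return reranked


def is_home_page(hit):
    """Crude placeholder test (assumption): the URL is just 'host/'."""
    url = hit[0]
    return url.endswith("/") and url.count("/") == 1


# Made-up hits, already ordered by base score.
hits = [
    ("a.example/docs/x", 0.9),
    ("a.example/", 0.8),
    ("b.example/faq", 0.7),
    ("b.example/", 0.6),
]

top = bucketed_rerank(hits, k=2, rerank_key=is_home_page)
```

With k=2, each home page moves to the front of its own bucket but never jumps
ahead of a higher-scoring bucket, and the re-ranking function only ever sees k
documents at a time.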


--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com




kubes at apache

Jan 8, 2008, 2:53 PM

Post #17 of 17 (4043 views)
Permalink
Re: Wikia search goes live today [In reply to]

Sorry about not responding to this before now; I've been a little busy :).

For those of you who don't know me, I am a committer on the Nutch
project. I have been working with Wikia since early July and more
actively since the beginning of November. Before Wikia I helped start
another search engine based on Nutch called Visvo.com.

For the record, yes, Search Wikia is using and will be supporting
Nutch/Hadoop/Lucene/Solr/HBase. It is the intention of Search Wikia to
help develop these projects and their communities. We have no intention
of keeping the changes we make "proprietary". Everything that Search
Wikia develops (barring any user or personal data) will be considered
open source and freely available. Any improvements made to the Apache
projects will be immediately donated back to the community through the
respective project.

Making search open and transparent is not limited to source code.
It is our intention to make the Search Wikia data freely open and
available as well. This means that people will be able to download the
crawl data, link data, content shards, and completed indexes. Also, the
social networking functionality, named foowi, will become its own open
source project (probably with an Apache license), and will be available
to download, use, and improve.

And Search Wikia is not alone in this. Visvo.com, in coordination with
Wikia, will be releasing all of its data and source code improvements to
the community under an OSI-approved license, including a Python
framework for managing Hadoop configurations on distributed machines,
automating the fetching and indexing process, and managing search
shards.

Regarding the Nutch logo: there are two standard Nutch installations
and index farms at the following URLs. One is an index hosted at the
ISC and the other is Visvo's open index. The ISC index has
approximately 35M pages, while Visvo's index has a little over 50M pages.

http://search.isc.swlabs.org
http://open-index.visvo.com

The main Search Wikia site is hosted in a secure underground hosting
facility in a bunker in Iowa (http://usshc.com/) and makes calls to these
indexes. So when it shows cached pages and explain plans, those requests
go to their respective indexes.

Both indexes are available for search by either browser-based or web 2.0
clients. We are currently using NUTCH-594 to serve results from
these indexes in both XML and JSON formats. An example request
searching for java would be:

http://search.isc.swlabs.org/nutchsearch?query=java&hitsPerSite=1&lang=en&hitsPerPage=10&type=json
http://open-index.visvo.com/nutchsearch?query=java&hitsPerSite=1&lang=en&hitsPerPage=10&type=json
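
A small sketch of how a client might build these request URLs. The parameter
names (query, hitsPerSite, lang, hitsPerPage, type) are taken from the example
URLs above; the function name nutch_search_url and its defaults are
hypothetical, and the response format is not described here, so the sketch
stops at constructing the URL.

```python
from urllib.parse import urlencode


def nutch_search_url(host, query, hits_per_page=10, hits_per_site=1,
                     lang="en", fmt="json"):
    """Build a /nutchsearch request URL (hypothetical helper).

    Parameter order follows the example requests; urlencode escapes the
    query string, so spaces become '+'.
    """
    params = [
        ("query", query),
        ("hitsPerSite", hits_per_site),
        ("lang", lang),
        ("hitsPerPage", hits_per_page),
        ("type", fmt),
    ]
    return "http://%s/nutchsearch?%s" % (host, urlencode(params))


isc_url = nutch_search_url("search.isc.swlabs.org", "java")
visvo_url = nutch_search_url("open-index.visvo.com", "java")
```

Both calls reproduce the two example requests exactly; only the host differs,
so the same client code can query either index.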

So we are busy working on getting the data available for download.
Hopefully we should have a site set up within the next day or so. If
anybody has any questions or would like to get some specific data, feel
free to send me an email.

Dennis Kubes

