Mailing List Archive: Lucene: General

Lucene vs Sphinx

 

 



patelmanojb2000 at gmail

Jan 10, 2010, 10:59 PM

Post #1 of 13 (2369 views)
Lucene vs Sphinx

Hi, I am new to search development. I want to know which is better:
Lucene or Sphinx (with Ispell).
An answer to this message will be helpful to developers who are starting to
work on search.

Thanks.
--
View this message in context: http://old.nabble.com/Lucene-vs-Sphinx-tp27106508p27106508.html
Sent from the Lucene - General mailing list archive at Nabble.com.


ted.dunning at gmail

Jan 10, 2010, 11:56 PM

Post #2 of 13 (2317 views)
Re: Lucene vs Sphinx

For my own purposes, there is no question since I need huge amounts of
flexibility, high scalability, have data that is not in SQL and am not
interested in a large C++ program in my production area.

Others will have differing points of view. I worry that the Sphinx people
say that Sphinx is necessarily faster because it is written in C++. I note
also that they don't seem to understand how SOLR massively simplifies the
deployment of Lucene for most applications.



--
Ted Dunning, CTO
DeepDyve


charlie at flax

Jan 11, 2010, 12:51 AM

Post #3 of 13 (2312 views)
Re: Lucene vs Sphinx

ManojPatel wrote:
> hi i am new in search development. I want to know which one is best from
> Lucene and (Sphinx with Ispell).
> Answer of this message will helpful to developer who are started working on
> search.
>
> Thanks.

That's a very broad question that's almost impossible to answer. You should
first let us know what you're trying to develop, what search features you
need, how much data will need to be searchable, and your past experience
with development environments and languages.

Charlie

--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828
web: www.flax.co.uk


gsingers at apache

Jan 11, 2010, 4:07 AM

Post #4 of 13 (2314 views)
Re: Lucene vs Sphinx

Well, I know I'm pretty biased, but I'd say Lucene is much more full-featured and flexible, especially when delivered through Solr. Also, the communities are much larger and very helpful, and development is done in the open across the projects.

-Grant



patelmanojb2000 at gmail

Jan 11, 2010, 6:01 AM

Post #5 of 13 (2308 views)
Re: Lucene vs Sphinx

I have a large amount of data and I want to run my search queries across
several servers. How can I run the Lucene search engine on multiple servers
as a distributed application?





gsingers at apache

Jan 11, 2010, 6:09 AM

Post #6 of 13 (2309 views)
Re: Lucene vs Sphinx

On Jan 11, 2010, at 9:01 AM, ManojPatel wrote:

>
> I have large amount of data and I want to run my search query on different
> servers. How can I run lucene search engine on different server as
> distributed application.

Solr has distributed search support out of the box. Or, you can write your own.
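
To illustrate the out-of-the-box route: Solr's distributed search is driven by the `shards` request parameter, which lists the Solr instances to query. Below is a minimal Python sketch that only builds such a request URL. The host names and the helper function `distributed_select_url` are made up for illustration, and the `/select` handler path assumes a default-style setup, so adjust to your installation.

```python
from urllib.parse import urlencode


def distributed_select_url(coordinator, shards, query, rows=10):
    """Build a Solr select URL that fans the query out to several shards."""
    params = {
        "q": query,
        "rows": rows,
        # Comma-separated host:port/path entries, written without "http://".
        "shards": ",".join(shards),
    }
    return f"http://{coordinator}/select?{urlencode(params)}"


# Hypothetical hosts; replace with your own Solr instances.
url = distributed_select_url(
    "search1.example.com:8983/solr",
    ["search1.example.com:8983/solr", "search2.example.com:8983/solr"],
    "lucene sphinx",
)
print(url)
```

The instance that receives the request forwards the query to every listed shard and merges the results, so the client only talks to one server.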


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search


ted.dunning at gmail

Jan 11, 2010, 8:57 AM

Post #7 of 13 (2314 views)
Re: Lucene vs Sphinx

Solr is an excellent ready-to-run solution, as Grant suggests.

Katta is another alternative for searching very large data sets. I know
that some are searching a corpus of roughly a billion documents using Katta.

Katta is not ready to run out of the box the way Solr is, but it will
scale to much larger collections. That said, Solr itself will do a very
decent job of scaling. There is even a Solr plugin that allows Solr to
search using a Katta cluster.
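
For anyone new to the idea, distributed search over a sharded index generally follows a scatter-gather pattern: every node searches its own slice of the index, and a coordinator merges the partial top hits into one ranked list. The Python sketch below is only a toy illustration of that pattern under my own assumptions. It is not Katta's or Solr's API, the in-memory "shards" and their precomputed scores are invented data, and a real system also has to make scores comparable across shards.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, query, k):
    """Stand-in for a per-node search: return this shard's top-k (score, doc_id) hits."""
    hits = [
        (score, doc_id)
        for doc_id, (text, score) in shard.items()
        if query in text  # naive substring match instead of a real index lookup
    ]
    return heapq.nlargest(k, hits)


def distributed_search(shards, query, k=3):
    """Query every shard in parallel, then merge the partial results into a global top-k."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: search_shard(s, query, k), shards))
    merged = [hit for partial in partials for hit in partial]
    return heapq.nlargest(k, merged)


# Toy shards: doc_id -> (text, precomputed score). Purely illustrative data.
shards = [
    {"a1": ("lucene in action", 0.9), "a2": ("sphinx manual", 0.4)},
    {"b1": ("lucene and solr", 0.7), "b2": ("katta and lucene", 0.8)},
]
top = distributed_search(shards, "lucene", k=2)
print(top)
```

Each shard only needs to return its own top k hits, because no document outside a shard's local top k can appear in the global top k.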



ashwin.jayaprakash at gmail

Jan 11, 2010, 10:20 AM

Post #8 of 13 (2303 views)
Re: Lucene vs Sphinx

The results here might prove useful -
http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/

Ashwin Jayaprakash.




gsingers at apache

Jan 11, 2010, 10:31 AM

Post #9 of 13 (2295 views)
Re: Lucene vs Sphinx

You might also note that Solr is working very actively on improving distributed search even more: http://wiki.apache.org/solr/SolrCloud

I've seen Solr installations with a billion-plus documents in production, but they require some operational caretaking. The SolrCloud work aims to reduce that considerably. For most people with smaller distributed needs (in other words, not Internet scale), the current distributed search capabilities should suffice.



marvin at rectangular

Jan 11, 2010, 10:52 AM

Post #10 of 13 (2304 views)
Re: Lucene vs Sphinx

On Mon, Jan 11, 2010 at 10:20:04AM -0800, Ashwin Jayaprakash wrote:
> The results here might prove useful -
> http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/

They're totally worthless.

    One key design decision I made was not to change any numerical tuning
    parameters. I really wanted to test “Out of the Box” performance to
    simulate the common developer scenario. Plus, it takes forever to optimize
    parameters fairly across multiple platforms and different data sets esp.
    for an over-the-weekend benchmark (see disclaimer in the Conclusion
    section).

If you're not going to tune your installation, you deserve the crappy
performance you'll get.

Exposing shoddy, weekender foolishness like this is why I want ORP to do
scientifically rigorous performance benchmarking as well as relevance
benchmarking.

Marvin Humphrey


rcmuir at gmail

Jan 11, 2010, 10:59 AM

Post #11 of 13 (2303 views)
Re: Lucene vs Sphinx

If you look at the code for these indexers, why did they pre-process
the corpus into the specific format Sphinx needs and then run 'time indexer',
while for the Lucene benchmark parsing was part of the indexer and
therefore included in the benchmarked time?
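
One way to avoid that kind of asymmetry is to time parsing and indexing as separate phases for every engine, so that like is compared with like. Here is a minimal Python sketch of such a harness. It is not the code from the blog post, and `parse_lines` and `index_in_memory` are hypothetical stand-ins for each engine's real parser and indexer.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def benchmark(name, parse, index, raw_corpus):
    """Time parsing and indexing separately so engines can be compared phase by phase."""
    docs, parse_s = timed(parse, raw_corpus)
    _, index_s = timed(index, docs)
    return {"engine": name, "parse_s": parse_s, "index_s": index_s, "docs": len(docs)}


# Hypothetical stand-ins: a real harness would call each engine's own parser and indexer.
def parse_lines(raw):
    """Split 'id<TAB>text' lines into [doc_id, text] pairs."""
    return [line.split("\t", 1) for line in raw.splitlines() if line]


def index_in_memory(docs):
    """Build a tiny inverted index: term -> set of doc ids."""
    inverted = {}
    for doc_id, text in docs:
        for term in text.lower().split():
            inverted.setdefault(term, set()).add(doc_id)
    return inverted


raw = "1\tLucene vs Sphinx\n2\tSolr distributed search\n"
report = benchmark("toy-engine", parse_lines, index_in_memory, raw)
print(report)
```

Reporting the two phases separately makes it obvious when one engine's number includes work that was done offline for another.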




--
Robert Muir
rcmuir [at] gmail


ashwin.jayaprakash at gmail

Jan 11, 2010, 1:45 PM

Post #12 of 13 (2305 views)
Re: Lucene vs Sphinx

That might be true, but it serves as a compendium of current open source
engines (albeit with debatable results), something that is missing from the
Lucene wiki.





gsingers at apache

Jan 11, 2010, 2:42 PM

Post #13 of 13 (2300 views)
Re: Lucene vs Sphinx

On Jan 11, 2010, at 4:45 PM, Ashwin Jayaprakash wrote:

> That might be true. But it serves as a compendium of current open src
> engines (with debatable results though). Something that is missing in the
> Lucene wiki.

While true, it's hardly Lucene's job to keep track of all the other open source search engines out there... :-)

-Grant
