
erik at ehatchersolutions
Oct 24, 2005, 2:44 AM
Post #2 of 3
Re: Ferret: A native Ruby port of Apache Lucene
Dave - thanks for Ferret, and your sharing it with us here. I've not had a chance to try it personally yet, but will very soon.

On 24 Oct 2005, at 01:55, David Balmain wrote:
> Just to clarify a few things, the reason I chose the name Ferret and not
> RubyLucene is because 1) rubylucene was already taken ;-)

Well, it wasn't really taken exactly. I started rucene and renamed it to rubylucene at rubyforge a long while ago and never did anything with it beyond some very very rudimentary low-level I/O work. My main hesitation, besides the zillion other time commitments, was my concern about duplicating effort given how PyLucene works with GCJ and SWIG and can come up to speed with Java Lucene almost automatically.

> and 2) I didn't want to feel tied to the Apache Lucene API. If there is a
> better way to do something in Ruby, I'd like to do it that way. Having said
> that, I've mostly stuck to the Apache Lucene API. And I intend to continue
> supporting the Apache Lucene index format. Hopefully some of the ideas in
> Ferret will one day be adopted back into the Apache Lucene project.

As long as it is compatible with the index format I think it's fair to use "lucene" in the name. You're welcome to "rubylucene" if you like :)

> As for what I've ported so far? Almost everything. All query types
> including span queries are in there.

Impressive! Lucene 1.4.3 compatibility? Or TRUNK?

> * Performance: Well, it is Ruby. However, I've written the indexer in C and
> it makes the Java version seem painfully slow. So don't expect Ferret to
> remain slower than Apache Lucene forever.

Do you think there is anything about the Java implementation that could be improved in this regard so that the difference is not so dramatic? What is the C code optimizing that the Java is not? Surely we could bring the Java implementation close to C level speed in terms of I/O, no?

While indexing speed is certainly very important, in many (most?) projects the searching speed is the main concern and indexing speed is of much less concern.

> So, if anyone wants to help out, please download it, play with it and give
> me your feedback. It's available as a gem and there is a quick tutorial
> here;
>
> http://ferret.davebalmain.com/api/files/TUTORIAL.html

You can count on it... I'll be there a little later today!

Erik