Mailing List Archive: Lucene: General

Using Lucene to index OSM nodes (400M latitude/longitude points)

 

 

kelly.terry.jones at gmail

Jun 23, 2009, 8:52 PM

Post #1 of 3
Using Lucene to index OSM nodes (400M latitude/longitude points)

Can Lucene index the openstreetmap.org (OSM) node db (400M
latitude/longitude pairs), and then find the 20 nodes closest to a
given latitude/longitude?

More specifically:

% Can Lucene index numerical data and understand that 16 is close to
15, but far away from 160000?

% Is Lucene reasonably fast indexing 400M floating point pairs?

% After Lucene creates the 400M index, can it return search results
reasonably fast?

% Is there a guide/tutorial that shows how to use Lucene to index
numerical data (I'm using Plucene, but I'll settle for any sort of
guide)?

I tried to index OSM data w/ SQLite3, but it took forever.

I realize I could use MySQL/PostgreSQL, but I'm looking for an
embedded/serverless solution.

--
We're just a Bunch Of Regular Guys, a collective group that's trying
to understand and assimilate technology. We feel that resistance to
new ideas and technology is unwise and ultimately futile.


otis_gospodnetic at yahoo

Jun 23, 2009, 9:16 PM

Post #2 of 3
Re: Using Lucene to index OSM nodes (400M latitude/longitude points)

Hi Kelly,

I think you want to look at LocalLucene (or LocalSolr). I haven't used either, so I can't offer more than this tip. I'd also suggest dropping Plucene: it's a dead project, and even when it was alive it was quite slow. If you really need to search from a Perl application, your best bet may be a Perl Solr client.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch





ted.dunning at gmail

Jun 23, 2009, 11:06 PM

Post #3 of 3
Re: Using Lucene to index OSM nodes (400M latitude/longitude points)

On Tue, Jun 23, 2009 at 8:52 PM, Kelly Jones <kelly.terry.jones[at]gmail.com> wrote:

> Can Lucene index the openstreetmap.org (OSM) node db (400M
> latitude/longitude pairs), and then find the 20 nodes closest to a
> given latitude/longitude?
>

Probably. (Isn't that definitive!)
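
A minimal sketch of one way the "20 closest nodes" lookup could work, independent of Lucene: restrict the candidates to a latitude/longitude box around the query point (the part a numeric range query would do), then rank the survivors by great-circle distance. Everything here is an assumption for illustration: the names `haversine_km` and `nearest_nodes`, the `(node_id, lat, lon)` tuple layout, and the starting box size. The list scan stands in for an index lookup, the result is approximate (a node just outside the box can be closer than one in a corner of it), and wrap-around at the 180° meridian is ignored.

```python
import heapq
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_nodes(nodes, lat, lon, k=20, box_deg=0.1):
    """Approximate k nearest nodes: box prefilter, then exact ranking.

    nodes is an iterable of (node_id, lat, lon) tuples (assumed layout).
    The box is doubled until it holds at least k candidates or covers
    every possible coordinate.
    """
    nodes = list(nodes)
    while True:
        # Stand-in for two numeric range queries (one per coordinate).
        candidates = [n for n in nodes
                      if abs(n[1] - lat) <= box_deg
                      and abs(n[2] - lon) <= box_deg]
        if len(candidates) >= k or box_deg >= 360:
            break
        box_deg *= 2
    # Rank only the candidates by true distance and keep the best k.
    return heapq.nsmallest(
        k, candidates, key=lambda n: haversine_km(lat, lon, n[1], n[2]))
```

For example, `nearest_nodes(nodes, 35.08, -106.65, k=20)` would return up to 20 tuples, closest first. The point of the prefilter is that the expensive distance calculation only runs on a small fraction of the 400M points.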

> % Can Lucene index numerical data and understand that 16 is close to
> 15, but far away from 160000?
>

As of recent versions, yes.
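
To illustrate why this needs special support: an index of plain text terms compares values as strings, so "16" and "160000" share a prefix and "9" sorts after "160000". The usual remedy is to encode each number so that term order matches numeric order. The sketch below shows that general idea using the IEEE 754 bit pattern of a double. It is not Lucene's code, and the function names are made up for this example.

```python
import struct


def double_to_sortable_int(value):
    """Map a float to an unsigned 64-bit int that sorts like the float."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    if bits & (1 << 63):
        # Negative: flip every bit so larger magnitudes sort first.
        return bits ^ 0xFFFFFFFFFFFFFFFF
    # Non-negative: set the top bit so these sort above all negatives.
    return bits | (1 << 63)


def sortable_term(value):
    """Fixed-width hex term; string order equals numeric order."""
    return format(double_to_sortable_int(value), "016x")


# Plain decimal strings do not sort numerically.
print(sorted(["9", "16", "160000"]))

# Encoded terms do, including negative values such as western longitudes.
values = [160000.0, 16.0, -106.65, 15.0, 0.0]
print(sorted(values, key=sortable_term))
```

With terms ordered this way, "everything between 15 and 17" becomes a contiguous range of terms, which is what a range query needs in order to treat 16 as near 15 and far from 160000.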


> % Is Lucene reasonably fast indexing 400M floating point pairs?
>

Should be, although I have no experience with this kind of indexing. You will
need either a fairly large amount of memory or a sharding system such as Katta.


>
> % After Lucene creates the 400M index, can it return search results
> reasonably fast?
>

With the right level of parallelism, absolutely.
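
A rough sketch of what that parallelism could look like, assuming the points are split across several independent shards: each shard returns its own best k, and the partial results are merged into a global best k. This is correct because any node in the global top k must also be in the top k of its own shard. The shard assignment (node id modulo shard count), the function names, and the flat squared-degree distance are all placeholders chosen for this example; a real setup would query separate indexes, possibly on separate machines.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def build_shards(nodes, num_shards):
    """Split (node_id, lat, lon) tuples into shards by node_id modulo."""
    shards = [[] for _ in range(num_shards)]
    for node_id, lat, lon in nodes:
        shards[node_id % num_shards].append((node_id, lat, lon))
    return shards


def squared_deg_distance(node, lat, lon):
    """Placeholder distance: squared difference in degrees (flat plane)."""
    return (node[1] - lat) ** 2 + (node[2] - lon) ** 2


def search_shard(shard, lat, lon, k):
    """Best k nodes within a single shard."""
    return heapq.nsmallest(
        k, shard, key=lambda n: squared_deg_distance(n, lat, lon))


def search_all(shards, lat, lon, k=20):
    """Query every shard concurrently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(
            pool.map(lambda s: search_shard(s, lat, lon, k), shards))
    merged = [node for part in partials for node in part]
    return heapq.nsmallest(
        k, merged, key=lambda n: squared_deg_distance(n, lat, lon))
```

Threads in CPython will not speed up this pure-Python scan; the sketch only shows the scatter-and-merge structure. The merge step handles at most k results per shard, so its cost stays small no matter how large each shard is.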


> % Is there a guide/tutorial that shows how to use Lucene to index
> numerical data (I'm using Plucene, but I'll settle for any sort of
> guide)?
>

Not really. Efficient numerical search is relatively new in Lucene.

See the slide shows on Michael Busch's LinkedIn profile:
http://www.linkedin.com/profile?viewProfile=&key=12809985&authToken=-hrn&authType=name

Also, see here:
http://wiki.apache.org/lucene-java/SearchNumericalFields



--
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
http://www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)
