Gossamer Forum

Need a quick and easy Database

I am looking to create a starting database of, say, 50,000 entries for a general (all subjects) search engine. I have tried using spidering software, but it produced lots of rubbish. I have looked into using the dmoz data, but there is far too much to take in.

Does anyone know where I can purchase a good database of links and categories, to save me the time of checking and approving 50,000 or so links?





Re: Need a quick and easy Database
I don't think there's a quick fix. I certainly haven't found it yet!

BTW, Links 2.0 won't handle 50,000 entries.

Martin Webster
--
Cebidae's UK Internet Resource
http://www.cebidae.co.uk/
Re: Need a quick and easy Database
50,000 entries is way too many for Links 2.0, but the DMOZ mod can export by subcategory, so grabbing stuff from the RDFs shouldn't be a problem. You might be able to get between 5 and 10 thousand links if you keep the size of descriptions and titles to a minimum and don't add any special fields.

--Drew
Re: Need a quick and easy Database
I'm using Links SQL and want to do three things:
1. Import a part of the DMOZ into my directory.
2. Control how the URLs I enter myself are ranked in my SQL directory.
3. Have searches return my own sites before the DMOZ sites.
Does anyone know someone who could handle this setup for me at a reasonable price?
Thanks

Re: Need a quick and easy Database
Actually, I am using Hyperseek 2000, which will handle the 50,000 I need. It's the database I am after (I thought it was very easy to convert from Links to Hyperseek), and I am willing to pay for it if someone has gone through the pain for me.

Just as a matter of interest, how easy is it to use the dmoz extractor, and how long should it take to extract from, say, 20 different categories?


Re: Need a quick and easy Database
bygeorge,
Links SQL comes with its own export utility. Since the links would become part of your database, the only way to have Links search your original records first would be to add a field indicating that a record came from DMOZ, and then force Links to sort those records last during a search. I don't know enough about Links SQL to comment on the ranking question.
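
To illustrate the "flag field, then sort last" idea, here is a minimal sketch. It uses Python's built-in sqlite3 module purely for demonstration; the table name `links`, the column `from_dmoz`, and the sample rows are all hypothetical and are not Links SQL's actual schema or search code.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a hypothetical from_dmoz flag column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links ("
        "title TEXT, url TEXT, from_dmoz INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO links VALUES (?, ?, ?)",
        [
            ("Acme Guitar Shop", "http://example.com/dmoz-guitars", 1),
            ("My Guitar Lessons", "http://example.com/my-lessons", 0),
            ("Zebra Guitar Tabs", "http://example.com/my-tabs", 0),
            ("Piano World", "http://example.com/piano", 1),
        ],
    )
    return conn


def search(conn, term):
    """Return matching rows with your own records (flag 0) before DMOZ ones (flag 1)."""
    cursor = conn.execute(
        "SELECT title, from_dmoz FROM links "
        "WHERE title LIKE ? "
        "ORDER BY from_dmoz ASC, title ASC",
        (f"%{term}%",),
    )
    return cursor.fetchall()
```

Sorting on the flag first keeps the imported records in the same table while pushing them below your own entries; any further ranking rule would go after `from_dmoz` in the `ORDER BY` clause.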

dhayden2,
Just a few minutes (if you already have the RDFs). The extractor is somewhere on the Hyperseek support site at http://www.iwebsupport.com.

--Drew
Re: Need a quick and easy Database
I have pulled the extractor off the Hyperseek site. I have no idea what to do now, though.

Do you know if there's a dummy's step-by-step guide to using it anywhere? I have tried searching through this forum and the one on Hyperseek without success.

One question I have to begin with: do I need to download the entire contents file (114 MB)?

Re: Need a quick and easy Database
Open the script and go through the whole file, looking for all the variables. Some are at the top, and a few are scattered around. Yes, you need to download the entire RDF file (content.rdf is more than 300 MB unzipped).
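
For a rough idea of what pulling a few categories out of a file that size involves, here is a minimal sketch in Python. It is not the Hyperseek extractor. It assumes each link is an `ExternalPage` element with an `about` attribute and `Title`, `Description`, and `topic` children, as in the small sample below; check the element names against your copy of the dump, which may differ or may not be strictly well-formed XML.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample in the assumed layout; the real dump may differ.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<RDF xmlns:d="http://purl.org/dc/elements/1.0/" xmlns="http://dmoz.org/rdf/">
  <ExternalPage about="http://example.com/a">
    <d:Title>A</d:Title>
    <d:Description>Desc A</d:Description>
    <topic>Top/Arts/Music</topic>
  </ExternalPage>
  <ExternalPage about="http://example.com/b">
    <d:Title>B</d:Title>
    <d:Description>Desc B</d:Description>
    <topic>Top/Science/Physics</topic>
  </ExternalPage>
</RDF>"""


def local(tag):
    """Drop the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def in_wanted(category, wanted):
    """True if category equals a wanted category or sits below one."""
    return any(category == w or category.startswith(w + "/") for w in wanted)


def extract_links(source, wanted):
    """Yield (category, url, title, description) for pages in the wanted categories.

    iterparse streams the input, so the whole file is never held in memory.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and local(elem.tag) == "ExternalPage":
            fields = {local(c.tag): (c.text or "").strip() for c in elem}
            category = fields.get("topic", "")
            if in_wanted(category, wanted):
                yield (category, elem.get("about", ""),
                       fields.get("Title", ""), fields.get("Description", ""))
            root.clear()  # discard elements that have already been processed
```

Calling `list(extract_links(io.BytesIO(SAMPLE), ["Top/Arts"]))` runs it on the sample; for a real file you would pass the file path instead of the `BytesIO` object.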

--Drew