Gossamer Forum

DMOZ dump question

I have not done much with DMOZ, other than to slice, dice and import some sections.

Right now, I'm about 2/3 of the way through importing the most recent DMOZ dump. The import rate has dropped to as low as 20,000 links an hour, which means a long, long time to go.

But I'm curious about a few things in the dump.

Does each URL appear in the dump only once? DMOZ says links can occasionally be listed in several categories. If so, the URL would show up more than once in the database. In that case, will Links add a CatLink for the existing link, import the URL again as a new link, or ignore it?
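One way to answer the first part directly would be to scan the dump itself. Below is a rough Python sketch, not anything that ships with Links. It assumes each link record in the RDF dump opens with a line like `<ExternalPage about="URL">`; the function names `count_urls` and `repeated_urls` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: every link record in the dump starts with a line of the form
#   <ExternalPage about="http://...">
EXTERNAL_PAGE = re.compile(r'<ExternalPage\s+about="([^"]*)"')

def count_urls(lines):
    """Count how many ExternalPage records carry each URL."""
    counts = Counter()
    for line in lines:
        match = EXTERNAL_PAGE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def repeated_urls(counts):
    """Keep only the URLs that occur in more than one record."""
    return {url: n for url, n in counts.items() if n > 1}

# Tiny made-up sample standing in for the real dump file.
sample = [
    '<ExternalPage about="http://example.com/">',
    '  <topic>Top/Arts</topic>',
    '</ExternalPage>',
    '<ExternalPage about="http://example.org/">',
    '<ExternalPage about="http://example.com/">',
]

dupes = repeated_urls(count_urls(sample))
print(dupes)
```

For the real thing, an open file handle can be passed to `count_urls` in place of `sample`, since it only iterates over lines. With several million URLs the counter will need a fair amount of memory.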

I tried sorting what has been imported so far by URL, and there seem to be a few blank or garbled URLs, but no actual duplicates.

The number of my CatLinks reported by MySQLMan equals the number of Links.
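The checks described above could be run as queries rather than by eyeballing a sorted table. This is only a sketch: it uses an in-memory SQLite database so it runs anywhere, and the table and column names (`Links.ID`, `Links.URL`, `CatLinks.LinkID`, `CatLinks.CategoryID`) are assumptions about the schema, not confirmed names. The same SELECT statements should work in MySQL once the names are adjusted to the real tables.

```python
import sqlite3

# Stand-in database with made-up rows; table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Links (ID INTEGER PRIMARY KEY, URL TEXT);
CREATE TABLE CatLinks (LinkID INTEGER, CategoryID INTEGER);
INSERT INTO Links VALUES
  (1, 'http://example.com/'),
  (2, 'http://example.org/'),
  (3, 'http://example.com/'),
  (4, '');
INSERT INTO CatLinks VALUES (1, 10), (2, 10), (3, 20), (4, 30);
""")

# URLs stored as more than one link row (imported again, not shared).
dup_urls = conn.execute(
    "SELECT URL, COUNT(*) FROM Links "
    "WHERE URL <> '' GROUP BY URL HAVING COUNT(*) > 1"
).fetchall()

# Blank or missing URLs.
blank_count = conn.execute(
    "SELECT COUNT(*) FROM Links WHERE URL IS NULL OR URL = ''"
).fetchone()[0]

# Links attached to more than one category through CatLinks.
multi_cat = conn.execute(
    "SELECT LinkID, COUNT(*) FROM CatLinks "
    "GROUP BY LinkID HAVING COUNT(*) > 1"
).fetchall()

print(dup_urls, blank_count, multi_cat)
```

If `multi_cat` comes back empty while `dup_urls` does not, the import is creating a new link row for each repeat rather than adding a CatLink. If both are empty and the two table counts match, nothing imported so far has been listed in more than one category.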

So I'm curious about duplicate URLs and categories. Does anyone know for sure?

PUGDOG Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.