Gossamer Forum

REPOST: DMOZ import help

// This was posted in Links Discussion by mistake //

I am working on the dmoz .rdf import. The .rdf file is in the 700 MB range (uncompressed). The import has been running for 6 days now and is nowhere near completion. At last check, Catlinks are at 667,426 and Categories are at 54,205.

The dmoz.rdf was the U8 gzip, not the normal one. I used it on Alex's suggestion. Could this be an issue?

My question is about performance. Does it sound odd that the import has been running for 6 days and is only 1/4 of the way complete? At this rate, we're talking 2-3 weeks for the total import (fingers crossed that it doesn't unexpectedly terminate along the way).

CPU usage is averaging 90-95% and memory only 10-12%. This leads me to believe it's not a performance issue on my end, as the memory isn't strapped and it's not using disk cache.

The machine: 400 MHz PII, 196 MB RAM, newest Perl, newest MySQL and mod_perl.

Any ideas on how to increase performance and lower import time?

The import will soon be running for an entire week. Does this sound odd?

The other thing is that the database is located on a separate server, about 5 feet away. Although it's close, it's still a network connection, and this leaves me wondering whether network connectivity could be the culprit. I can't imagine that it would be, but something is.
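
One rough way to sanity-check the network theory is to multiply the number of rows imported so far by an assumed cost per query. The sketch below does only that arithmetic. The queries-per-row and round-trip values are hypothetical placeholders, not measurements from this setup, and the function name is made up for illustration. Real values would come from pinging the database server and from counting the queries the import script issues per record.

```python
def network_overhead_seconds(rows: int, queries_per_row: int, round_trip_ms: float) -> float:
    """Time spent waiting on the network if every query costs one round trip."""
    return rows * queries_per_row * round_trip_ms / 1000.0


# Catlinks count reported earlier in this post.
rows_so_far = 667_426

# Hypothetical values, NOT measured on this setup; replace with real figures.
assumed_queries_per_row = 2
assumed_round_trip_ms = 1.0

overhead = network_overhead_seconds(
    rows_so_far, assumed_queries_per_row, assumed_round_trip_ms
)
print(f"estimated network wait: {overhead / 3600:.2f} hours")
```

Under these assumed values the network wait comes to well under an hour, against six days of elapsed time. The estimate scales linearly, so the network would only explain the slowness if the real round-trip time or query count were far larger than assumed here.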

As of today, the import has been running for an entire week.
Links = 777,837 and Category = 60,140.
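
To put those counts in perspective, here is a small sketch that turns them into an average insert rate. It uses only the figures above and assumes "an entire week" means exactly 7 days of continuous running. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_second(rows: int, days: float) -> float:
    """Average rows imported per second over the given number of days."""
    return rows / (days * SECONDS_PER_DAY)


# Counts reported after one week of running (assumed to be exactly 7 days).
links_after_week = 777_837
categories_after_week = 60_140
days_elapsed = 7

link_rate = rows_per_second(links_after_week, days_elapsed)
category_rate = rows_per_second(categories_after_week, days_elapsed)

print(f"links:      {link_rate:.2f} rows/s")
print(f"categories: {category_rate:.3f} rows/s")
```

That works out to a little over one link per second on average.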

Can someone help?

Re: REPOST: DMOZ import help
You can delete your threads by clicking Edit, then clicking the DELETE button.


Eliot Lee