Importing dmoz data revisited

Hi:

I have read through and followed the directions in the previous topic (which is great; I hope Pugdog adds that one to his FAQ!).

Anyway, I got the 80 meg file to my server fine, and I expanded it fine. I ran Parse_RDF.pl (yes, from Telnet) and it goes through about 5000 to 6000 topics (all of Adult, and into Arts/Music or thereabouts), then the word QUIT (in caps) appears and it stops. It always seems to stop at a different point, too (for what that is worth).

I know there is a place in Parse_RDF.pl for the number of topics (preset to 5000, I think), and I have upped that to 9999999, with no effect.

Does anyone know why this is happening? It never even gets to the topics I want, so I cannot import anything. Any ideas would be helpful....

Thanks!

Dave
Re: Importing dmoz data revisited
Are you on a virtual host?

When it stops, does it stop after the same amount of _time_ each run?

Some ISPs limit a running process to 30 minutes or less to keep runaway processes from taking down the machine.
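
One way to answer the "same amount of time" question is to run the import under a small wrapper that records how long it ran and how it ended. This is only a rough sketch, written in Python for illustration (Parse_RDF.pl itself is Perl); the names timed_run and describe are made up here, and it assumes a Unix-like host where Python is available.

```python
import signal
import subprocess
import time


def timed_run(cmd):
    """Run cmd to completion and return (wall_clock_seconds, returncode)."""
    start = time.monotonic()
    proc = subprocess.run(cmd)
    return time.monotonic() - start, proc.returncode


def describe(returncode):
    """Turn a subprocess return code into a readable description."""
    # On POSIX, subprocess reports "killed by signal N" as returncode -N.
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


# Example use (the command line is an assumption about how the import is started):
#   elapsed, rc = timed_run(["perl", "Parse_RDF.pl"])
#   print("ran %.1f s, %s" % (elapsed, describe(rc)))
```

If several runs all end after roughly the same number of seconds and report being killed by a signal, that points at a host-side limit rather than at the data.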



------------------
POSTCARDS.COM -- Everything Postcards on the Internet www.postcards.com
LinkSQL FAQ: www.postcards.com/FAQ/LinkSQL/

Re: Importing dmoz data revisited
pugdog:

Thanks for the reply. That is something I have considered... my ISP does NOT allow unattended telnet bots (it says so at login). I always keep the window open, so it is attended. It is a virtual host (I think!), so maybe I need to talk to them. However, the time involved is very short, 3-5 minutes. It took MUCH longer to download the file, and that did not time out...

Dave
Re: Importing dmoz data revisited
There tends to be a bit more leeway for system processes (FTP, mail, Telnet, etc.) than for .cgi processes, i.e. anything running as user nobody.

But, before you drive yourself crazy, ask them <G> You might find they have a bot going around killing anything it doesn't like <G>

Also, they may have a limit on CPU time. The FTP process uses less CPU than the import program, so a long download can finish while a short, CPU-heavy import gets cut off. They may limit a process by CPU time -- or percent -- as well as by overall run time.
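
To tell a CPU-time limit apart from a wall-clock limit, it can help to look at the limit the shell session hands to its child processes and at how much CPU the import actually burned before it stopped. A minimal sketch in Python, for illustration only; cpu_limit and child_cpu_seconds are hypothetical helper names, and this only sees per-process limits (the kind set with ulimit), not a watchdog script the host may run separately.

```python
import resource
import subprocess


def cpu_limit():
    """Return the (soft, hard) CPU-time limits in seconds; None means unlimited."""
    def clean(value):
        return None if value == resource.RLIM_INFINITY else value

    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    return clean(soft), clean(hard)


def child_cpu_seconds(cmd):
    """Run cmd and return (cpu_seconds_used_by_child, returncode)."""
    # RUSAGE_CHILDREN accumulates user + system CPU time of waited-for children,
    # so the difference before and after the run belongs to this command.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    returncode = subprocess.run(cmd).returncode
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    used = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return used, returncode


# Example use (the command line is an assumption about how the import is started):
#   print("CPU limit (soft, hard):", cpu_limit())
#   print("CPU seconds used, return code:", child_cpu_seconds(["perl", "Parse_RDF.pl"]))
```

If the CPU seconds used come out close to the soft limit on every run, the limit is the likely culprit even though the stopping point in the data varies.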


