Gossamer Forum

DMOZ SERVER SHARE... want a piece of DMOZ?

My host has offered to rent me one of his servers for a couple of days next week. He will download and unzip all of DMOZ and then make sections/categories available via ./nph-import.cgi to me (or to us?). I would like to share the cost with anyone else who is interested.

Any interest or ideas, please let me know!

Thanks
Ryan
HomerUSA

Re: DMOZ SERVER SHARE... want a piece of DMOZ?
People are already doing this, but it is up to you. I'm sure a lot of people will be interested.

If your host is willing to spend a week importing the whole of DMOZ then good luck to him :)

Paul Wilson.
http://www.wiredon.net/gt/
http://www.perlmad.com/
Re: DMOZ SERVER SHARE... want a piece of DMOZ?
Paul,

On a fast machine with enough RAM, the DMOZ import takes only about 10 hours. On a multi-CPU machine it might take less.

The speed issues on import are directly related to the hardware and to what else the CPU is doing during the import.

If I got a little more revenue support from the forums here, either in module registrations or consulting work, I could have a separate machine here doing the DMOZ import on a weekly basis. Exporting the data from Links into a Links-importable database is not as much of a problem -- it would take a bit of work, but it could be done. It's a matter of not putting IDs on the links, and either merging the categories into an existing database or starting the category count at the next free category ID, and updating the "CatLinks" table accordingly.
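To make the ID-offset idea concrete, here is a minimal sketch in Python, not anything taken from Links SQL itself. It assumes categories are held as a mapping of ID to (name, parent ID), that a parent ID of 0 means "top level", and that category-to-link rows are keyed by URL because the links carry no IDs yet. The function name and data layout are hypothetical.

```python
def merge_categories(existing_cats, new_cats, new_catlinks):
    """Renumber imported categories so they start after the existing ones.

    existing_cats, new_cats: dict of category ID -> (name, parent ID).
    new_catlinks: list of (link URL, category ID) rows for the imported data.
    A parent ID of 0 marks a top-level category (assumption of this sketch).
    """
    # Imported IDs are shifted by the highest ID already in use.
    offset = max(existing_cats, default=0)

    merged = dict(existing_cats)
    for cat_id, (name, parent_id) in new_cats.items():
        # Top-level categories keep parent 0; all others are shifted too.
        new_parent = parent_id + offset if parent_id else 0
        merged[cat_id + offset] = (name, new_parent)

    # The CatLinks-style rows must point at the renumbered categories.
    shifted_links = [(url, cat_id + offset) for url, cat_id in new_catlinks]
    return merged, shifted_links
```

In a real database the same shift would be applied with UPDATE statements on the category and "CatLinks" tables; the in-memory version only shows which IDs have to move together.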

I need about $2k of direct revenue from this forum (i.e., Links SQL consulting, registrations, support, installs, etc.) in order to offer this service. I'm only about halfway there at this point.

From our (PUGDOG Enterprises, Inc) point of view, I need to set up this sort of machine, and export capability, because I want to set up for display and sale about 300+ Links/Directory sites over the next 6 months.

In order to do that, there are still some bugs/quirks in Links 2.x that need to be worked out, and I have to have the file upload/postcards, and hopefully the Image Gallery mods all in operation.

But the offshoot is that for very reasonable fees (on the order of $15 to $100) I could offer portions of DMOZ in Links-digestible/importable, even UPDATABLE formats!!

This is the last project on my list of "must do's" because it's such a demand-driven thing, because if I wait long enough maybe other people will start writing code that can help, and because for the short term I can do it all via interactive SQL commands on a fast machine in the middle of the night.

But automating the process so people can do keyword or category-select downloads/imports, etc., is something I don't have time for, _UNLESS_ there is a direct revenue stream.

Unfortunately, because of my restrictions on time owed to my primary job, I can't do this unless the revenue is already in place -- or it is part of my primary job.

I figure in about 6 months I'll have the "time" to do this as part of my primary job. The more revenue generated here, the higher up I can push this, and the sooner it will get done.

It's an economic reality. None of us -- except GT -- is getting directly paid for anything done here. It's all back-end, side deals or profits off our running sites -- and we run FREE sites! So, people who need programming or services are going to have to pay their own way -- there are no advertisers doing it any more!

Make sense?

It's just reality.

I'll have the 01.02.xx version of the graphics/logo mod out today or tomorrow, and will freeze the features. I will start with the 02.xx multiple attachment version right after that, and the postcards plug-in should be released this week as well.

Hopefully that will help my situation, but right now I'm time-crunched, and revenue starved from all of this "new" programming.

AND!!! Don't forget, I've agreed (offered, foolishly raised my hand) to coordinate and start off the Ratings/Review mod as a group project/learning experience -- and this may be the grandest project I've ever undertaken. The possibilities are endless!! (Even making it a Links-specific chunk of code!)



PUGDOG® Enterprises, Inc.
FAQ:http://LinkSQL.com/FAQ
Forum:http://LinkSQL.com/forum
Re: DMOZ SERVER SHARE... want a piece of DMOZ?
Paul

Who is already doing this? I'd rather just buy the categories that I want... if the price is reasonable. Do you know where I could do this? From whom? For how much?

BTW: Which part of this DMOZ process is the part that everyone has trouble with? Moving the 130 MB gzip file from dmoz.org to my host's server is fast and easy... with a dedicated server I assume gunzipping it would not be a big deal... I thought the next step was for folks to select/transfer the categories they want from the 'dedicated' server to the server where their Links SQL is located (with the Links import.cgi software)? Is this the part that takes days and days (even when both servers are connected to the internet via T-1/T-3)?

Thanks
Ryan
HomerUSA
ps. I thought you were 'cutting up' and selling categories?

Re: DMOZ SERVER SHARE... want a piece of DMOZ?
Transfer of the DMOZ file takes only about 2-3 minutes.

Unzipping it takes a minute or two, depending on your system RAM.

The _import_ is what takes the time: reading the DMOZ file, then importing it. I think this has to do with how recursive DMOZ is, and the depth of the data structures. If this were done in a 2-3 pass process, system resource usage could probably be cut.
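As a rough illustration of what a multi-pass import could look like (a sketch of the idea only, not how the Links importer works), the Python below streams the category records once to decide which categories are wanted, then streams the link records once and keeps only the matching ones. The record layouts, the "/"-separated category paths, and both function names are assumptions made for the example.

```python
def select_categories(structure_records, prefix):
    """Pass 1: stream (category ID, path) records and collect wanted IDs."""
    wanted = set()
    for cat_id, path in structure_records:
        # Keep the chosen category itself and everything beneath it.
        if path == prefix or path.startswith(prefix + "/"):
            wanted.add(cat_id)
    return wanted


def select_links(content_records, wanted):
    """Pass 2: stream (category ID, URL, title) records, yielding matches."""
    for cat_id, url, title in content_records:
        if cat_id in wanted:
            yield cat_id, url, title
```

Because each pass only holds a set of category IDs in memory and reads its input one record at a time, neither pass needs the whole category tree loaded at once.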

Until I actually try to do this, I wouldn't know for sure.



PUGDOG® Enterprises, Inc.
FAQ:http://LinkSQL.com/FAQ
Forum:http://LinkSQL.com/forum
Re: DMOZ SERVER SHARE... want a piece of DMOZ?
Hi, which category do you want?

Paul Wilson.
http://www.wiredon.net/gt/
http://www.perlmad.com/
Re: DMOZ SERVER SHARE... want a piece of DMOZ?
Paul,
It depends on the size & cost...

Ideally, I'd like:

Top: Regional: North America: United States (375k)

If that is too much:

Top: Regional: North America: United States: Alaska (4k)
Top: Regional: North America: United States: Oregon (7k)
Top: Regional: North America: United States: Nevada (3k)

If that is still too much:

Top: Regional: North America: United States: Alaska: Boroughs: Kenai Peninsula
Top: Regional: North America: United States: Oregon: Counties: Jackson
Top: Regional: North America: United States: Nevada: Localities: C: Carson City

I can't imagine running more than 50 MB on my shared server account.

Thanks
Ryan
ps. Is there some easy way to store/backup/delete whole categories 'all at once'?

Re: [HomerUSA] DMOZ SERVER SHARE... want a piece of DMOZ?
Hi

I need a flat-file database of the categories and links in this DMOZ category:

http://www.dmoz.org/World/Hrvatski/

It contains 2,772 links in total.

The subcategories listed in the directory's "Hrvatska" category should become the top-level categories in the flat-file database.
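For anyone preparing such an export, promoting subcategories to top level amounts to stripping the parent path from each category path. A minimal Python sketch follows, assuming categories are identified by "/"-separated paths like the one in the URL above. The function name is hypothetical, and the subcategory names in the usage note are made up for illustration.

```python
def reroot(paths, root):
    """Make the children of `root` top-level by stripping the root prefix.

    Paths outside `root`, and `root` itself, are dropped from the result.
    """
    prefix = root.rstrip("/") + "/"
    rerooted = []
    for path in paths:
        if path.startswith(prefix):
            # Keep only the part of the path below the chosen root.
            rerooted.append(path[len(prefix):])
    return rerooted
```

For example, with root "World/Hrvatski", an illustrative path "World/Hrvatski/Sport/Nogomet" would become "Sport/Nogomet", so "Sport" ends up as a top-level category.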


Can anyone do me a favor?

Thanks in advance.