Gossamer Forum

DMOZ wizard basic question

Just a quick question about Andy's DMOZ wizard...

For the script to work, do I need to download the whole content.rdf.u8 file and upload it to my server? All 1.x gigs' worth, just to import a handful of categories?

thanks,



Comedy Quotes - Glinks 3.3.0, PageBuilder, StaticURLtr, CAPTCHA, User_Edit_Profile

Re: [Chas-a] DMOZ wizard basic question In reply to
How many links are in that 'handful of categories'?
They can vary in size somewhat.
Re: [Gypsypup] DMOZ wizard basic question In reply to
 
Indeed. I'd say about 400 sites.

The reason I ask is that after I ran the SSH command to import the selected subcategories from DMOZ, the script couldn't find the content.rdf.u8 file locally, which suggests I'll need the whole file on the server?

-- keeping it noddy-proof




Re: [Chas-a] DMOZ wizard basic question In reply to
Hi. Yeah, Paul is right, I'm afraid. It has to grab the 1+ GB file before it can extract the specific area you want to import (into a .dump.slice file), and then it will run through and import the links/categories for you. The .dump.slice files obviously vary quite a bit in size, depending on the size of the category/categories you are trying to import.

Hope that helps.

Cheers

Andy (mod)
andy@ultranerds.co.uk
Want to give me something back for my help? Please see my Amazon Wish List
GLinks ULTRA Package | GLinks ULTRA Package PRO
Links SQL Plugins | Website Design and SEO | UltraNerds | ULTRAGLobals Plugin | Pre-Made Template Sets | FREE GLinks Plugins!
Re: [Andy] DMOZ wizard basic question In reply to
 

I see the latest content.rdf.u8.gz file is 1.85 gig compressed. Does the DMOZ wizard need to uncompress this to function? Will I need more than 1.85 gigs of space on my server for the plugin to run?

Cheers Andy,

Charlie
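
On the disk space question, here is a rough sketch for checking free space before downloading, assuming a POSIX shell with df and awk available. As far as I know, `gzip -d` only deletes the .gz once the uncompressed copy has been fully written, so both files sit on disk together for a while. The helper name free_kb is made up for illustration and is not part of DMOZ_Wizard.

```shell
#!/bin/sh
# Hypothetical helper (not part of DMOZ_Wizard): print the free space,
# in kilobytes, on the filesystem that holds the given directory.
free_kb() {
    dir=$1
    # df -P forces the portable one-line-per-filesystem layout and -k reports
    # 1024-byte blocks; field 4 of the second line is the "Available" column.
    df -Pk "$dir" | awk 'NR == 2 { print $4 }'
}

# Example: free_kb /full/path/to/admin
```

Compare the number it prints against the combined size of the compressed and uncompressed dump.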



Re: [Chas-a] DMOZ wizard basic question In reply to
Hi. If you have zlib, it *should* be possible to avoid having to extract the .gz file (i.e. the script will read the compressed file, content.rdf.u8.gz, directly). It may be a little slower making the slice file, but it shouldn't impact the actual import process itself.

Please let me know if you would like instructions on how to use zlib.
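
To illustrate the general idea of reading the dump while it is still compressed, here is a minimal sketch using plain gzip in a POSIX shell. It is not how DMOZ_Wizard does it internally; the helper name count_category_lines and the example category path are assumptions made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of DMOZ_Wizard): count the lines in a
# gzip-compressed dump that mention a category path, without ever writing
# the decompressed data to disk.
count_category_lines() {
    archive=$1      # e.g. content.rdf.u8.gz
    category=$2     # e.g. Top/Regional/Europe
    # gzip -dc streams the decompressed data to stdout and leaves the .gz
    # untouched; grep -F matches the category as a fixed string and -c
    # prints the number of matching lines.
    gzip -dc "$archive" | grep -F -c "$category"
}

# Example: count_category_lines content.rdf.u8.gz "Top/Regional/Europe"
```

The trade-off is the one mentioned above: the archive has to be decompressed on the fly on every pass, which costs CPU time but saves the disk space of the extracted file.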

Hope that helps.

Cheers

Andy (mod)
Re: [Andy] DMOZ wizard basic question In reply to
   
Hi,

It would be great if you could detail the usual method people use to import DMOZ links using the Wizard.

Is the alternative to using zlib (not mentioned in the Links SQL forum before) to manually extract the file and then upload the uncompressed version to the server?

- I just extracted the content.rdf.u8 file (using WinZip) and that file's size is 292 MB.


Thanks for your help fella.




Last edited by:

Chas-a: Jul 5, 2004, 2:47 AM
Re: [Chas-a] DMOZ wizard basic question In reply to
Quote:
It would be great if you could detail the usual method people use to import DMOZ links using the Wizard.

It's not as simple as that, I'm afraid :( DMOZ_Wizard already uses nph-import.cgi. All it does is slice up the main RDF file (into smaller, more manageable slices, which speeds the process up), and then run the appropriate nph-import.cgi commands (automatically).

Quote:
- I just extracted the content.rdf.u8 file (using WinZip) and that file's size is 292 MB.

Only 300 MB? Decompressed? It's normally that size while it's still compressed.
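
One way to rule out a damaged or incomplete download before spending hours uploading anything is to test the archive itself. This is a minimal sketch, assuming gzip is available in a POSIX shell; the helper name check_gz is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether a .gz archive passes gzip's own
# integrity test (gzip -t decompresses in memory and verifies the checksum).
check_gz() {
    archive=$1
    if gzip -t "$archive" 2>/dev/null; then
        echo "OK"
    else
        echo "CORRUPT"
    fi
}

# Example: check_gz content.rdf.u8.gz
```

`gzip -l content.rdf.u8.gz` also lists the compressed and uncompressed sizes, but the uncompressed figure is stored in a 32-bit field, so it can be misleading for files of 4 GB or more.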

Quote:
Is the alternative to using zlib (not mentioned in the Links SQL forum before) to manually extract the file and then upload the uncompressed version to the server?

It's not a simple mod to make... but if you only envisage doing it a couple of times, then it shouldn't be too much of a task.

Basically, you need to look for the part which says:

`gzip -d /full/path/to/admin/content.rdf.u8.gz`;

...and then change all references to content.rdf.u8 to content.rdf.u8.gz.

Basically, that stops the file from being decompressed, and the script attempts to read it while it's still compressed (which just saves space).

Hope that helps.

Cheers

Andy (mod)
Re: [Andy] DMOZ wizard basic question In reply to
 
Andy, thanks.

Almost got it now, I think...

Leaving the content.rdf.u8.gz file aside (as the DMOZ wizard doesn't refer to it by default), what method would I use to add the uncompressed file to the Links SQL admin folder (where the DMOZ Wizard can access it)?

1. Manually extract the content.rdf.u8 file using an extractor program (for example, WinZip) and upload it to the admin folder?

or 2. upload the compressed version to the admin folder and extract the content.rdf.u8 file using a Shell command?

If it's 2, would you mind posting the SSH commands for this?

Thanks again, fella,



Re: [Chas-a] DMOZ wizard basic question In reply to
Hi,

Quote:
1. Manually extract the content.rdf.u8 file using an extractor program (for example, WinZip) and upload it to the admin folder?

Unless you have a LOT of RAM on your computer, I wouldn't recommend it :(

Send me a PM with the category that you want to import, and I'll make up a slice for you. That will be a lot easier, as it should only be a few MB instead of a few thousand :D

Quote:
or 2. upload the compressed version to the admin folder and extract the content.rdf.u8 file using a Shell command?

The way I do it is;

wget http://rdf.dmoz.org/rdf/content.rdf.u8.gz <enter>
gzip -d content.rdf.u8.gz <enter>

.. and then you have it on your server. Not sure if that's what you were asking.

BTW.. just going on lunch,... so you don't think I'm ignoring you :D

Cheers

Andy (mod)
Re: [Andy] DMOZ wizard basic question In reply to
Quote:
Unless you have a LOT of RAM on your computer, I wouldn't recommend it :(

512k RAM, so I can extract it NP, except the file size comes out as 292 MB, which suggests it's not extracting correctly. The trouble is that even on broadband it'll take a few hours to upload it and check whether it extracted correctly.

Cheers for the offer, btw.

Quote:
The way I do it is;

wget http://rdf.dmoz.org/rdf/content.rdf.u8.gz <enter>
gzip -d content.rdf.u8.gz <enter>

Sounds like the best option, except it would seem wget isn't installed.

wget http://rdf.dmoz.org/rdf/content.rdf.u8.gz <enter>
bash: wget: command not found

Here's the full list of commands I've got in shell:


GNU bash, version 2.05a.0(1)-release (i686-pc-linux-gnu)
These shell commands are defined internally. Type `help' to see this list.
Type `help name' to find out more about the function `name'.
Use `info bash' to find out more about the shell in general.

A star (*) next to a name means that the command is disabled.

%[DIGITS | WORD] [&] . filename
: [ arg... ]
alias [-p] [name[=value] ... ] bg [job_spec]
bind [-lpvsPVS] [-m keymap] [-f fi break [n]
builtin [shell-builtin [arg ...]] case WORD in [PATTERN [| PATTERN].
cd [-PL] [dir] command [-pVv] command [arg ...]
compgen [-abcdefgjksvu] [-o option complete [-abcdefgjkvu] [-pr] [-o
continue [n] declare [-afFrxi] [-p] name[=value
dirs [-clpv] [+N] [-N] disown [-h] [-ar] [jobspec ...]
echo [-neE] [arg ...] enable [-pnds] [-a] [-f filename]
eval [arg ...] exec [-cl] [-a name] file [redirec
exit [n] export [-nf] [name ...] or export
false fc [-e ename] [-nlr] [first] [last
fg [job_spec] for NAME [in WORDS ... ;] do COMMA
function NAME { COMMANDS ; } or NA getopts optstring name [arg]
hash [-r] [-p pathname] [-t] [name help [-s] [pattern ...]
history [-c] [-d offset] [n] or hi if COMMANDS; then COMMANDS; [ elif
jobs [-lnprs] [jobspec ...] or job kill [-s sigspec | -n signum | -si
let arg [arg ...] local name[=value] ...
logout popd [+N | -N] [-n]
printf format [arguments] pushd [dir | +N | -N] [-n]
pwd [-PL] read [-ers] [-t timeout] [-p promp
readonly [-anf] [name ...] or read return [n]
select NAME [in WORDS ... ;] do CO set [--abefhkmnptuvxBCHP] [-o opti
shift [n] shopt [-pqsu] [-o long-option] opt
source filename suspend [-f]
test [expr] time [-p] PIPELINE
times trap [arg] [signal_spec ...] or tr
true type [-apt] name [name ...]
typeset [-afFrxi] [-p] name[=value ulimit [-SHacdflmnpstuv] [limit]
umask [-p] [-S] [mode] unalias [-a] [name ...]
unset [-f] [-v] [name ...] until COMMANDS; do COMMANDS; done
variables - Some variable names an wait [n]
while COMMANDS; do COMMANDS; done { COMMANDS ; }


Andy, what do you think the best workaround for this would be? Install wget (if possible)?

thanks,

Charlie
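
Note that bash's `help` only lists shell builtins, so the listing above says nothing about which external programs are installed. A minimal sketch for checking what is on the PATH and falling back to curl when wget is missing, assuming a POSIX shell; the helper names pick_downloader and fetch are made up for illustration, and curl may not be installed on this host either.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the first download tool found on PATH.
pick_downloader() {
    for tool in wget curl; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "none"
    return 1
}

# Hypothetical helper: fetch a URL into the current directory with
# whichever tool is available.
fetch() {
    url=$1
    case "$(pick_downloader)" in
        wget) wget "$url" ;;
        curl) curl -O "$url" ;;   # -O saves under the remote file name
        *)    echo "no download tool found; ask the host to install one" >&2
              return 1 ;;
    esac
}

# Example: fetch http://rdf.dmoz.org/rdf/content.rdf.u8.gz
```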





Last edited by:

Chas-a: Jul 5, 2004, 5:43 AM
Re: [Andy] DMOZ wizard basic question In reply to
 
Quote:
BTW.. just going on lunch,...

Well I guess that's it for today then....



Re: [Chas-a] DMOZ wizard basic question In reply to
Mmmm... how many links are you looking at importing? Paul made a script a while back which you can run on your PC (if you have Perl etc. installed locally); it cuts the content.rdf.u8 file down to a smaller extract of the appropriate category/links. You can then manually upload this to your server and, instead of setting the source as "content.rdf.u8", put your file name there. Not sure if that makes sense.

Cheers

Andy (mod)
Re: [Chas-a] DMOZ wizard basic question In reply to
In Reply To:
Well I guess that's it for today then.... Wink

LOL! I had a long weekend ;)

Cheers

Andy (mod)
Re: [Andy] DMOZ wizard basic question In reply to
Ahhh, yes, I remember seeing a post about this. I've got a version of Apache on Windows XP Pro; it should be easy enough to install Perl to go with it, do you think?

Is this the one you were thinking of?

http://gossamer-threads.com/perl/gforum/gforum.cgi?post=189761

If so, it looks like I'll also need to run a shell command locally?

I'm looking to import about 40 categories with, on average, 20 links in each, checking for updates every month or so.

In Reply To:
In Reply To:
Well I guess that's it for today then.... Wink

LOL! I had a long weekend ;)

Cheers


Hehe, NP




Last edited by:

Chas-a: Jul 5, 2004, 7:46 AM
Re: [Chas-a] DMOZ wizard basic question In reply to
Quote:
Ahhh, yes, I remember seeing a post about this. I've got a version of Apache on Windows XP Pro; it should be easy enough to install Perl to go with it, do you think?

I have to admit... I used a package. No idea how easy it is to install on a local PC manually :(

Quote:
Is this the one you were thinking of?

Nope :p This one:

http://www.gossamer-threads.com/...i?post=246319#246319

Quote:
If so, it looks like I'll also need to run a shell command locally?

You should be able to do it via DOS if it's correctly set up. I normally do it with something like:

Code:
cd c:/Programs~1/wwwroot/cgi-bin/
perl test.cgi --options etc

Hope that helps.

Cheers

Andy (mod)
Re: [Andy] DMOZ wizard basic question In reply to
Managed to install wget on my server, downloaded the content.rdf.u8.gz file and unzipped it.

Got a couple of questions for you, Andy.

When I enter a category to import, I get the following error:

Error: Can't open out file '/server/path/to/cgi-bin/admin/Some_Dmoz_Category.dump'. Reason: Permission denied

The script isn't writing the .dump or .dump.slice files to the server, but when I manually upload a blank text file called Some_Dmoz_Category.dump to that path and chmod it to 777, it works.

Any idea what could be causing this?
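
One possible reading of the error, not confirmed anywhere in this thread: the user the web server runs the script as can write to an existing world-writable file but cannot create new files in the admin directory. A minimal sketch for checking that, assuming a POSIX shell; the helper name can_write_dir is made up for illustration, and the result only means something when the check runs as the same user the script runs as.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can create files
# in a directory (needs both write and search permission on it).
can_write_dir() {
    dir=$1
    if [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "writable"
    else
        echo "not writable"
    fi
}

# Example: can_write_dir /server/path/to/cgi-bin/admin
# Also useful: "ls -ld <dir>" shows the directory's owner and mode,
# and "id" shows which user the current shell is running as.
```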

And is it possible to define the Links SQL category that the DMOZ dump links are imported into? Is there a command I can enter to override the default DMOZ category structure?

Thanks for your help,

Charlie



Re: [Chas-a] DMOZ wizard basic question In reply to
Hi. Not sure about the permissions problem.

Regarding the destination category: try editing dmoz_cron.cgi and changing the following to whatever category name you want:

Code:
--rdf-destination="Regional"

Hope that helps.

Cheers

Andy (mod)
Re: [Andy] DMOZ wizard basic question In reply to
It seemed to work on the first job I set up yesterday, so I uninstalled the plugin, deleted all the related files via FileMan and reinstalled. I'm still getting permission errors when I don't manually upload blank files set to 777.

"Try editing dmoz_cron.cgi"

I tried this (editing in FileMan) and it rewrites the path for each category import. It looks like I would need to edit this manually for every category, and I wouldn't be able to use the automated update (cron) at all...

I'd say this plugin's good at full category dumps that duplicate the ODP's structure but is very limited for any custom imports. Hopefully it will be more scalable with the next few updates/versions.




Re: [Chas-a] DMOZ wizard basic question In reply to
I did put that feature in a while back... Just forgot about it.

Try entering this for specific categories;

Regional/Europe:::My_Europe_Category

... this should put all links from Regional/Europe, into Root/My_Europe_Category.
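
For a long list of categories, the Source:::Destination entries could be generated from a plain list rather than typed one by one. A minimal sketch, assuming a POSIX shell; the helper name make_mappings and the input layout (one "source destination" pair per line) are made up for illustration, and whether the wizard accepts several entries at once is not stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read "source destination" pairs, one per line, from
# stdin and print them in the Source:::Destination form shown above.
make_mappings() {
    while read -r src dest; do
        # Skip blank lines and lines starting with '#'.
        case "$src" in
            ''|'#'*) continue ;;
        esac
        printf '%s:::%s\n' "$src" "$dest"
    done
}

# Example:
#   printf 'Regional/Europe My_Europe_Category\n' | make_mappings
```

Each printed line can then be pasted into the wizard in turn.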

Hope that helps.

Andy (mod)
Re: [Andy] DMOZ wizard basic question In reply to
In Reply To:
I did put that feature in a while back... Just forgot about it.

And you're a professional plugin developer?!?

Quote:
Try entering this for specific categories;

Regional/Europe:::My_Europe_Category

... this should put all links from Regional/Europe, into Root/My_Europe_Category.

It does work; that's not the problem. The issue I have with it is that you need to change the path for each category, so 40+ categories == pain in the a**.

Thanks for your help,

Charlie


