Gossamer Forum

Parse_RDF.pl Suggestion

Instead of having the script parse every single category, wouldn't it be more efficient to "scroll" to where the category starts and then parse from there? Right now it takes a long time :) I'm on a P2 350 with 64 MB of RAM.
Re: Parse_RDF.pl Suggestion
I've been trying to think of a good way to do it.

I'm not sure what is being kept track of, but scanning ahead until at least the first hit is found would certainly cut a lot of time.

Since the file is now huge, I'm still hoping Dmoz will offer it broken up by top-level category, rather than as one big file.

Re: Parse_RDF.pl Suggestion
I think the next best way is to use Joe's editor (like you said): find the top-level category, mark a block from there to the very end of the file, write it out to a new file, and then run the script on that file. Once your category is finished, you can Ctrl-C it.

Joe's editor is easy to find at rpmfind.net. Here are the instructions for working with blocks:

"If you want to move, copy, save or delete a specific section of text, you can do it with highlighted blocks.
First, move the cursor to the start of the section of text you want to work on, and press ^K B. Then move the
cursor to the character just after the end of the text you want to affect and press ^K K. The text between the
^K B and ^K K should become highlighted. Now you can move your cursor to someplace else in your document and
press ^K M to move the highlighted text there. You can press ^K C to make a copy of the highlighted text and
insert it to where the cursor is positioned. ^K Y deletes the highlighted text. ^K W writes the highlighted
text to a file.

A very useful command is ^K /, which filters a block of text through a unix command. For example, if you
select a list of words with ^K B and ^K K, and then type ^K / sort, the list of words will be sorted. Another
useful unix command for ^K /, is tr. If you type ^K / tr a-z A-Z, then all of the letters in the highlighted
block will be converted to uppercase.

After you are finished with some block operations, you can just leave the highlighting on if you don't mind it
(of course, if you accidentally hit ^K Y without noticing...). If it really bothers you, however, just hit ^K B
^K K, to turn the highlighting off."

(From the joe man page.)
Re: Parse_RDF.pl Suggestion
Trouble is... the file is now so huge, I wouldn't want to try to pop it into an editor on any production machine. It's grown almost 30% since the last time I tried.
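
For what it's worth, the "block from the category to the end and write it to a file" step could probably be done as a streaming copy, so the dump never has to be opened in an editor at all. Here is a minimal Python sketch (not part of parse_rdf.pl, which is Perl). It assumes each category starts on a line that looks like <Topic r:id="Top/Business">; that line format, the function name copy_from_category, and the sample data are illustrative assumptions, not something confirmed in this thread.

```python
import io


def copy_from_category(src, dst, category):
    """Copy lines from src to dst, starting at the first <Topic> line whose
    id begins with `category` and continuing to the end of the input.

    Assumption: categories are introduced by lines containing
    <Topic r:id="...">. Adjust the marker if the dump looks different.
    Returns the number of lines written.
    """
    marker = '<Topic r:id="' + category
    copying = False
    copied = 0
    for line in src:  # reads one line at a time, so memory use stays small
        if not copying and marker in line:
            copying = True
        if copying:
            dst.write(line)
            copied += 1
    return copied


# Tiny in-memory demonstration with made-up data.
sample = io.StringIO(
    '<Topic r:id="Top/Arts">\n'
    '  <link r:resource="http://example.com/a"/>\n'
    '<Topic r:id="Top/Business">\n'
    '  <link r:resource="http://example.com/b"/>\n'
)
out = io.StringIO()
copy_from_category(sample, out, "Top/Business")
print(out.getvalue(), end="")
```

For the real dump you would pass two file objects from open() instead of the StringIO buffers. Everything before the category is still read once, but nothing is parsed or held in memory.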

Re: Parse_RDF.pl Suggestion
I'll add this to my parse_rdf.pl todo list. =) It shouldn't be too hard: at the beginning of the while loop, just skip lines until you hit a matching topic category.
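
To make the idea concrete, here is a rough Python sketch of the skip logic. It is not the actual parse_rdf.pl code (that is Perl, and nothing here is taken from it). It assumes topics appear as <Topic r:id="..."> lines and that a category and its subcategories sit next to each other in the file; the names TOPIC_RE and lines_for_category are made up for illustration. It also stops at the first topic outside the category, which automates the Ctrl-C step suggested earlier in the thread.

```python
import re

# Assumed shape of a category line in the dump: <Topic r:id="Top/Some/Path">
TOPIC_RE = re.compile(r'<Topic r:id="([^"]*)"')


def lines_for_category(lines, category):
    """Yield only the lines that belong to `category` or its subcategories.

    Lines before the first matching <Topic> are skipped without further
    work, and iteration stops at the first <Topic> outside the category.
    """
    inside = False
    for line in lines:
        m = TOPIC_RE.search(line)
        if m:
            topic = m.group(1)
            matches = topic == category or topic.startswith(category + "/")
            if inside and not matches:
                return  # left the category, so stop reading
            inside = matches
        if inside:
            yield line


# Made-up sample standing in for the real dump.
sample = [
    '<Topic r:id="Top/Arts">\n',
    '  <link r:resource="http://example.com/a"/>\n',
    '<Topic r:id="Top/Business">\n',
    '  <link r:resource="http://example.com/b"/>\n',
    '<Topic r:id="Top/Business/Jobs">\n',
    '  <link r:resource="http://example.com/c"/>\n',
    '<Topic r:id="Top/Computers">\n',
    '  <link r:resource="http://example.com/d"/>\n',
]
for line in lines_for_category(sample, "Top/Business"):
    print(line, end="")
```

The main loop would then iterate over lines_for_category(file_handle, "Top/Business") instead of the raw file. If the category's topics turn out not to be contiguous in the dump, the early return would need to go, at the cost of reading to the end of the file.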