Gossamer Forum
nph-import.cgi fixes?
GT, is there any chance of getting the nph-import.cgi bugs fixed? I have a major one I would really like to see fixed, and also a suggestion.

Bug: If you define --rdf-destination with a value that contains an underscore (_), you get a 'FatherID can not be defined as NULL' error.

Suggestion: If someone is doing a search for Top/Business/Consumer_Goods_and_Services/Home_and_Garden, then why not just start the main checking process when the import script finds:

Top/Business

It seems pointless to go through all of the subcategories, which uses up CPU (obviously). I would imagine this would speed up the performance quite a lot.

Any feedback?

Cheers

Andy (mod)
andy@ultranerds.co.uk
Want to give me something back for my help? Please see my Amazon Wish List
GLinks ULTRA Package | GLinks ULTRA Package PRO
Links SQL Plugins | Website Design and SEO | UltraNerds | ULTRAGLobals Plugin | Pre-Made Template Sets | FREE GLinks Plugins!
Re: [Andy] nph-import.cgi fixes? In reply to
Quote:
Suggestion: If someone is doing a search for Top/Business/Consumer_Goods_and_Services/Home_and_Garden, then why not just start the main checking process when the import script finds:

Top/Business

It seems pointless to go through all of the subcategories, which uses up CPU (obviously). I would imagine this would speed up the performance quite a lot.

Erm, without scanning the whole file, how do you expect the import script to know where Top/Business is?
Re: [Paul] nph-import.cgi fixes? In reply to
Well, go through with a regex check, to see if it starts with the correct category? Currently it goes *really* slowly. I wrote a basic script to find a category at the end of the content.rdf.u8 file, reporting only when the category was found (i.e. not returning skipping messages). It successfully found the line in a matter of 10-20 seconds. Surely the nph-import.cgi script should be that fast to get to the appropriate category?
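
For illustration, the kind of check meant here could look roughly like the sketch below. It is not taken from nph-import.cgi; the helper name first_topic_under and the sample data are made up, and it assumes each topic sits on its own line as <Topic r:id="...">. Note that it still has to read every line before the match; it only keeps the per-line work cheap.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of nph-import.cgi): return the line number
# of the first <Topic> whose r:id equals $wanted or lies beneath it,
# or undef if no such topic is found.
sub first_topic_under {
    my ($fh, $wanted) = @_;
    # \Q...\E quotes the category literally; the id must then end (")
    # or continue with a slash, so Top/Business does not match
    # Top/Businesses_Other.
    my $prefix = qr{<Topic r:id="\Q$wanted\E(?:/|")};
    while (my $line = <$fh>) {
        return $. if $line =~ $prefix;
    }
    return;
}

# Tiny in-memory stand-in for content.rdf.u8 (made-up sample data).
my $sample = <<'RDF';
<Topic r:id="Top/Arts">
<Topic r:id="Top/Business">
<Topic r:id="Top/Business/Consumer_Goods_and_Services">
<Topic r:id="Top/Businesses_Other">
RDF

open my $fh, '<', \$sample or die "Cannot open sample: $!";
my $line_no = first_topic_under($fh, 'Top/Business');
close $fh;

print "First match on line $line_no\n";
```

Against the real file, the in-memory handle would be swapped for an ordinary open on content.rdf.u8.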

Cheers

Andy (mod)
andy@ultranerds.co.uk
Re: [Andy] nph-import.cgi fixes? In reply to
Quote:
I wrote a basic script to find a category at the end of the content.rdf.u8 file, reporting only when the category was found (i.e. not returning skipping messages). It successfully found the line in a matter of 10-20 seconds.

I find that hard to believe. The script I posted the other day to chop the content file into individual categories takes several minutes to get to the regional category, and that's using a while loop and index(), which is faster than a regex.
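
A minimal sketch of the while loop plus index() approach mentioned above. This is not the chopping script itself; the helper name find_literal and the sample data are made up for illustration. How much faster index() is than a regex on a given machine could be measured with the core Benchmark module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: scan line by line and return the first line that
# contains the literal string $needle, using index() rather than a regex.
sub find_literal {
    my ($fh, $needle) = @_;
    while (my $line = <$fh>) {
        # index() returns -1 when the substring is absent.
        if (index($line, $needle) >= 0) {
            chomp $line;
            return $line;
        }
    }
    return;
}

# Tiny in-memory stand-in for content.rdf.u8 (made-up sample data).
my $sample = <<'RDF';
<Topic r:id="Top/Arts">
<Topic r:id="Top/Regional">
<Topic r:id="Top/World">
RDF

open my $fh, '<', \$sample or die "Cannot open sample: $!";
my $hit = find_literal($fh, '<Topic r:id="Top/Regional">');
close $fh;

print defined $hit ? "Found: $hit\n" : "Not found\n";
```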
Re: [Paul] nph-import.cgi fixes? In reply to
>>>The script I posted the other day to chop the content file into individual categories takes several minutes to get to the regional category <<<

My point exactly. nph-import.cgi takes about half a day to get there! I have left it running for a whole day before it had even started to import the World category (this is running it with content.rdf.u8, and not a sliced file). I just can't see why it is so much slower. My script is:

Code:
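#!/usr/bin/perl
use strict;
use warnings;

# `date` output keeps its trailing newline, so each timestamp is followed by a blank line.
my $time = `date`;
print "Start: $time\n";

print "Running check...\n";

# Open the dump read-only and stop at the first matching <Topic> line.
open(my $fh, '<', 'content.rdf.u8') or die "Can't read file. Reason: $!";
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ m{<Topic r:id="Top/Kids_and_Teens/International">}) {
        print "Found: $line \n";
        last;
    }
}
close($fh);

my $end_time = `date`;
print "End: $end_time\n";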

...and that produces the following:

Quote:
bash-2.05a$ perl test.cgi
Start: Wed Jun 25 18:21:27 PDT 2003

Found: <Topic r:id="Top/Kids_and_Teens/International">
End: Wed Jun 25 18:23:34 PDT 2003

...so it took 2 minutes and 7 seconds to get nearly to the bottom of the content.rdf.u8 file. What's going on? Oh, and that's not even running as a root-level user; it's just a basic user with SSH permissions.
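
Rather than reading the two `date` lines by eye, the elapsed time could be computed in the script itself. A rough sketch, with made-up helper names elapsed_seconds and format_duration, and a trivial loop standing in for the real scan:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run a code reference and return the elapsed
# wall-clock time in whole seconds.
sub elapsed_seconds {
    my ($code) = @_;
    my $start = time();
    $code->();
    return time() - $start;
}

# Hypothetical helper: format a number of seconds as "Xm Ys".
sub format_duration {
    my ($seconds) = @_;
    return sprintf '%dm %ds', int($seconds / 60), $seconds % 60;
}

# Stand-in workload; the real scan of content.rdf.u8 would go here.
my $took = elapsed_seconds(sub { my $n = 0; $n += $_ for 1 .. 1000 });
print "Stand-in scan took ", format_duration($took), "\n";

# 18:21:27 to 18:23:34 is 127 seconds.
print "Run above took ", format_duration(127), "\n";
```

For sub-second resolution, the core Time::HiRes module could replace the built-in time().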

Cheers

Andy (mod)
andy@ultranerds.co.uk

Last edited by Andy: Jun 25, 2003, 3:25 AM
Re: [Andy] nph-import.cgi fixes? In reply to
Hold on, I thought you said 10-20 seconds.
Re: [Paul] nph-import.cgi fixes? In reply to
OK, I was exaggerating that a little. I think it's because I was searching for the 'Home' category, which is about a quarter of the way through. Either way, there is a big difference in the execution times.

Cheers

Andy (mod)
andy@ultranerds.co.uk