Gossamer Forum

Parse_RDF.pl adding Cat/Links wrong.

I was having problems similar to those in other posts: using Parse_RDF.pl with DMOZ data to add Categories and Links to an existing database left links scattered all over the database.
The problem occurs because Parse_RDF.pl does not get the highkey value (the highest Category ID) when it queries MySQL for it. This could be a Perl problem or a MySQL-level problem (something for pugdog or Alex to look at). As a result, it tries to add the new Categories from DMOZ starting at Category ID 1. If a row with that ID already exists, the Category insert fails, but the Links are still added with that Category ID. If the ID is free (a deleted Category), the new Category is added under that ID along with its Links. The end result is an unpredictable mess of misplaced links and orphaned categories.
The quick and easy fix is to determine your highkey (the highest Category ID) using the admin panels or MySQLMan and set it in Parse_RDF.pl when you set your subset, prefix, etc. Set the global my $CATID to the ID of the last Category in the database. Just remember to update it each time you run Parse_RDF.pl.
Using this method I have loaded over 7000 links and their categories in several runs of Parse_RDF.pl with complete success. I hope this helps those out there who are having problems.
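
For anyone who wants to see the failure mode in isolation, here is a rough sketch that needs no database. It is not code from Parse_RDF.pl: the hash standing in for the Category table, the import_topics routine, and the sample topics are all made up for illustration, and it assumes the behaviour described above (a failed Category insert does not stop the Links from being written).

```perl
use strict;
use warnings;

# Hypothetical stand-in for one import run. $category is a hash ref
# acting as the Category table (ID => name); $start_id plays the role
# of the starting category counter.
sub import_topics {
    my ($category, $start_id, @topics) = @_;
    my @links;
    my $id = $start_id;
    for my $topic (@topics) {
        # The Category insert fails if the row already exists ...
        $category->{$id} = $topic->{name} unless exists $category->{$id};
        # ... but the Links are still written with that ID.
        push @links, map { [$id, $_] } @{ $topic->{links} };
        $id++;
    }
    return @links;
}

my @topics = (
    { name => 'Games',  links => ['chess.example'] },
    { name => 'Sports', links => ['golf.example'] },
);

# Broken run: the counter starts at 1 because the high ID was never read.
# ID 2 is a gap left by a deleted category.
my %broken = (1 => 'Arts', 3 => 'Science');
my @broken_links = import_topics(\%broken, 1, @topics);

# Workaround: start just above the highest existing ID.
my %fixed = (1 => 'Arts', 3 => 'Science');
my ($high) = sort { $b <=> $a } keys %fixed;
my @fixed_links = import_topics(\%fixed, $high + 1, @topics);

printf "broken: %s filed under %d (%s)\n", $_->[1], $_->[0], $broken{ $_->[0] }
    for @broken_links;
printf "fixed:  %s filed under %d (%s)\n", $_->[1], $_->[0], $fixed{ $_->[0] }
    for @fixed_links;
```

In the broken run the Games link lands under the existing Arts category and Sports fills the gap at ID 2, which matches the scattered links and orphaned categories described above. With the starting ID set by hand, the new categories get fresh IDs.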

Re: Parse_RDF.pl adding Cat/Links wrong.
Thanks! I missed this post when you posted it.

Looking at the logic, I see:

Code:
my $max_id = $dbh_Links->prepare ("select ID from Category order by ID desc limit 1");
$max_id->execute();
($CATCNT) = $max_id->fetchrow_array();
$CATCNT++;
Because the following loops use the assignment:

$CATCNT = $CATID++;

I _THINK_ the above code should really be:


Code:
my $max_id = $dbh_Links->prepare ("select ID from Category order by ID desc limit 1");
$max_id->execute();
($CATID) = $max_id->fetchrow_array();
$CATID++;
Looking at the variables... it seems that $CATID and $CATCNT have changed roles, but without studying the logic (because I have no way to run the program) it _REALLY_ looks like $CATID is _NOT_ used properly, and should just be changed to $CATCNT.

If anyone has the means to give this a try, let me know...

If you look, $CATID is _NOT_USED_ anywhere, except to be incremented _BEFORE_ it is given a value, _AND_ then as a test:

Code:
if (! defined $CATID) {
    die "Found link tag outside of topic at line: $.\n$line\n";
}
Which again, seems to be before it's ever really initialized.

Any takers????



http://www.postcards.com
FAQ: http://www.postcards.com/FAQ/LinkSQL/

Re: Parse_RDF.pl adding Cat/Links wrong.
pugdog, I tried your mod and it made no difference. Parse_RDF.pl from 1.13 already has the same code you suggested. I did some investigating and found my problem is more basic: the query "SELECT MAX(ID) from Category" always returns a value of "0". I tried several different query options with the same result! Does anybody have any ideas? My MySQL version is 3.22.21.
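
Until the cause is found, one way to keep a bad high ID from doing damage is to check it before the import starts. This is only a sketch under assumptions: next_category_id is a made-up helper, not part of Parse_RDF.pl, and it assumes Category IDs are positive integers and that the row count comes from a separate query such as "select count(*) from Category" that returns a sane value on your setup.

```perl
use strict;
use warnings;

# Hypothetical helper. $max_id is whatever the high-ID query returned;
# $row_count is the number of rows in Category, fetched separately.
sub next_category_id {
    my ($max_id, $row_count) = @_;
    $max_id = 0 unless defined $max_id;    # an empty table yields undef
    if ($row_count > 0 && $max_id < 1) {
        # Rows exist but the high ID is missing or zero: stop here
        # rather than start numbering new categories from 1.
        die "Category has $row_count rows but the high ID came back as "
          . "$max_id; refusing to import\n";
    }
    return $max_id + 1;
}

print next_category_id(7000, 6500), "\n";    # populated table, sane high ID
print next_category_id(undef, 0), "\n";      # empty table, start at 1

# The symptom described above: rows exist, yet the query returns 0.
my $ok = eval { next_category_id(0, 6500); 1 };
print "stopped: $@" unless $ok;
```

The point of dying instead of falling back to 1 is that a failed import is much easier to recover from than links filed under the wrong categories.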

Re: Parse_RDF.pl adding Cat/Links wrong.
Code:
my $max_id = $dbh_Links->prepare ("select ID from Category order by ID desc limit 1");
This should work, since it's not a function call; it's a plain select.

What you are doing is ordering the Category table by ID in DESCENDING order, so that the highest number (whatever it is) is on top, and then taking only ONE record ("limit 1"), i.e. the top record with the highest value. Since you are only requesting one field, that value is what is returned.

Enter the select statement into the SQL query box in Links Admin or MySQLMan and see what happens. If you get a value back, it works. If you don't get a value back, then it is your version of MySQL (but I really doubt that).





Re: Parse_RDF.pl adding Cat/Links wrong.
This is the code from the Parse_RDF.pl I have been using (downloaded from scripts about a week ago).

my $max_id = $db->prepare ("select MAX(ID) from Category");
$max_id->execute();
($CATID) = $max_id->fetchrow_array();
$CATID++;

This is the replacement code I used.

my $max_id = $db->prepare ("select ID from Category order by ID desc limit 1");
$max_id->execute();
($CATID) = $max_id->fetchrow_array();
$CATID++;

This worked correctly for me, so it should be good for anyone with the same version of Parse_RDF.pl (1.11). Thanks for the help, pugdog.