Gossamer Forum

Customizing the Verify output

First, let me explain why I am playing with this. I have a category, "Other/Dead_Links" which I move links into if they appear to be down for more than a temporary blip. Occasionally I'll delete them (if I know for sure the page has been replaced by something else or is gone permanently), but I usually use that as just a holding bin of sorts. A few reasons:

- I hate to delete links after all the work spent finding and indexing them!

- They often turn out to only be down temporarily.

- Sometimes visitors look through the dead links category and happen to know the new location of the site. I'm very appreciative of these people. :)

- It's clearly set off from other categories, so the links are kept out of view from most people.

Ok, that all works fine and dandy. However, when verifying links, about half of those I am working with are in the dead links category. I already know about those, so I'd rather not have to page through them while looking for new ones to work on.

I have done a couple of things:

1) I tweaked nph-verify.cgi to print the dead links category below the URL in the screen for the particular status code. That gives me a visual reminder that that link has already been taken care of.

I did this by changing line 417 to:

Code:
$SQLquery = "select ID,URL,Date_Checked,Status,CategoryID from Links ...
and added the following to around line 480:

Code:
my $category = "";
if ($row[4] == 354) {    # 354 is the CategoryID of Dead Links
    $category = "Dead Links";
}
finally, adding $category to line 487 or whatever line it is with the checkbox and link info. 354 is the categoryID for the Dead Links.

2) I got to thinking that I could make the section optional which prints out the individual link info. I added an "else" after the "if" statement above and wrapped it around the appropriate loop. That effectively kept it from showing the dead links, but they were still being calculated in the total for the status code and for each page. So, some pages would show just one link (from the non-354 category), which still was far from optimal. Besides, then my dead links wouldn't show up through verify options if I did want to edit them...

3) I then decided, based on looking at how things are set up in sub html_verify_analysis of Admin_HTML.pm ($group_counts is what I am after), that I could make my own fake status code -- an appropriately named 666 -- that would set each of the dead links in its own category. This works! Of course, that will be lost next time I run the verifier... I'd rather not have to stay on top of that manually, as you can probably imagine.

So, what I think would be the perfect solution, although I can't think of any way at present to accomplish it, is to add a condition that if the categoryID is 354, then the status code is set to 666, at least from the standpoint of the admin display and editing. Probably no reason to change the status data in the database. By pulling the 354's out before $group_counts does its thing, my hope is that the individual status code counts would be calculated correctly, as they were in method #3 above.
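
A minimal sketch of that idea, assuming each row arrives as ID, URL, Date_Checked, Status, CategoryID (the column order of the modified SELECT above). The names display_status and group_counts are hypothetical and are not taken from nph-verify.cgi or Admin_HTML.pm; the point is only that the 354 rows get remapped before anything is counted, while the stored Status stays untouched:

```perl
use strict;
use warnings;

# Assumed values: 354 is the Dead Links CategoryID from this post,
# 666 is the made-up, display-only status code.
my $DEAD_CATEGORY = 354;
my $DEAD_STATUS   = 666;

# Return the status to use for grouping in the admin display.
# The value stored in the database is never modified.
sub display_status {
    my ($status, $category_id) = @_;
    return $category_id == $DEAD_CATEGORY ? $DEAD_STATUS : $status;
}

# Count links per displayed status. Each row is an array reference holding
# [ID, URL, Date_Checked, Status, CategoryID].
sub group_counts {
    my (@rows) = @_;
    my %counts;
    for my $row (@rows) {
        $counts{ display_status($row->[3], $row->[4]) }++;
    }
    return \%counts;
}

# Hypothetical sample rows, for illustration only.
my $counts = group_counts(
    [1, 'http://example.com/a', '2000-10-01', 404, 12],
    [2, 'http://example.com/b', '2000-10-01', 404, 354],
    [3, 'http://example.com/c', '2000-10-01', 302, 354],
);
print "$_: $counts->{$_}\n" for sort { $a <=> $b } keys %$counts;
```

With the sample rows, the two links in category 354 are counted under 666 and only the remaining link is counted under 404.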

Does that make sense?

Thanks as always for any thoughts.

Dan

Re: Customizing the Verify output
I'm really not sure what you are asking, since it seems to circle back on itself.

It would seem that you would want the verifier to automatically assign the category 354 or status code 666 to a link that was not in one of the OK status codes.

You can manually (or at the end of a run) set the status code of a whole group of links with an UPDATE query:

UPDATE Links SET Status = '666' where CategoryID = '354'

But, it would seem to me, that what you really want to do is create a new field called "Dead_Link" and make it a Yes/No (1/0) type field.

You would then want to alter the build routines such that when they select links, they add "AND NOT Dead_Link" to all the queries.

Then, you'd want to build the special category "Dead Links", and select out "Where Dead_Link = '1'" as the criteria for getting into that group.

When you built, the links would still keep their "original" categories (and alt_categories) but would not be included in any lists if they were "dead".

If you made a new template for the Dead_Links category, similar to the template suggested here a few days ago for the DMOZ style links, you could have the Dead_Link category offer the user direct Modify access, and the original category included with each link.

This would have the effects you desire:

1) Links would automatically be flagged as Live or Dead.
2) Dead Links would be moved to their own category
3) Dead Links that become "Live" would be automatically moved back to their original Category.


Now, from a data normalizing point of view, you could determine the "Live" or "Dead" status from the "OK" and "Bad" lists of status values, using the set notation "WHERE Status IN OK_CODES" or "WHERE Status NOT IN OK_CODES". In some ways that would be _MUCH_ better. It would be a bit more complicated to make work across the entire range of queries, but once you did it, it would probably work like a charm, since you'd only have to make changes to the codes hash in the verifier.
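
As a rough sketch of the set-notation idea, the clause could be generated from a hash of status codes. The %OK_CODES hash and the status_clause helper are assumptions for illustration; the real codes hash in the verifier may be laid out differently, and which codes count as "OK" is a choice for each site:

```perl
use strict;
use warnings;

# Hypothetical status table; not the verifier's actual hash.
my %OK_CODES = (200 => 'OK', 301 => 'Moved Permanently', 302 => 'Found');

# Build "Status IN (...)" or "Status NOT IN (...)" from the hash keys.
# Keys are checked to be integers so nothing unexpected reaches the SQL string.
sub status_clause {
    my ($codes, $want_live) = @_;
    my @list = sort { $a <=> $b } grep { /^-?\d+$/ } keys %$codes;
    die "no status codes given\n" unless @list;
    my $op = $want_live ? 'IN' : 'NOT IN';
    return "Status $op (" . join(', ', @list) . ")";
}

print "SELECT * FROM Links WHERE ", status_clause(\%OK_CODES, 1), "\n";
print "SELECT * FROM Links WHERE ", status_clause(\%OK_CODES, 0), "\n";
```

Keeping the list in one hash means a newly accepted status code only has to be added in one place.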

Maybe Alex would consider adding this in as an upgrade to the Verify routines, to automate the links maintenance some more.

Seems like a perfect match for the Verifier and Spider features.

http://www.postcards.com
FAQ: http://www.postcards.com/FAQ/LinkSQL/

Re: Customizing the Verify output
Hi Pugdog, thanks for the reply.

In Reply To:
I'm really not sure what you are asking, since it seems to circle back on itself.
I was worried that might happen. :) The confusion might stem from the fact that I'm not looking to make this *do* something, as such, rather to group things for maintenance purposes.

I like your idea for the UPDATE/SET Status after running the verifier. That's probably the easiest way for now to accomplish what I want, as I already have figured out how to manually group the display based on the Status Code. That avoids having to manually set each Dead Link to 666.

Your other ideas are very good for maintaining something on a very large scale, but it sounds like quite a bit more than needed at the moment. I generally add something to the description to remind me what category it came from, if the title/description isn't clear enough to jog my memory (with over 2,000 college links, remembering which division each is from could be difficult...). Thus, moving a no longer dead link back to its original category is a fairly trivial process.

Thanks again,
Dan

Re: Customizing the Verify output
Minor follow-up thought: If you are using the random link option of jump.cgi, it makes sense to exclude the Dead Links category, as the visitor will have no idea that the link was intended to have been quarantined... To do this, I changed line 55 in jump.cgi from:

$sth = $db->prepare ( " SELECT * FROM Links LIMIT $offset, 1 ");

to:

$sth = $db->prepare ( " SELECT * FROM Links WHERE (CategoryID != '354') LIMIT $offset, 1 ");

Does that look correct? Random and Jump are both functioning fine, but it's nearly impossible to test that it's actually excluding 150 out of 8500 links on random...
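
A possible catch, assuming $offset is picked from the count of all rows in Links (how jump.cgi actually derives it is not shown here): once the WHERE clause removes the dead links, an offset near the end of the full range points past the last matching row and the query returns nothing. The sketch below keeps the count and the select on the same filter. The helper names are hypothetical and nothing here talks to a database; it only builds the two statements:

```perl
use strict;
use warnings;

# Shared filter so the COUNT and the SELECT can never disagree.
my $FILTER = "CategoryID != 354";

# Pick a random offset in 0 .. $total-1, where $total is the number of rows
# matching $FILTER (not the size of the whole Links table).
sub random_offset {
    my ($total) = @_;
    return if $total < 1;
    return int(rand($total));
}

sub count_sql { return "SELECT COUNT(*) FROM Links WHERE $FILTER" }

sub select_sql {
    my ($offset) = @_;
    return "SELECT * FROM Links WHERE $FILTER LIMIT $offset, 1";
}

# 8500 links minus 150 dead ones leaves 8350 candidates (figures from this post).
my $offset = random_offset(8350);
print count_sql(), "\n", select_sql($offset), "\n";
```

Running the COUNT statement by hand and comparing it with the total number of links is also a quicker check of the exclusion than clicking "random" repeatedly.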

Thanks,
Dan

Re: Customizing the Verify output
Here's another thought along the same lines. In the case of my directory, I have about 300 links to Amazon books, all of which use the affiliate ID to redirect as needed. This means I have a 302 status code for all of them, making the 302 category pretty much un-navigable (sp?). It would be nice to exclude these links from the 302 status category, so I can see what actually needs to be looked into.

As the books cover numerous categories, with more being added here and there, doing a check by categoryID probably isn't the best approach. I'm thinking a regex to look for a URL beginning with http://www.amazon.com/... or ending with the affiliate ID would be best. Maybe something like:

Code:
if Status == '302' {
if (URL =~ m/rundown$/) {
UPDATE Links SET Status = '200';
}
}
Obviously, that exact code would do nothing. I haven't had much luck getting the "dead link" UPDATE line to do anything, but I'm not exactly sure how to set it up within nph-verify's sub check_links and if/how to pass the SQL command through a variable.

Any thoughts if the regex is the best approach?
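
For what it's worth, the regex test on its own can be tried outside of nph-verify.cgi. The sketch below is only the decision logic, with a hypothetical helper name (effective_status) and made-up sample URLs; only the "rundown" suffix comes from this post, and it does not touch the database:

```perl
use strict;
use warnings;

# Decide which status to report for a link. A 302 on a URL that ends with the
# affiliate ID is an expected redirect, so it is reported as 200.
sub effective_status {
    my ($status, $url, $affiliate_id) = @_;
    if ($status == 302 && $url =~ /\Q$affiliate_id\E$/) {
        return 200;
    }
    return $status;
}

# Hypothetical sample links, for illustration only.
for my $link (
    [302, 'http://www.amazon.com/exec/obidos/ASIN/0000000000/rundown'],
    [302, 'http://www.example.com/moved'],
    [404, 'http://www.amazon.com/exec/obidos/ASIN/0000000000/rundown'],
) {
    my ($status, $url) = @$link;
    print effective_status($status, $url, 'rundown'), " $url\n";
}
```

The \Q...\E around the affiliate ID keeps any punctuation in it from being read as regex syntax, and a URL with a trailing slash after the ID would not match, so it would stay in the 302 group.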

Dan


Re: Customizing the Verify output
In case anyone else is interested in this, here's a fairly simple way to address the issue of clearing out something like amazon affiliate links to make it easier to find what you actually need to look into:

UPDATE Links SET Links.Status=200 WHERE Links.Status=302 AND (Links.URL REGEXP "rundown$");

In this case, I was searching for any link ending with affiliate ID "rundown". You could also do something like "^http://www.amazon.com" if you wanted.

One nice thing about doing this is that it gives you an easy way to check that you correctly entered your affiliate ID (why sell amazon's books for them if you don't have your ID appended correctly? :) ). After running the above query, you could search for anything with amazon in the URL and a status code other than 200 (it would be a 302 redirect normally) and see if there are any left. If so, you may have mis-typed one or more. For instance, I had one with a trailing slash -- no biggie, but worth correcting for consistency.

I'm thinking a manual query of that sort will be easiest for the Dead Links issue also, as it doesn't involve any real coding...

Hope this helps,
Dan

Re: Customizing the Verify output
I've done something like this to remove 'dead' categories from my database searches and such.


For instance, links to 'amazon' could be excluded in your select statement with a:

URL NOT LIKE '%amazon%'

I've actually done some pretty complex statements (over time) to exclude certain links, certain ranges, etc.

If the categories are in a 'tree', for instance, and you don't want any categories in your 'books' area to show up, then use a select statement like the one used in user.cgi/maintain.cgi, where you select the range of category numbers, then check for CategoryID IN (range).

I've lost track of what exactly you are still looking for, but excluding categories is not hard, nor is limiting URLs from searches, jumps and builds.

http://www.postcards.com
FAQ: http://www.postcards.com/FAQ/LinkSQL/

Re: Customizing the Verify output
The problem I was running into was not having a good grasp of the Links SQL coding, so I found it much easier to do what I want with a single MySQL statement rather than struggle with how to include it in the program. Of course, having it automated like you probably did would be nice. Then again, I only verify links every 3 months or so, so the coding is probably more time than it's worth compared to running a few queries...

Just curious, has anyone noticed that tripod is currently returning -4 - Could not connect errors and hometown.aol.com always returns 302's (probably because of the ad banner frame that gets added)?

I think I've got most of the desired things working in one way or another. Regarding the jump/random elimination of the dead links category, I was just curious if it looks like I correctly changed the SQL statement. The percentage of links in the Dead Links category is low enough that it would take a long time clicking on "random" to make sure they are not being included...

Dan

Re: Customizing the Verify output
hi dan;
i know my post should be on the dbman forum, but your post dated oct 10; 11:36pm re: category display in 2 columns disappeared

in anycase i'm not using jharris code but rather jpdeni's
one ( it seems a short version ) it does the job!

here's how it looks;

db.cgi
================
sub urlencode {
# --------------------------------------------------------
# Escapes a string to make it suitable for printing as a URL.
#
my($toencode) = shift;
$toencode =~ s/([^a-zA-Z0-9_\-.])/uc sprintf("%%%02x",ord($1))/eg;
return $toencode;
}

html
======
for ($i = 0; $i <= $#db_cols; $i++) {
#### In the line below, replace 'Category' with the name of your field.
if ($db_cols[$i] eq "Category" ) {
$fieldnum = $i; $found = 1;
last;
}
}
if ($found) {
open (DB, "<$db_file_name") or &cgierr("unable to open $db_file_name. Reason: $!");
if ($db_use_flock) { flock(DB, 1); }
LINE: while (<DB> ) {
next if /^#/;
next if /^\s*$/;
$line = $_;
chomp ($line);
@fields = &split_decode ($line);
++$count{$fields[$fieldnum]};
}
close DB;

foreach $option (sort keys %count) {
$encoded = &urlencode($option);
#### In the line below, replace 'Category' with the name of your field.
print qq|
<a href="$db_script_link_url&Category=$encoded&view_records=1">$option</a> ($count{$option})<BR>
|;
}
}

would your code fit in with this one and if yes could you please repost it

your reply is appreciated

thanks a bunch
cheers :)
macagy

ps great site of yours!!!



Re: Customizing the Verify output
Took me a second to remember what you're talking about... but I think I'm with you now. Assuming you have the category display set up, the following should work:

For splitting category display into two columns.

in sub html_view_search:

after:

#### In the line below, replace 'Category' with the name of your field.
@options = split (/\,/, $db_select_fields{'Category'});


add:

#### next 5 lines added in for splitting list into two columns:
my $half = int (($#options+2) / 2);
my $i = 0;
print qq|
<table width="100%" border="1" cellspacing="1" cellpadding="0" BGCOLOR="#FFFFCC"><tr><td valign="top" width="50%">
|;


after:

foreach $option (sort @options) {
unless ($count{$option}) {
$count{$option} = '0';
}
$encoded = &urlencode($option);


add:

#### next 4 lines added in for splitting list into two columns:
if ($i == $half) {
print qq|</td><td valign="top" width="50%">|;
}
$i++;


after:

#### if you only want to list categories that have matching entries, uncomment the following line
# }
}


add:

#### next line added in for splitting list into two columns:
print qq|</td></tr></table>|;
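
To see the column break in isolation, here is a small stand-alone sketch of the same splitting rule. The two_columns function and the sample category names are made up for illustration and are not part of DBMan; the break point is the first half rounded up, which is what int(($#options+2)/2) works out to above:

```perl
use strict;
use warnings;

# Render a list of options as a two-column table, starting the second cell
# once the first half (rounded up) of the sorted options has been printed.
sub two_columns {
    my (@options) = @_;
    my $half = int((@options + 1) / 2);
    my $html = qq|<table><tr><td valign="top">|;
    my $i = 0;
    for my $option (sort @options) {
        $html .= qq|</td><td valign="top">| if $i == $half;
        $html .= "$option<br>";
        $i++;
    }
    return $html . qq|</td></tr></table>|;
}

# Hypothetical category names.
print two_columns(qw(Art Books Colleges Music Travel)), "\n";
```

With five options the left column gets three entries and the right column two, so an odd count always puts the extra entry on the left.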

Dan

Re: Customizing the Verify output
dan

thank you very much for your help; it works!
the category display looks nicer now

cheers
macagy



Re: Customizing the Verify output
Hi,

Glad that worked! It was a nice easy fix that goes a long way toward tidying things up. :)

Dan