Gossamer Forum

Verifying big links databases: temporary solution.

Quote Reply
Re: [sc2utp] Verifying big links databases: temporary solution. In reply to
bmxer, I took a look at the admin mod you're making .. it's awesome ... numero uno !!!!!

Gregor
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
Bmxer,

Isn't it possible to print the results of a script launched from Telnet with HTML tags? If your nph-verify prints the same report to a file as the script in the distribution does, I can't see much difference. Again, verifying a big database in small portions via the browser is very uncomfortable. Why can't the script print tags in a plain text report? Or did you somehow get the script to check the whole database at once in the browser without locking up?

Thank you.
Quote Reply
Re: [Kangaroo] Verifying big links databases: temporary solution. In reply to
I don't know what you mean by tags.
It won't print HTML on a telnet screen. It will still print the report in HTML; just the result page won't be. It verifies my whole db, but the whole reason I wrote it is to verify a big db in small portions. The script can't print HTML tags in a text report. That's why you can't see linked words in text editors. Like I said, I need someone's database, a big one, to test on. I only have 113 links, so I can only say so much. As for doing many links via the browser, like I said, adding a simple meta refresh tag could easily make it go on and verify the links page by page. I will add this in the next version. I will also make it write reports in that case.

sc2utp,
thanks for the compliment
Lavon Russell
LookHard Mods
lavon@lh.links247.net
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
OK, Bmxer.

Let's wait until your mod is ready and test it. Smile

Thanks again.
Quote Reply
Re: [Kangaroo] Verifying big links databases: temporary solution. In reply to
OK, I'm going to put this in a zip on my site for now:
http://lh.links247.net/downloads/verify

OK, if you aren't on a Cobalt, read at least the first 60 lines of the script. There are a few comments in those lines marking things to change.

By default LWP is turned off. It doesn't run well with the script. At least when I tested it, every link came up bad, but with IO it was correct. But if you feel the need, the zip includes an nph-converter.cgi script. Upload it to the same dir as nph-verify.cgi and chmod it correctly, then run it in the browser, and the comment marks in front of the LWP parts will be removed. I don't recommend it.

You will need to make an output folder in your admin folder, as well as an output/staggered folder (after you make the output folder, just go into it and make a staggered folder). Chmod them to 777. Do not put any other folders or files in them. The only things that can go in them are:
for the output folder => 'staggered folder','index.htm(l)','reports'
for the staggered folder => 'index.htm(l)','staggered reports'

The reason there is a staggered folder is that the output folder is where standard reports are made when running from telnet or cron. In staggered mode, each new page the browser goes to adds its bad links to the HTML report incrementally. Smile So in a standard telnet verify, all links are verified in one pass and you may get 20 bad links in one report. In staggered mode they are split up, and for each page that has bad links, those are added to the report in the staggered folder.

I set it up to only keep reports for 6 days; after that they are deleted. This is the case for both staggered and telnet mode.
Oh, btw, there are three modes:
staggered => in the browser with spanning, but it auto-refreshes to the next page until the end (like build staggered) and prints the report to output/staggered in the admin folder
telnet => from a telnet screen; automatically does all links at once and prints the report to the output folder in the admin folder
standard => basically just a span of the links you verify, with no report building. It's like viewing all links in the admin with the pages spanned, except that it verifies them and shows you the bad links as well.
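
For anyone wondering how the auto refresh in staggered mode could work, here is a minimal sketch, not the code from the zip. The helper name next_page_refresh, the page parameter, and the 2-second delay are all made up for illustration; only the mh=staggered query string appears elsewhere in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: build the <meta> tag that sends the browser to the
# next page of links, or return an empty string once the last page is done.
# The "page" parameter and the 2-second delay are assumptions.
sub next_page_refresh {
    my ($script_url, $page, $per_page, $total_links) = @_;

    # Number of pages needed, rounding up.
    my $last_page = int(($total_links + $per_page - 1) / $per_page);
    return '' if $page >= $last_page;

    my $next = $page + 1;
    return qq~<meta http-equiv="refresh" content="2; url=$script_url?mh=staggered&page=$next">~;
}

# 113 links at 10 per page gives 12 pages.
print next_page_refresh('nph-verify.cgi', 1, 10, 113), "\n";    # tag pointing at page 2
print next_page_refresh('nph-verify.cgi', 12, 10, 113), "\n";   # empty: last page reached
```

The tag would go in the head of each result page, so the browser moves on by itself and a page that hangs can still be skipped by hand.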

That should be about it.
I'm not too proud of the way I wrote the add-on, but I was going for speed and functionality more than code that's easy on the eyes.
Definitely don't forget to change this line (line 45):
require "/home/sites/lh.links247.net/web/look-bin/Look/admin/links.cfg"; # Change this to full path to links.cfg if you have problems.
to your full path.

This is why I said to read about the first 60 lines.
On a Cobalt, certain things don't need to be printed, so you'll see this:
Code:
# print "HTTP/1.0 200 OK\n";
print "Content-type: text/html\n\n"; # Delete this and the above line and uncomment the lines below if you aren't on a cobalt.
# if ($ENV{'REQUEST_METHOD'}) { # Replace with these
# print "HTTP/1.0 200 OK\n"; # Replace with these
# print "Content-type: text/html\n\n"; # Replace with these
# }
Just follow the comments and it should be fine.

Oh, and so it can run in telnet, add this to db_utils.pl
(if you don't have telnet, just add it anyway):

Code:
sub parse_form1 {
# --------------------------------------------------------
# Parses the form input and returns a hash with all the name
# value pairs. Removes any field with "---" as a value
# (as this denotes an empty SELECT field).
#
    my (@pairs, %in);
    my ($buffer, $pair, $name, $value);

    if ($ENV{'REQUEST_METHOD'} eq 'GET') {
        @pairs = split(/&/, $ENV{'QUERY_STRING'});
    }
    elsif ($ENV{'REQUEST_METHOD'} eq 'POST') {
        read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
        @pairs = split(/&/, $buffer);
    }

    PAIR: foreach $pair (@pairs) {
        ($name, $value) = split(/=/, $pair);

        $name =~ tr/+/ /;
        $name =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;

        $value =~ tr/+/ /;
        $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;

        ($value eq "---") and next PAIR;
        exists $in{$name} ? ($in{$name} .= "~~$value") : ($in{$name} = $value);
    }
    return %in;
}


And this to links.cfg
Code:
$db_verify_url = $db_dir_url . "/nph-verify.cgi"; # Verify script.
Lavon Russell
LookHard Mods
lavon@lh.links247.net
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
Anyone try to install/run this yet?
Lavon Russell
LookHard Mods
lavon@lh.links247.net
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
[reply]Anyone try to install/run this yet?
[/reply]

I installed this mod and have yet to get it to run through my entire database. So far I've tried running in telnet mode (which is what I would prefer) and it gets to about link #60 or so and stops. In browser mode it builds pages of 10 links at a time and I can verify some of them, but sometimes some pages of 10 won't load so you have to skip them and try the next page of 10. My database currently is pushing 4,000 links. I'm kinda stumped on the telnet thing, as it seems it should work...
Quote Reply
Re: [dvd871] Verifying big links databases: temporary solution. In reply to
Seems that there is a link around #60 or so in my database that is messed up or something. That's why the telnet mode is bombing. The browser mode works like a champ! Great mod! Cool
Quote Reply
Re: [dvd871] Verifying big links databases: temporary solution. In reply to
Thanks man, that makes me feel good (knowing that it doesn't, for some strange reason, only work on my site). How long did it take in telnet to verify all 4000 links and build the report? I haven't tested it on any system with that many links. Oh, I updated the zip. The original had code that deletes the telnet reports (the ones that go in output) once they are a week old, but I forgot to make it skip the staggered folder, which would make it delete the folder. So I updated it. I imagine people who have already downloaded it can't have done much to it, so it's probably easiest to just get the zip again with the new nph-verify.cgi.
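
For reference, a cleanup that leaves subfolders alone could look roughly like this. It is only a sketch under assumptions, not the code in the zip: the sub name prune_reports and the rule of keeping index.htm(l) are made up for illustration.

```perl
use strict;
use warnings;

# Delete plain files in $dir that are older than $max_days. Directories
# (such as output/staggered) and index.htm(l) are never touched.
# Returns the names of the files that were removed.
sub prune_reports {
    my ($dir, $max_days) = @_;
    my @removed;

    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    foreach my $name (sort readdir $dh) {
        next if $name =~ /^\.\.?$/;            # skip . and ..
        next if $name =~ /^index\.html?$/;     # keep the index page
        my $path = "$dir/$name";
        next if -d $path;                      # skip folders, e.g. staggered
        next unless -f $path;                  # only plain files
        next unless -M $path > $max_days;      # -M: days since last modification
        if (unlink $path) {
            push @removed, $name;
        }
        else {
            warn "Could not delete $path: $!";
        }
    }
    closedir $dh;
    return @removed;
}

# Example call (path is a placeholder):
# prune_reports('/path/to/admin/output', 6);
```

The -d test is what keeps the staggered folder from being removed along with the old reports.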
Lavon Russell
LookHard Mods
lavon@lh.links247.net

Last edited by:

Bmxer: Nov 10, 2001, 7:16 PM
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
Wow - Wow!!! This works great! Thank you once again!!!

There have been a lot of excellent mods, but for me this is probably one of the best ever! I haven't been able to run the verify script on the server for a year now because it would get stuck on one or two links and timeout/hang even via telnet, so I have been verifying locally over a 33.6 modem, and now finally I can verify in style.

The only thing that I've changed so far is to give bad URLs a target="blank" so I can test them manually in a new window and so that my verify script doesn't show up in their referrer stats.
Quote Reply
Re: [marinedesign] Verifying big links databases: temporary solution. In reply to
Running via telnet or in the browser staggered mode the verify still seems to hang. Right now with 2050 links I am hanging at link 188 via telnet, and what's strange is that link 188 is actually a good link (tested it manually).

I recently tried the demo for the Fluid Dynamics search engine ( http://www.xav.com/scripts/search/ ) and it has an interesting routine where, if it can't connect, it drills down. So if a routine which is supposed to fetch 10 URLs hangs, it then tries 5, 2, and finally 1 if it keeps failing. If that one fails, it is set aside, and the routine continues with the URL after it and then builds back up to 10. I wonder if the same logic could be applied to the automated verify? Might be too time consuming though.
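
The drill-down idea can be sketched like this. It is only an illustration of the logic as described above, not code from either script: drill_down_verify and the $check_batch callback are hypothetical, and for simplicity the batch size jumps straight back to 10 after any success instead of building up in steps.

```perl
use strict;
use warnings;

# Walk @$urls in batches. $check_batch is a hypothetical callback that
# returns true if the whole batch verified and false if it stalled.
# After a failure the batch size steps down 10 -> 5 -> 2 -> 1. A single
# URL that still fails is set aside and the size goes back to 10.
sub drill_down_verify {
    my ($urls, $check_batch) = @_;
    my @sizes = (10, 5, 2, 1);
    my $level = 0;      # index into @sizes
    my $pos   = 0;      # index of the next unverified URL
    my @skipped;

    while ($pos < @$urls) {
        my $end = $pos + $sizes[$level] - 1;
        $end = $#$urls if $end > $#$urls;
        my @batch = @{$urls}[$pos .. $end];

        if ($check_batch->(@batch)) {
            $pos  += @batch;            # batch done, move on
            $level = 0;
        }
        elsif ($level < $#sizes) {
            $level++;                   # retry from the same spot, smaller batch
        }
        else {
            push @skipped, $batch[0];   # one URL on its own still fails
            $pos++;
            $level = 0;
        }
    }
    return @skipped;
}

# Demo: 25 made-up URLs, one of which always makes its batch fail.
my @urls = map { "http://example.com/$_" } 1 .. 25;
my $bad  = 'http://example.com/17';
my @set_aside = drill_down_verify(\@urls, sub {
    my $hits = grep { $_ eq $bad } @_;
    return !$hits;
});
print "Set aside: @set_aside\n";
```

The cost is that links near a bad one get checked several times, which is probably the "too time consuming" part.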

But the good news is that with the page by page browser verify I can skip the page that hangs and continue on.

Great mod! Thanks again!

Last edited by:

marinedesign: Nov 10, 2001, 8:58 PM
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
I can't get the telnet mode to finish. Frown That is really what I want, as the browser mode would take several hours to complete. I am seeing the same symptoms as marinedesign.
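
One general Perl technique for keeping a single unresponsive link from stalling a whole run is a per-link timeout with alarm. Whether nph-verify.cgi does anything like this is not stated in this thread, so treat the following purely as a sketch; with_timeout and the $check callback are hypothetical names, and alarm behaves this way on Unix-like systems.

```perl
use strict;
use warnings;

# Run $check (a hypothetical code ref that verifies one link) but give up
# after $seconds. Returns the check's result, or the string 'timeout'.
sub with_timeout {
    my ($seconds, $check) = @_;
    my $result;

    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $result = $check->();
        alarm 0;
        1;
    };
    my $err = $@;
    alarm 0;                        # make sure no alarm is left pending

    return $result   if $ok;
    return 'timeout' if $err eq "timeout\n";
    die $err;                       # some other failure: pass it on
}

# Demo: a fast check returns its value, a slow one is cut off after 1 second.
print with_timeout(2, sub { 200 }), "\n";
print with_timeout(1, sub { select(undef, undef, undef, 3); 200 }), "\n";
```

A link that times out could then be written to the report with its own code instead of stopping the run.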

Last edited by:

dvd871: Nov 10, 2001, 9:16 PM
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
Hi!
I'm running it from the browser. So far everything works great, but there is one little thing I miss from the default verify.cgi. In the default nph-verify.cgi, when doing a Quick Check, the modify and delete links for each bad link opened in a new window. That's good because that way I can keep my bad links report open. Now that it all displays in the same window (the frame in admin.cgi), I have to click back, and then it verifies that page of 10 again.

i'm pretty sure this can be solved ..

tenx

Gregor
Quote Reply
Re: [sc2utp] Verifying big links databases: temporary solution. In reply to
.. found it myself ...
In nph-verify.cgi, replace the report page section with the following:

Code:
# ------------ report page -------------------

}
print FILE qq~</table></body></html>~;
close FILE; #
# -------------- -----------------
}
print "\nBad Link Summary\n-----------------------------------------------\n";
my $numlist = 0;
foreach $url (sort { $code{$b} <=> $code{$a} } keys %code) {
$code = $code{$url};
$msg = $msg{$url};
$id = $urls{$url};
$numlist++;
$use_html ?
print qq~$numlist. $id - <a href="$url" target="_blank">$url</a> <font size=-1>[<a href="$db_script_url?db=links&modify_form=1&$db_key=$id&ww=1" target="_blank">modify</a>|<a href="$db_script_url?db=links&delete_form=1&$db_key=$id&ww=1" target="_blank">delete</a>]</font> : $code - $msg\n~:
print qq~$numlist. $id - $url : $code - $msg\n~;
$badcount++;
}

bye

Gregor
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
Bmxer,

I've had too tough a weekend this time. My headache won't let me deal with your mod right now; I'll test it tomorrow.

Thanks a lot, I'll report on my experience in a day or two.
Quote Reply
Re: [sc2utp] Verifying big links databases: temporary solution. In reply to
Yeah, sorry about that guys, I took that out because I hate popups so much. Even opening new browser windows irks me. But I forgot that some people may not want to leave the page they're on to go to another.
I'll add it back into the download, and I'll work on some new checking code. I may need someone whose db is sticking to let me test it on their server, if they don't mind, or all you guys could just PM me the bad URLs that make the verify stop and I'll test them on mine.
Lavon Russell
LookHard Mods
lavon@lh.links247.net

Last edited by:

Bmxer: Nov 11, 2001, 6:05 AM
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
Fatal error: Undefined subroutine &main::parse_form1 called at nph-verify.cgi line 75.

Changed parse_form1 to parse_form. Fixed. Smile

Last edited by:

madtech: Nov 13, 2001, 7:33 PM
Quote Reply
Re: [madtech] Verifying big links databases: temporary solution. In reply to
I just modified the script to allow up to 1000 maxhits and "tested" it. It actually did all 1000 without freezing. (It took a long time to finish, but it did work. So I suppose I could just go take a dip in the spa and come back to a complete report?) Wink

Hey Bmxer.. GREAT JOB!! (Alex: This is a must-have mod.) Is it possible to add an "Ignore" option with checkboxes for errors? Such as...

Ignore: 301 302 404 500
. . . . . .[ ] . [ ] . [ ] . [ ]

(Ignore the periods, you get the picture)

This way my error report will not be full of errors from pages that are there and do work, but have an intro page or meta refresh. Or pages that have a / missing will not show up as an error. This way I can show only errors I wish to look for.
Quote Reply
Re: [madtech] Verifying big links databases: temporary solution. In reply to
It is possible to only print URLs with error codes that aren't in a special array or hash, i.e. @code = ('301','500');. I would make it skip those in the report. I can't do it now because I'm working on gofetch 3.0. Maybe tomorrow.
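
Roughly, the filter could look like the sketch below. The %ignore hash, the sample data, and reportable_urls are made up for illustration; %code only stands in for the URL-to-status hash that the report loop in the script sorts on.

```perl
use strict;
use warnings;

# Status codes to leave out of the report (hypothetical configuration).
my %ignore = map { $_ => 1 } qw(301 302);

# Made-up results: URL => status code.
my %code = (
    'http://example.com/moved'   => 301,
    'http://example.com/missing' => 404,
    'http://example.com/broken'  => 500,
);

# Return the URLs whose code is not ignored, highest code first
# (ties broken by URL so the order is stable).
sub reportable_urls {
    my ($code, $ignore) = @_;
    return sort { $code->{$b} <=> $code->{$a} or $a cmp $b }
           grep { !$ignore->{ $code->{$_} } }
           keys %$code;
}

foreach my $url (reportable_urls(\%code, \%ignore)) {
    print "$url : $code{$url}\n";
}
```

A hash lookup keeps the check to one line per URL; the checkboxes would only need to fill %ignore from the form input.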
Lavon Russell
LookHard Mods
lavon@lh.links247.net
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
Awesome : )

On a side note, just to make it simpler, and though this really wasn't necessary, this is what I did:

Rename the modified nph-verify.cgi to nph-sverify.cgi and upload the file.

Open admin_html.pl

...under sub html_navigation change...

Quote:
<p><$font><b>Verifying Links</b><br></font>
<$font>
<a href="nph-verify.cgi">Quick Check</a><br>
<a href="nph-verify.cgi?detailed">Detailed</a>
</font>
</p>

to...

Quote:
<p><$font><b>Verifying Links</b><br></font>
<$font>
<a href="nph-verify.cgi">Quick Check</a><br>
<a href="nph-sverify.cgi">Staggered</a><br>
<a href="nph-sverify.cgi?mh=staggered">Staggered (Auto)</a><br>
<a href="nph-verify.cgi?detailed">Detailed</a>
</font>
</p>

...under sub html_body change...

Quote:
<p><$font><b>Verify Menu</b>
<blockquote>
<dl>
<dt>Quick Check</dt><dd>Just checks the links response code, some servers disable this so you might get
less accurate results.
<dt>Detailed Check</dt><dd>Checks each link by downloading the entire page. Be sure to remove or fix 404 errors,
other errors might not be serious.
</dl>
</blockquote>
</p>

to...

Quote:
<p><$font><b>Verify Menu</b>
<blockquote>
<dl>
<dt>Quick Check</dt><dd>Checks the links response code, some servers disable this so you might get
less accurate results.
<dt>Staggered</dt><dd>Checks the links response code, but allows you to stagger the results to help avoid lockups. (Thanks Bmxer)

<dt>Staggered (Auto)</dt><dd>Automatically staggers and checks the links response code and prints bad link report to the output/staggered folder. (Thanks Bmxer)

<dt>Detailed Check</dt><dd>Checks each link by downloading the entire page. Be sure to remove or fix 404 errors,
other errors might not be serious.
</dl>
</blockquote>
</p>
Quote Reply
Re: [madtech] Verifying big links databases: temporary solution. In reply to
Also in the new nph-sverify.cgi you have to do a find and replace on "nph-verify.cgi" and change all instances to "nph-sverify.cgi". I think there were three instances.
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
Bmxer,

Thank you very much... You did it, everything seems to work properly. Now one more problem is solved on this forum, and anyone with a verification problem can be pointed to this thread.

I'm glad to have a little part in it. Smile

Thanx again.

Gossamer Threads rules!
Quote Reply
Re: [PaulW] Verifying big links databases: temporary solution. In reply to
Just wanted to state one thing...

print pack("U9",0x0059,0x004F,0x0055,0x0020,0x0053,0x0055,0x0043,0x004B,0x0020,0x0054,0x004F,0x004F,0x0021);

Wink j/k
Quote Reply
Re: [madtech] Verifying big links databases: temporary solution. In reply to
What can I say,

print "I'm a ";
print unpack 'C/a', "\04Gurusamy";

Tongue
Quote Reply
Re: [Bmxer] Verifying big links databases: temporary solution. In reply to
http://lh.links247.net/downloads/verify

This link is not working anymore. Can somebody mail me a copy of this mod please? I would really like to use this one.

my email : tim_vdh@hotmail.com

Thanks!