Gossamer Forum

Checking links with LWP::UserAgent

Hi

I've written the sub below to check all URLs in my database and return an error message for each individual URL if the status code returned is not 200.

The sub works, but it is slow. I've got 20 entries in my database at the moment, and the sub takes just over 11 seconds to check all 20 entries.

Does anyone know of a (much) faster way of doing this? Is LWP::UserAgent the way to go, or is the way I'm accessing my MySQL slow?

Would selecting only the url column be a lot faster than selecting * (all fields)? Should I fetch all the results into an array or hash first and then output them, instead of outputting them one by one?
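One way to tell whether the database or the HTTP requests are the slow part is to time the two separately. A minimal sketch, assuming Time::HiRes is installed; `timed` is a hypothetical helper, and the two anonymous subs are stand-ins for the real MySQL fetch and the real LWP request loop:

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: run a code block and return (elapsed seconds, result).
sub timed {
    my ($code) = @_;
    my $start  = time;
    my $result = $code->();
    return (time - $start, $result);
}

# Stand-ins for the real work: the first block would fetch the URLs
# from MySQL, the second would run the LWP request loop over them.
my ($db_secs, $urls) = timed(sub { [ 'http://example.com/' ] });
my ($http_secs)      = timed(sub { sleep(0.05) for @$urls; 1 });

printf "database: %.2fs, http: %.2fs\n", $db_secs, $http_secs;
```

If the second number dominates, changing the SELECT is unlikely to make much difference.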

Thanks for any help you can give me!

Code:
use LWP::UserAgent;
use Time::HiRes qw(time); # built-in time() only has 1-second resolution

$ua = LWP::UserAgent->new;
$ua->agent("OpticDB LinkCheck/0.1");

&connect_to_db;

my $clock_start = time; # start timer

$sth = $dbh->prepare("SELECT * FROM $DB_MYSQL_NAME WHERE res_type = 'Online' ORDER BY id");
$sth->execute ();

my $count = 0;

while (my $ref = $sth->fetchrow_hashref ())
{
    # request each URL and record the status line
    my $req = HTTP::Request->new(GET => $ref->{'url_en'});
    my $res = $ua->request($req);

    $res_id = $ref->{id};
    $res_code = $res->code;
    $res_msg = $res->message;

    # anything other than 200 counts as a dead link
    unless ($res_code eq "200") {

        $count++;

        $tmpl_show_record .= qq|

.. html to show erroneous records goes here ...
|;
    }
}

$sth->finish();

$num_dead = $count;

if ($count == 0) {
    $dbh->disconnect; # release the connection before exiting early
    &error_html("No dead links found!");
    exit;
}

my $clock_finish = time - $clock_start; # stop timer and compute elapsed seconds
$time_taken = sprintf ("%.2f", $clock_finish); # round time to 2 decimal places

$dbh->disconnect;
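If the time turns out to go mostly on waiting for remote servers, two common adjustments are a shorter timeout and a HEAD request instead of GET, which skips downloading the page body. A rough sketch, assuming the target servers answer HEAD properly (some do not, so a GET fallback may be needed); `check_url` and the 10-second timeout are illustrative choices, not part of the sub above:

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: returns (status code, status message) for one URL.
sub check_url {
    my ($ua, $url) = @_;
    my $res = $ua->head($url);   # HEAD: headers only, no body download
    return ($res->code, $res->message);
}

my $ua = LWP::UserAgent->new;
$ua->agent("OpticDB LinkCheck/0.1");
$ua->timeout(10);                # abort after 10 seconds of inactivity (default is 180)

# Example call; http://example.com/ is a placeholder URL.
# my ($code, $msg) = check_url($ua, 'http://example.com/');
```

This still checks the links one after another, so the total time remains roughly the sum of the individual response times.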

- wil

Last edited by Wil: Dec 5, 2001, 2:59 AM