Gossamer Forum

Anyone up for a challenge?

I'll admit it. I'm very very lazy.

Does anyone fancy modifying verify.nph (?) that comes with the Links program so I can run it on my custom-made database? Oh, pretty please! It's a nice and easy challenge for you ;-)

My database (can I really call such a clumsy thing a database?) is a directory containing 300+ text files. Each text file is a database record. The second field in every record is the URL field that I need to check for errors.

Anyone got any snippets of code handy?

Thanks!

Wil

Re: Anyone up for a challenge? In reply to
If all the files are in the same directory then you could possibly use something like:

Code:
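my $path = "/path/to/db";

# Collect the names of all .txt files in the directory.
opendir(my $dh, $path) || die "Cannot open $path: $!";
my @dbs = grep { /\.txt$/ } readdir($dh);
closedir($dh);

my @urls;
foreach my $file (@dbs) {
    open(my $fh, '<', "$path/$file") || die "Cannot open $path/$file: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /\|/, $line;   # fields are pipe-delimited
        push @urls, $fields[1];           # the second field is the URL
    }
    close($fh);
}
That should put all the URLs from each file into @urls, and then you can just use the existing code to check each URL within @urls.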

That was just a quick effort and I'll no doubt get groaned at for using a foreach loop but that is the best I could think of. I'm sure some better ideas will be posted in due course.


Installs:http://wiredon.net/gt
FAQ:http://www.perlmad.com

Re: Anyone up for a challenge? In reply to
Hi Paul

Thank you for your efforts. I have plugged your suggestions into what I had already and come up with this.

Code:
opendir (DIR,"$E_DATA_DIR");
@files = grep (!/^\.\.?$/, readdir (DIR));
closedir (DIR);

foreach $file (@files) {

    open (FILE1,"$E_DATA_DIR/$file");
    @filedata1 = <FILE1>;
    close (FILE1);

    $fields1 = join('',@filedata1);

    @fields1 = split(/\|/,$fields1);

    $dat_id = $fields1[0];
    $e_dat_name = $fields1[1];
    $e_dat_url = $fields1[2];

    push @urls_to_check, $e_dat_url;
}
I'll try gluing the above code together with the verify module of Links and see where I get.

Thanks

Wil Stephens

Re: Anyone up for a challenge? In reply to
What do the contents of the text files look like?

Can't you change this:

open (FILE1,"$E_DATA_DIR/$file");
@filedata1 = <FILE1>;
close (FILE1);
$fields1 = join('',@filedata1);
@fields1 = split(/\|/,$fields1);
$dat_id = $fields1[0];
$e_dat_name = $fields1[1];
$e_dat_url = $fields1[2];
push @urls_to_check, $e_dat_url;
}

to......

Code:
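open (FILE1,"$E_DATA_DIR/$file") || die $!;
while (<FILE1>) {
    chomp;
    # Split into a named array; a bare split no longer fills @_ in modern Perl.
    my @fields = split /\|/;
    $dat_id = $fields[0];
    $e_dat_name = $fields[1];
    $e_dat_url = $fields[2];
    push @urls_to_check, $e_dat_url;
}
close (FILE1);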
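Whether the line-by-line loop does the same job depends on the answer to the question about the file contents: if a single record can run over several lines, reading one line at a time would cut it into pieces. Here is a minimal sketch for that case, assuming each file holds exactly one pipe-delimited record. The name read_record is made up for illustration and is not part of Links.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file as ONE pipe-delimited record
# and return its fields as a list.
sub read_record {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    local $/;                              # slurp mode: read the file in one go
    my $record = <$fh>;
    close($fh);
    $record = '' unless defined $record;   # an empty file yields no fields
    $record =~ s/\s+\z//;                  # drop trailing newline/whitespace
    return split /\|/, $record;
}
```

Inside the foreach loop it could be called as my @fields = read_record("$E_DATA_DIR/$file"); and the URL pushed from whichever index holds it.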

Re: Anyone up for a challenge? In reply to
No. Can't hack another man's code here. Darn! ;-)

Ah well. Anyone else got any suggestions or a very simple checking program I could work with?

Rgds

Wil Stephens

Re: Anyone up for a challenge? In reply to
Paul

Yeah. I don't seem to have a problem with creating a subroutine to gather the URLs together for processing. The problem comes in trying to process the URLs. I can't make head or tail of Alex's work, unfortunately, so I need some guidance, or an example sub.

Rgds

Wil Stephens

Re: Anyone up for a challenge? In reply to
You need to look at this bit:

Code:
open (DB, "<$db_file_name") or &cgierr("error in verify_links. unable to open db file: $db_file_name. Reason: $!");
while (<DB>) {
    /^#/ and next; # Skip comment Lines.
    /^\s*$/ and next; # Skip blank lines.
    chomp;
    @data = &split_decode($_);
    if (($data[$db_url] =~ /^http/) or ($data[$db_url] =~ /^ftp/)) {
        my $id = $data[$db_key_pos];
        my $url = $data[$db_url];
        $seen{$url}++ and next;
        ($status, $error) = &check_link ($url);
        $urls{$url} = $id;
        if (exists $ok_status{$status}) {
            $use_html ?
                print qq|Checked <a href="$url" target="_blank">$id</a> - Success ($status). Message: $ok_status{$status}. URL: $url\n| :
                print qq|Checked $id - Success ($status). Message: $ok_status{$status}\n|;
        }
As you can see, this code searches through links.db for the URLs. Since you already have them in an array, you can remove big chunks of this code and replace them with a loop over your array instead.
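To make that concrete, here is a rough sketch of looping over an array of URLs rather than reading links.db. The names verify_urls and check_link_stub are made up for illustration. The stub only stands in for the real check_link routine, which is assumed (from the excerpt above) to return a status and an error for a URL.

```perl
use strict;
use warnings;

# Hypothetical stand-in for check_link from the verify script.
# Assumption: the real routine returns a (status, error) pair for a URL.
sub check_link_stub {
    my ($url) = @_;
    return $url =~ m{^(?:http|ftp)}i ? (200, '') : (-1, 'unsupported scheme');
}

# Loop over a list of URLs, skip blanks and duplicates, and collect
# one result hash per distinct URL.
sub verify_urls {
    my ($checker, @urls) = @_;
    my (%seen, @results);
    for my $url (@urls) {
        next unless defined $url && length $url;   # skip empty fields
        next if $seen{$url}++;                      # skip URLs already checked
        my ($status, $error) = $checker->($url);
        push @results, { url => $url, status => $status, error => $error };
    }
    return @results;
}

my @urls_to_check = ('http://example.com/', 'http://example.com/', 'gopher://old.example/');
for my $r (verify_urls(\&check_link_stub, @urls_to_check)) {
    print "Checked $r->{url} - status $r->{status} $r->{error}\n";
}
```

Passing the checker as a code reference means the stub can be swapped for the real routine without touching the loop.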


Re: Anyone up for a challenge? In reply to
Paul

Yeah. I've looked through the code. And it's obvious that I will need to re-write something from scratch to do what I want to do.

The only thing I need from Alex's script (or any other script) is how to use the LWP module. How do you access the module and check HTTP headers with it? The documentation ain't that good, and I was hoping I could just pinch a few lines from someone else's code. But it doesn't seem that simple.

Ah well, back to the manuals.

Thanks
Wil Stephens
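
For anyone landing here with the same LWP question, below is a minimal sketch of checking a URL's headers with LWP::UserAgent. It is not taken from the Links verify script. The helper names check_url and describe_response and the 15-second timeout are assumptions chosen for illustration. It sends a HEAD request, so only the headers come back, and reports the status line.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Turn an HTTP::Response into a one-line report (hypothetical helper).
sub describe_response {
    my ($url, $res) = @_;
    my $verdict = $res->is_success ? 'OK' : 'FAILED';
    return sprintf '%s %s (%s)', $verdict, $url, $res->status_line;
}

# Send a HEAD request so only the headers are fetched, not the page body.
sub check_url {
    my ($ua, $url) = @_;
    return describe_response($url, $ua->head($url));
}

# Run the checks only when executed as a script, with URLs as arguments.
unless (caller) {
    my $ua = LWP::UserAgent->new(timeout => 15);   # timeout value is an arbitrary choice
    print check_url($ua, $_), "\n" for @ARGV;
}
```

Saved as, say, check.pl (a made-up file name), it would be run as perl check.pl followed by one or more URLs. Some servers do not handle HEAD well, so retrying a failed URL with $ua->get($url) may be worth considering.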