Gossamer Forum: Products: Links 2.0: Customization: Extract URL and its link title (Plz HELP)

Dec 24, 2000, 11:39 PM

Robbie

User (63 posts)

Post #1 of 8

Extract URL and its link title (Plz HELP)

Hi! Does anyone have any idea how to extract the URLs (of a certain type of file) out of a web page and put them in a flat text database?

For example:
file.html containing the following contents:
<a href="../files/num_1.zip">1st one</a>
<a href="http://www.fff.com/files/num_2.zip">2nd one</a>

to be processed and written to spidered.db:
1st one|../files/num_1.zip
2nd one|http://www.fff.com/files/num_2.zip

I know this is kinda complicated. I tried, but nothing works :(
THANKS IN ADVANCE!!!

Dec 26, 2000, 10:04 AM

Stealth

Veteran (17240 posts)

Post #2 of 8

Re: Extract URL and its link title

Have you looked at the goFetch Modification? Have you read the Threads in this
forum about the goFetch Modification?

The goFetch Modification is located at:

http://lookhard.hypermart.net/links-mods/

Regards,

Eliot Lee

Dec 26, 2000, 7:26 PM

Robbie

User (63 posts)

Post #3 of 8

Re: Extract URL and its link title

Thanks!

But how do I modify goFetch so that it grabs the content between <a> and </a> instead of fetching the address and getting the title of the page?

Thank you.

Dec 27, 2000, 10:40 AM

Bmxer

Veteran (1311 posts)

Post #4 of 8

Re: Extract URL and its link title

The old spider did this. You just need to put something in it to save the results, and you shouldn't copy what I have in goFetch to do it.

Dec 27, 2000, 11:36 AM

Robbie

User (63 posts)

Post #5 of 8

Re: Extract URL and its link title

Does "the old spider" refer to the "Virtual Solutions Links Spider"?

Dec 27, 2000, 6:51 PM

Robbie

User (63 posts)

Post #6 of 8

What's wrong w/ this script?

This is the script, but it prints out nothing. Any ideas?
#!perl
use LWP::Simple;
$URL = "http://www.perl.com";
$src = get($URL);
while ($src =~ m#<a\s+ href\s*=\s*"?([^"] ?)"?>(. ?)</a>#ig) {
($link, $title) = ($1, $2);
$output .= "$title|$link\n";
}
print "$output";

Dec 29, 2000, 8:18 PM

Stealth

Veteran (17240 posts)

Post #7 of 8

Re: What's wrong w/ this script?

The $title and $link variables seem never to be set, presumably because the pattern never matches, so $output stays empty and nothing prints.

Regards,

Eliot Lee
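
One way to check whether the loop body ever runs is to try the pattern on a fixed string, without fetching anything. The sketch below is only a guess at the problem: it assumes the quantifiers in the posted pattern were meant to be "+?" where the post shows a space, and the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Made-up sample link to test against (no network access needed).
my $sample = '<a href="http://www.perl.com/">Perl</a>';

# The pattern exactly as it appears in the posted script.
my $posted = qr{<a\s+ href\s*=\s*"?([^"] ?)"?>(. ?)</a>}i;

# Assumed intent: non-greedy "one or more" quantifiers.
my $guess = qr{<a\s+href\s*=\s*"?([^"]+?)"?>(.+?)</a>}i;

for my $case (['posted', $posted], ['guess', $guess]) {
    my ($name, $re) = @$case;
    if ($sample =~ $re) {
        print "$name: matched, title=$2 link=$1\n";
    } else {
        print "$name: no match\n";
    }
}
```

If the posted form reports "no match" on a simple link like this, the while loop never executes and $output is never filled in.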

Dec 30, 2000, 4:11 AM

Robbie

User (63 posts)

Post #8 of 8

Re: What's wrong w/ this script?

What should I do? Thanks for your help.
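
A possible starting point, offered as a rough sketch rather than a tested fix: the script below runs on the sample HTML from the first post instead of a live page, keeps only links ending in .zip, and prints "title|link" lines. The sub name extract_links, the extra "About" link, and the .zip filter are assumptions for illustration. A regular expression is a blunt tool for HTML and will miss unusual markup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns "title|link" strings for every <a href=...>
# whose target ends in the given extension.
sub extract_links {
    my ($html, $ext) = @_;
    my @rows;
    while ($html =~ m{<a\s+[^>]*?href\s*=\s*["']?([^"'\s>]+)["']?[^>]*>(.*?)</a>}gis) {
        my ($link, $title) = ($1, $2);
        next unless $link =~ /\Q$ext\E$/i;   # keep only the wanted file type
        $title =~ s/<[^>]+>//g;              # drop nested tags such as <b>
        $title =~ s/^\s+|\s+$//g;            # trim surrounding whitespace
        push @rows, "$title|$link";
    }
    return @rows;
}

# Sample input based on the first post. In a real run, $html could come from
# LWP::Simple's get($URL), which returns undef when the fetch fails.
my $html = <<'HTML';
<a href="../files/num_1.zip">1st one</a>
<a href="http://www.fff.com/files/num_2.zip">2nd one</a>
<a href="about.html">About</a>
HTML

my @rows = extract_links($html, '.zip');
print "$_\n" for @rows;
```

Saved under a hypothetical name such as extract.pl, the output could be appended to the flat file from the shell with: perl extract.pl >> spidered.db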