Hi guys. I'm probably being really stupid here, but I'm just trying to figure out how to use GT::HTML::LinkExtractor. The documentation gives this as an example:
Code:
use GT::HTML::LinkExtractor;

my $extractor = new GT::HTML::LinkExtractor(
    wanted_urls   => [@GT::HTML::LinkExtractor::URL_ALL],
    wanted_images => ['http://', 'https://'],
    get_images    => 1,
    get_text      => 1,
    follow_base   => 1,
    base          => 'http://somedomain.com'
);
$extractor->extract( $html ); # Extract from a string
$extractor->extract_from_file( "/path/to/html.htm" ); # Extract from a file
$extractor->extract_from_uri( "http://www.yahoo.com", "http://www.gossamer-threads.com" ); # Extract from a URI
my $links = $extractor->links;
...and with the following HTML:
Code:
<a href="http://link" class="MyClass">text<img src="http://image" alt="My Alt Tag">more text</a>
Code:
{
    href => "http://link",
    class => "MyClass",
    content => [
        "text",
        {
            src => "http://image",
            alt => "My Alt Tag",
        },
        "more text"
    ]
}
I'm just a little unsure how to get ALL the link results out of $links ... maybe $link->{text}[0]->{src} or something?
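For what it's worth, here is a rough sketch of how the walk might look. It ASSUMES that $extractor->links returns an array reference of hashes shaped exactly like the documented example above; that isn't confirmed anywhere in this post. The data is hard-coded so the snippet runs without GT installed, and summarize_link is a made-up helper name, not part of GT::HTML::LinkExtractor. Going by the documented structure, the image would sit at $link->{content}[1]{src} rather than under a text key.

```perl
use strict;
use warnings;

# ASSUMPTION: $extractor->links returns an array reference of hashes shaped
# like the documented example. Hard-coded here so this runs without GT.
my $links = [
    {
        href    => "http://link",
        class   => "MyClass",
        content => [
            "text",
            { src => "http://image", alt => "My Alt Tag" },
            "more text",
        ],
    },
];

# Hypothetical helper (not part of GT::HTML::LinkExtractor): split one link's
# content list into plain text fragments and image hashes.
sub summarize_link {
    my ($link) = @_;
    my (@text, @images);
    for my $item (@{ $link->{content} || [] }) {
        if (ref $item eq 'HASH') {
            push @images, $item;    # image entry with src/alt keys
        }
        else {
            push @text, $item;      # plain text fragment
        }
    }
    return {
        href   => $link->{href},
        text   => join(' ', @text),
        images => \@images,
    };
}

my @summaries = map { summarize_link($_) } @$links;

for my $s (@summaries) {
    print "Link: $s->{href}\n";
    print "  Text: $s->{text}\n";
    print "  Image: $_->{src} (alt: $_->{alt})\n" for @{ $s->{images} };
}
```

Checking ref $item rather than relying on fixed positions means it should not matter whether an image comes before, between, or after the text pieces.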
TIA
Andy (mod)
andy@ultranerds.co.uk
IMPORTANT: I've now moved to ultranerds.co.uk, and the .com will no longer work!

