Which would be better for parsing only the title and meta description from a site? I want the more commonly installed module, so that I can use it on anyone's site.
HTML::Parse, HTML::Parser, HTML::TokeParser, or should I write my own?
I am currently using LWP::Simple to fetch the page into a local variable.
I want the parser to be smart enough to recognise possibly missing meta tags, and variations on the meta tag.
Like:
Code:
# Treat HTTP-EQUIV like NAME so both forms are handled the same way
$metadata =~ s/HTTP-EQUIV/name/gi;

# Keep only the text before the first ">"
if ($metadata =~ m/>/gi) {
    $end_metad = pos($metadata);
    $metadata = substr($metadata, 0, $end_metad - 1);
}

# Strip the Content=" prefix in its spacing variations, then any remaining quotes
$metadata =~ s/Content="//gi;
$metadata =~ s/Content = "//gi;
$metadata =~ s/Content= "//gi;
$metadata =~ s/Content ="//gi;
$metadata =~ s/"//g;

# Description: what is left after removing the NAME=DESCRIPTION marker
if (($metadata =~ /NAME=DESCRIPTION/i) or
    ($metadata =~ /NAME = DESCRIPTION/i) or
    ($metadata =~ /NAME =DESCRIPTION/i) or
    ($metadata =~ /NAME= DESCRIPTION/i))
{
    $metadiz = $metadata;
    $metadiz =~ s/NAME=DESCRIPTION//gi;
    $metadiz =~ s/NAME = DESCRIPTION//gi;
    $metadiz =~ s/NAME= DESCRIPTION//gi;
    $metadiz =~ s/NAME =DESCRIPTION//gi;
    $metadiz =~ s/\n//g;
}

# Keywords: same approach with the NAME=KEYWORDS marker
if (($metadata =~ /NAME=KEYWORDS/i) or
    ($metadata =~ /NAME = KEYWORDS/i) or
    ($metadata =~ /NAME =KEYWORDS/i) or
    ($metadata =~ /NAME= KEYWORDS/i))
{
    $metakeyw = $metadata;
    $metakeyw =~ s/NAME=KEYWORDS//gi;
    $metakeyw =~ s/NAME = KEYWORDS//gi;
    $metakeyw =~ s/NAME= KEYWORDS//gi;
    $metakeyw =~ s/NAME =KEYWORDS//gi;
    $metakeyw =~ s/\n//g;
}
}
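
For comparison, here is a minimal sketch of how the same extraction might look with HTML::TokeParser. The helper name extract_meta and the sample HTML are made up for illustration, and treating http-equiv the same as name is an assumption carried over from the regex code above. It is only a sketch of the approach, not a tested replacement.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: returns (title, description, keywords).
# Any of the three is undef when the page does not provide it.
sub extract_meta {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html) or return;
    my ($title, %meta);

    # Visit each <title> and <meta> tag; stop at </head> if there is one
    while (my $tag = $p->get_tag('title', 'meta', '/head')) {
        my ($name, $attr) = @$tag;
        last if $name eq '/head';

        if ($name eq 'title') {
            # Text up to </title>, with surrounding whitespace trimmed
            $title = $p->get_trimmed_text('/title') unless defined $title;
            next;
        }

        # Assumption: accept name= and http-equiv= alike, as the regex code does
        my $key = defined $attr->{name} ? $attr->{name} : $attr->{'http-equiv'};
        next unless defined $key;
        $key = lc $key;

        # Keep the first occurrence of each meta name
        $meta{$key} = $attr->{content} unless exists $meta{$key};
    }
    return ($title, $meta{description}, $meta{keywords});
}

# Made-up sample input with mixed case and spacing around "="
my $html = <<'END_HTML';
<html><head>
<TITLE> Example page </TITLE>
<META NAME = "Description" CONTENT = "A test page">
</head><body></body></html>
END_HTML

my ($title, $desc, $keyw) = extract_meta($html);
for my $pair (['title', $title], ['description', $desc], ['keywords', $keyw]) {
    my ($label, $value) = @$pair;
    print "$label: ", (defined $value ? $value : '(missing)'), "\n";
}
```

The parser lowercases tag and attribute names and copes with quoting and spacing around "=", so the four-way spacing variations do not have to be listed by hand, and a missing tag simply comes back as undef. In real use, $html would be the string already fetched with LWP::Simple.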
http://www.iuni.com/...tware/web/index.html
Links Plugins