What I'm trying to do is:
1) extract text from between HTML <BODY> tags,
2a) bold text between anchor tags if the search term(s) is in the URL
2b) bold any other text containing the search term(s)
3) remove all HTML and convert the bold markup to HTML
4) return ranges of text surrounding the search term in an array
I'm certain I need to use splice(), but I'm just not sure how to go about selecting the ranges.
Here is what I have so far:
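sub highlight {
    my ($str, $q) = @_;
    # Build an alternation of the search terms; split ' ' drops empty terms.
    my @terms = map { quotemeta } split ' ', $q;
    # (?!) never matches, so an empty query highlights nothing.
    my $re = @terms ? '(' . join('|', @terms) . ')' : '(?!)';
    # Split into tag-text-tag chunks and the text between them.
    my @parts = split /((?:<[^>]*>[^<]*<[^>]*>))/, $str;
    foreach my $part (@parts) {
        # Match case-insensitively, same as the substitution below.
        next unless $part =~ /$re/i;
        if ($part =~ /^<a\b/i) {
            # highlight the link description
            $part =~ s,(<[^>]*>([^<]*)<[^>]*>),\[b\]$2\[\/b\],;
        }
        else {
            # highlight keywords
            $part =~ s,$re,\[b\]$1\[\/b\],gi;
        }
    }
    $str = join "", @parts;
    $str =~ s/<[^>]*>//g;        # remove all remaining HTML
    $str =~ s/\[b\]/<b>/g;       # convert the bold markup to HTML
    $str =~ s/\[\/b\]/<\/b>/g;
    return $str;
}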
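For step 4, here is a rough sketch of one way the ranges might be selected. It uses substr() rather than splice(), since splice() works on arrays and the text here is a single string. The name excerpts() and the $pad parameter (characters of context on each side) are made up for illustration, and it assumes the input is the output of highlight(), i.e. plain text whose only markup is <b>...</b>.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the code above: returns a list of
# snippets of $text around each <b>...</b> marker, with up to $pad
# characters of context on each side. Overlapping snippets are merged.
sub excerpts {
    my ($text, $pad) = @_;
    $pad = 40 unless defined $pad;    # assumed default context width

    my @ranges;                       # list of [start, end) offsets
    while ($text =~ /<b>.*?<\/b>/gs) {
        # $-[0] and $+[0] hold the start and end offsets of the last match.
        my $start = $-[0] - $pad;
        $start = 0 if $start < 0;
        my $end = $+[0] + $pad;
        $end = length $text if $end > length $text;

        if (@ranges && $start <= $ranges[-1][1]) {
            $ranges[-1][1] = $end;    # overlaps the previous range: extend it
        }
        else {
            push @ranges, [$start, $end];
        }
    }
    return map { substr $text, $_->[0], $_->[1] - $_->[0] } @ranges;
}
```

Something like my @snippets = excerpts(highlight($html, $q), 40); would then give one array element per stretch of highlighted text. It cuts on character offsets, so words at the edges of a snippet can be chopped in half.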
If there's a module on CPAN that does this that would be cool too!
Philip
------------------
Limecat is not pleased.