Gossamer Forum

Extracting text data from HTML page

DBMan has put me on a 'Learning Perl' path. But even surrounded by all
my Perl Cookbooks etc., I am still finding a few basic things quite
difficult. Hopefully you can help!

I am trying to extract several sets of data from an HTML page.
Each set is located between <!--START--> and <!--END--> comment markers:

Code:
open (MYFILE,"/info.html");
$i = 1;
print "Content-type: text/html\n\n";

while (<MYFILE> ) {
if ($i < 4) {
print if (/<!--START-->/ ... /<!--END-->/ | $i++);
}
}

This works fine and prints out exactly what I want to
extract from the HTML page. However, I want to put that
same data into a variable, so I changed the 'while' loop to:

Code:
while (<MYFILE> ) {
if ($i < 4) {
($lines .= <MYFILE> ) if (/<!--START-->/ ... /<!--END-->/ | $i++);
}
}
print $lines; ### To see what data is in $lines

But this produces a completely different, wrong, result.
What am I doing wrong trying to assign the data to $lines ?
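
One likely cause: `while (<MYFILE>)` has already read the current line into `$_`, so the second `<MYFILE>` inside the loop body reads the *following* line from the file instead of the one that was just tested. Appending `$_` (or a named loop variable) avoids that. Below is a minimal sketch of the idea, not a drop-in fix: the sample HTML, the in-memory filehandle, and the names `$html`, `$fh` and `$line` are assumptions made so the example runs on its own, and the `$i` counter from the original loop is left out for simplicity.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for info.html.
my $html = join '',
    "<html>\n",
    "<!--START-->\n",
    "first block\n",
    "<!--END-->\n",
    "ignored\n",
    "<!--START-->\n",
    "second block\n",
    "<!--END-->\n",
    "</html>\n";

# Read from an in-memory filehandle so the sketch needs no real file.
open(my $fh, '<', \$html) or die "Cannot open in-memory file: $!";

my $lines = '';
while (my $line = <$fh>) {
    # Append the line the loop has already read; do not read from $fh again.
    $lines .= $line if $line =~ /<!--START-->/ ... $line =~ /<!--END-->/;
}
close($fh);

print $lines;    # To see what data is in $lines
```

With this input, `$lines` should hold both marked blocks, including the marker lines themselves, and skip everything outside them.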