Dec 24, 2000, 11:12 AM
waffle
User
(69 posts)
Post #1 of 7
Regular Expression
Does anyone know what regex to use to print out all of the tags in an HTML file? Thanks.
Adrian
Dec 24, 2000, 12:11 PM
dan
Enthusiast
(760 posts)
Post #2 of 7
Re: Regular Expression
Try this (untested):
$text =~ s/<[^>]*>//g;   # [^>] already matches spaces and newlines
Dan
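For readers who do want the tags removed, as the substitution above does, here is a minimal sketch. The sample string in `$text` is made up for illustration. A regex like this is only an approximation of HTML parsing: it will misbehave on comments, scripts, or attribute values that contain `>`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input, not from the thread.
my $text = "<p>Hello, <b>world</b>!</p>\n<img\n src=\"blah.gif\">";

# [^>] matches any character except '>', including newlines,
# so a tag that spans several lines is removed as well.
(my $plain = $text) =~ s/<[^>]*>//g;

print "$plain\n";
```

Copying into `$plain` before substituting leaves the original `$text` untouched, which is handy if the markup is still needed later.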
Dec 24, 2000, 3:35 PM
jsu
User
(381 posts)
Post #3 of 7
Re: Regular Expression
That would delete all the HTML tags in an HTML file.
Use this instead
[I have a feeling it won't work]:
#!/usr/bin/perl
$file = "whatever.html";
open FILE, "<", $file or die "Cannot open $file: $!";
while (<FILE>) { $html .= $_; }
close FILE;
print "Content-type: text/plain\n\n";
foreach $tag ($html =~ /<(.+?)>/) {
print "$tag\n";
}
If I am correct, that will print out the tags.
For example, for:
<img src="blah.gif">
it will print out:
img src="blah.gif"
If you just want "img", change:
foreach $tag ($html =~ /<(.+?)>/) {
to
foreach $tag ($html =~ /<([^\s>]+)[^>]*>/) {
Jerry Su
widgetz sucks
Dec 24, 2000, 3:53 PM
dan
Enthusiast
(760 posts)
Post #4 of 7
Re: Regular Expression
Jerry:
I assumed he meant he wanted the HTML tags removed - i.e., convert HTML to text.
Adrian: Could you clarify?
Dan
Dec 24, 2000, 6:13 PM
waffle
User
(69 posts)
Post #5 of 7
Re: Regular Expression
Hi Dan,
Sorry for the confusion. Jerry's got what I tried to say. Thanks a lot to both of you for your help.
Regards,
Adrian
Dec 24, 2000, 7:04 PM
waffle
User
(69 posts)
Post #6 of 7
Re: Regular Expression
Hi Jerry,
Thanks for the code. However, it only printed the first HTML tag. I made a few changes:
#!/usr/bin/perl
$file = "whatever.html";
open FILE, "<", $file or die "Cannot open $file: $!";
while (<FILE>) {
    # /g in list context returns every tag name found on the line
    @tags = /<(\/?\w+)/g;
    foreach (@tags) { print "$_\n" }
}
close FILE;
Thanks,
Adrian
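The line-by-line approach above can be tried without a file on disk. This is a sketch under stated assumptions: the `$sample` document is invented, and it is read through an in-memory filehandle (opening a reference to a scalar) purely so the example is self-contained. It collects tag names, including the leading `/` of closing tags.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample document, read through an in-memory filehandle
# so the sketch runs without a real HTML file.
my $sample = "<html>\n<body><img src=\"blah.gif\">\n<b>bold</b></body>\n</html>\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my @names;
while (my $line = <$fh>) {
    # In list context, /g returns the capture from every match on the line.
    push @names, $line =~ /<(\/?\w+)/g;
}
close $fh;

print "$_\n" for @names;
```

Because the pattern only needs `<` and the tag name to sit on the same line, a tag whose attributes wrap onto the next line is still found. A stray `<` followed by a word in ordinary text would be reported as a tag too, so treat the output as a rough listing.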
Dec 25, 2000, 5:44 PM
jsu
User
(381 posts)
Post #7 of 7
Re: Regular Expression
#!/usr/bin/perl
$file = "whatever.html";
open FILE, "<", $file or die "Cannot open $file: $!";
while (<FILE>) { $html .= $_; }
close FILE;
print "Content-type: text/plain\n\n";
foreach $tag ($html =~ m#</?(.+?)>#g) {
print "$tag\n";
}
I forgot the /g! No!!!! Without it, the match returns only the first tag.
Jerry Su
widgetz sucks
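The difference the `/g` modifier makes in list context can be seen side by side. A minimal sketch, assuming a made-up `$html` string and reusing the `m#</?(.+?)>#` pattern from the post above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input.
my $html = '<p>One <b>two</b></p>';

# Without /g, a list-context match returns only the captures
# of the first match.
my @first = $html =~ m#</?(.+?)>#;

# With /g, it returns the captures of every match in the string.
my @all = $html =~ m#</?(.+?)>#g;

print scalar(@first), " tag without /g: @first\n";
print scalar(@all), " tags with /g: @all\n";
```

Since the optional `/` sits outside the capture group, opening and closing tags both show up under the same name here.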