Gossamer Forum

A perl code question

I have a Perl code question for you guys. The command

$value =~ s/<([^>]|\n)*>//g;

removes all HTML tags from a document. What would have to be changed so that it only removes the <title> tag?

Please help

Matt
Re: A perl code question
Since regular expressions are often more of an art than a science, here is my take on it. Try the following; it removes only the title tags themselves, not the title text between <title> and </title>:

$value =~ s/<(title|\/title)([^>]|\n)*>//g;

Here is how I read this regex. The leading < matches the start of a tag, and (title|\/title) restricts the match to tags whose name starts with title or /title. Then ([^>]|\n)* consumes everything up to the closing >: [^>] matches any character other than >, and the \n alternative allows for line breaks inside the tag (strictly speaking, [^>] already matches a newline, so that part is redundant but harmless). The final > matches the end of the tag, and the g modifier repeats the substitution for every match. Two caveats: the match is case-sensitive, so add the i modifier if the document might contain <TITLE>, and it would also match any other tag whose name begins with title.
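
For anyone who wants to try this out, here is a minimal sketch of a slightly tightened variant. It is not the only way to write it. The helper name strip_title_tags and the sample input are made up for illustration. It assumes you want a case-insensitive match and want tags such as <titlebar> left alone, which is why it adds the i modifier and a \b word boundary.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): removes only the
# <title ...> and </title> tags and keeps the text between them.
sub strip_title_tags {
    my ($html) = @_;
    # <\/?     opening bracket plus an optional slash (open or close tag)
    # title\b  the tag name, followed by a word boundary so that a tag
    #          like <titlebar> is not matched
    # [^>]*    anything up to the closing bracket, newlines included
    # >        the closing bracket
    # gi       every occurrence, case-insensitive
    $html =~ s/<\/?title\b[^>]*>//gi;
    return $html;
}

# Sample input (an assumption for this demo): upper-case tag name,
# a line break inside the tag, and other markup that must survive.
my $value = "<html><head><TITLE\nid=\"t\">My Page</title></head>"
          . "<body><b>Hi</b></body></html>";

print strip_title_tags($value), "\n";
# prints: <html><head>My Page</head><body><b>Hi</b></body></html>
```

Like any regex approach to HTML, this can be fooled by a > inside an attribute value or by markup inside comments. If the input is not under your control, a real parser such as the HTML::Parser module from CPAN is the safer route.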

Hope this helps,


------------------
Fred Hirsch
Web Consultant & Programmer