Gossamer Forum

quick question on substitution in perl

Hi all, I am new to Perl and am processing strings like the following:

<A HREF="#fin14448_7">Notes to Consolidated Financial Statements</A></FONT></TD>

I want to delete everything between '<' and '>' (including the '<' and '>' themselves),

so that the result after the substitution is "Notes to Consolidated Financial Statements".

I used the substitution 's/<.*>//g', but it removes the whole line (it deletes everything between the first '<' and the last '>').

Can anyone help with this? Thanks a lot for any help.
Re: [new2perl] quick question on substitution in perl
Your pattern is greedy: '.*' matches as much as it can, so '<.*>' runs from the first '<' to the last '>' on the line. See http://perldoc.perl.org/perlre.html#Regular-Expressions for more information.
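
A minimal sketch of the difference, run against the line from the question. The helper name strip_with is made up for illustration; only core Perl is assumed.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every match of $pattern from a copy of $text.
sub strip_with {
    my ($pattern, $text) = @_;
    $text =~ s/$pattern//g;
    return $text;
}

my $line = '<A HREF="#fin14448_7">Notes to Consolidated Financial Statements</A></FONT></TD>';

# Greedy: '.*' grabs everything up to the LAST '>', so the whole line goes.
print '[', strip_with(qr/<.*>/, $line), "]\n";     # prints []

# Non-greedy: '.*?' stops at the FIRST '>' after each '<'.
print '[', strip_with(qr/<.*?>/, $line), "]\n";    # prints [Notes to Consolidated Financial Statements]

# Negated character class: same result here, without relying on laziness.
print '[', strip_with(qr/<[^>]*>/, $line), "]\n";  # prints [Notes to Consolidated Financial Statements]
```

Both of the last two forms work on this particular line, but neither one understands quoted attribute values.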

Since you are parsing markup, you don't really want to use regular expressions at all. In HTML, '<img src="blah" alt=">">' is valid (the alt attribute is CDATA, so it may contain '>'), but when you strip the tags using a less greedy RE:

s/<.*?>//g;

you are still left with '">', which isn't what you want.
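
To see that failure concretely, here is a small sketch using only core Perl. The helper names strip_lazy and strip_quote_aware are made up for illustration. The second pattern is just one way to skip over quoted attribute values; it handles this example, but it is still not a parser and assumes well-formed tags with no comments or script blocks, so a real HTML parser remains the safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper: the non-greedy substitution shown above.
sub strip_lazy {
    my ($text) = @_;
    $text =~ s/<.*?>//g;
    return $text;
}

# Hypothetical helper: a tag is '<', then any mix of ordinary characters
# and complete quoted strings, then '>'. A '>' inside quotes is skipped.
sub strip_quote_aware {
    my ($text) = @_;
    $text =~ s/<(?:[^>"']|"[^"]*"|'[^']*')*>//g;
    return $text;
}

my $img = '<img src="blah" alt=">">';

# The lazy match ends at the '>' inside the alt value, leaving debris.
print '[', strip_lazy($img), "]\n";         # prints [">]

# The quote-aware match consumes the whole tag.
print '[', strip_quote_aware($img), "]\n";  # prints []
```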