Gossamer Forum

regex troubles

Sometimes I'm amazed at what I can accomplish with a simple regex, and other times I just feel like a moron. I've been struggling with this one for a good while and I just can't get it to work. Basically, I need a regex that strips any empty HTML elements out of a string. An element counts as empty if there is nothing between its opening and closing tags except whitespace or other empty elements: no text, no images, and no elements that themselves contain text or images. I'm using PHP, but if any Perl wizards feel like giving it a whirl, I'm sure I can translate.

My embarrassingly crude and incomplete best effort so far follows:

Code:
$string = preg_replace("(<li>\s*</li>|<ul>\s*</ul>|<font[^>]*>( |\n|\r|\t)*</font>|<b>( |\n|\r|\t)*</b>|<td[^>]*>( |\n|\r|\t)*</td>|<tr[^>]*>( |\n|\r|\t)*</tr>|<table[^>]*>( |\n|\r|\t)*</table>)", "", $string);


This works for the <li> items, but that's about it. It seems to stop at that point.

Much gratitude to any kind soul who can lend a hand.
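
One possible direction, offered as a rough sketch rather than a tested fix: a single preg_replace pass can only remove elements that are already empty when the scan reaches them, so a <td> emptied in one pass leaves its <tr> and <table> behind. Repeating the replacement until nothing more matches handles that nesting. The function name strip_empty_elements and its tag list are hypothetical, and treating &nbsp; as "empty" is an assumption, not something stated in the post. A regex like this will also trip over attribute values that contain ">", so a real HTML parser is the safer route for messy markup.

```php
<?php
// Hypothetical helper (not from the original post): removes elements whose
// content is only whitespace or &nbsp;, repeating until no more are found so
// that parents emptied by one pass are removed by the next.
function strip_empty_elements(
    string $html,
    array $tags = ['li', 'ul', 'font', 'b', 'td', 'tr', 'table']
): string {
    $names = implode('|', array_map('preg_quote', $tags));

    // <tag ...>, then only whitespace or &nbsp; (assumption), then </tag>.
    // \b keeps "b" from matching "<br>"; \1 ties the closing tag to the opener.
    $pattern = '~<(' . $names . ')\b[^>]*>(?:\s|&nbsp;)*</\1\s*>~i';

    do {
        // $count receives the number of replacements made in this pass.
        $html = preg_replace($pattern, '', $html, -1, $count);
    } while ($count > 0);

    return $html;
}

// Usage: the empty <td> goes first, then its <tr>, then the <table>.
$string = '<table><tr><td> </td></tr></table><ul><li></li><li>keep</li></ul>';
echo strip_empty_elements($string), PHP_EOL; // <ul><li>keep</li></ul>
```

The loop is the important part. The exact pattern is a matter of taste, and the tag list can be trimmed or extended to suit the markup in question.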

Subject | Author | Views | Date
regex troubles | hennagaijin | 4784 | Apr 18, 2003, 5:30 PM
Re: [hennagaijin] regex troubles | hennagaijin | 4511 | Apr 21, 2003, 4:57 AM
Re: [hennagaijin] regex troubles | Paul | 4479 | Apr 21, 2003, 5:38 AM
Re: [Paul] regex troubles | hennagaijin | 4487 | Apr 21, 2003, 6:19 AM
Re: [hennagaijin] regex troubles | Paul | 4490 | Apr 21, 2003, 5:31 AM