Gossamer Forum

Different regexp question: HTML

Okay...

I added the following regexp in one of my module files:

Quote:

my $in = shift;
$in =~ s,<(.*?)>,\&lt\;$1\&gt\;,gs;
$in =~ s/\n\n/<p>/g;   # paragraph breaks first; once single newlines are replaced, no \n\n is left to match
$in =~ s/\n/<br>/g;


Now, this works as it should in terms of adding and editing posts.

The problem is when viewing posts, the HTML codes show up, of course, so I use the following codes in another sub for viewing posts:

Quote:

$in =~ s,<(.*?)>,\&lt\;$1\&gt\;,gs;


This works, however, line break <br> and paragraph break <p> only show up as HTML codes, and the spacing is lost.

Any advice is welcome.

Thanks in advance.
========================================
Buh Bye!

Cheers,
Me
Re: [Chewbaca] Different regexp question: HTML In reply to
I'm not sure what the problem is, but you shouldn't (or don't need to) escape &'s and ;'s ... that can sometimes cause unexpected results.

Also I'd change:

$in =~ s,<(.*?)>,\&lt\;$1\&gt\;,gs;

to...

$in =~ s,<([^>]*)>,&lt;$1&gt;,gs;

...to be safe.
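
A minimal sketch of the negated-class version, for anyone following along. The sub name escape_tags is made up for illustration and is not from the original module. The point of [^>]* is that it cannot run past the closing >, so each tag is matched on its own:

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): escape every <...> tag
# so it is displayed as text instead of being interpreted.
sub escape_tags {
    my ($in) = @_;
    # [^>]* stops at the first '>', so one match never spans two tags.
    # No backslashes are needed before & or ; in the replacement.
    $in =~ s,<([^>]*)>,&lt;$1&gt;,g;
    return $in;
}

print escape_tags("<b>bold</b> and <script>test</script>"), "\n";
# prints: &lt;b&gt;bold&lt;/b&gt; and &lt;script&gt;test&lt;/script&gt;
```

Note that this only touches complete <...> pairs; a lone < with no closing > is left as it is.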
Re: [Chewbaca] Different regexp question: HTML In reply to
Quote:
This works, however, line break <br> and paragraph break <p> only show up as HTML codes

Well, it will, since you are converting the tags to &lt; and &gt;. They need to be < and > for the HTML to be interpreted, surely?


Have I misunderstood?


Re: [RedRum] Different regexp question: HTML In reply to
That is correct, Paul.

However, I want all HTML tags except for < p > and < br > to be converted to &lt;something&gt;

Look at the following URI:

http://www.anthrotech.com/...topic=50&print=1

The post shows as:

Quote:

</td></tr></table> <br><br><script>test</script> <br><br>test


The post should show like the following:

Quote:

</td></tr></table>

<script>test</script>

test


Does this make better sense?

Thanks for the reply. I do appreciate it.

========================================
Buh Bye!

Cheers,
Me

Last edited by:

Chewbaca: Nov 13, 2001, 8:34 PM
Re: [Chewbaca] Different regexp question: HTML In reply to
First you convert newlines to HTML...
Then you convert HTML to plain text...
Surely it's just a logic problem?

I guess you could either hold off converting newlines until you display posts, or convert plain text back into HTML (could be annoying)

s~\&lt\;br\&gt\;~<br>~g;
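
The first option, holding off on the newline conversion until display time, could look roughly like this. It is only a sketch: format_for_view is a made-up name, and it assumes the post is stored as raw text with line endings already normalized to \n:

```perl
use strict;
use warnings;

# Hypothetical display-time formatter (name is an assumption).
# The stored post is raw text; nothing is converted when it is saved.
sub format_for_view {
    my ($in) = @_;
    $in =~ s,<([^>]*)>,&lt;$1&gt;,g;   # 1. escape whatever tags the user typed
    $in =~ s/\n\n/<p>/g;                # 2. blank line becomes a paragraph break
    $in =~ s/\n/<br>/g;                 # 3. any remaining newline becomes a line break
    return $in;
}

print format_for_view("</table>\n\n<script>test</script>\ntest"), "\n";
# prints: &lt;/table&gt;<p>&lt;script&gt;test&lt;/script&gt;<br>test
```

Because the <br> and <p> tags are added after the escaping step, they are never escaped themselves.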

BTW, doesn't Chewbacca have two C's?

- Mark

Astro-Boy!!
http://www.zip.com.au/~astroboy/

Last edited by:

AstroBoy: Nov 13, 2001, 9:11 PM
Re: [AstroBoy] Different regexp question: HTML In reply to
Ugh, you escaped the &'s and ;'s as well.

s~&lt;br&gt;~<br>~g;

...will work.
Re: [Chewbaca] Different regexp question: HTML In reply to
If you want p's and br's try:

$in =~ s,&lt;(br|p)&gt;,<$1>,sig;

Last edited by:

RedRum: Nov 14, 2001, 4:51 AM
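
Put together with the escaping step, the restore regexp above might be used like this when viewing a post. This is a sketch, not the actual module code; view_post is a made-up name, and the sample string is the post quoted earlier in the thread:

```perl
use strict;
use warnings;

# Hypothetical view-time filter (name is an assumption) for posts that
# were stored with <br> and <p> already inserted.
sub view_post {
    my ($in) = @_;
    $in =~ s,<([^>]*)>,&lt;$1&gt;,g;    # escape every tag
    $in =~ s,&lt;(br|p)&gt;,<$1>,ig;    # then restore only bare <br> and <p>
    return $in;
}

print view_post("</td></tr></table> <br><br><script>test</script> <br><br>test"), "\n";
# prints: &lt;/td&gt;&lt;/tr&gt;&lt;/table&gt; <br><br>&lt;script&gt;test&lt;/script&gt; <br><br>test
```

Two things to be aware of: a <br> or <p> that the user typed by hand is restored too, and a tag with attributes such as <p onclick=x> stays escaped because only the bare forms match.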
Re: [AstroBoy] Different regexp question: HTML In reply to
Thanks, Andy...It is NOT a logic problem...it is a regexp problem.

Got it?
========================================
Buh Bye!

Cheers,
Me
Re: [RedRum] Different regexp question: HTML In reply to
Thanks, Paul...That should work! I'll give it a shot this evening.
========================================
Buh Bye!

Cheers,
Me
Re: [Chewbaca] Different regexp question: HTML In reply to
In Reply To:
Thanks, Andy...It is NOT a logic problem...it is a regexp problem.

Got it?

You change newlines to HTML, and then wonder why turning HTML into text destroys the BRs...

Maybe logic was the wrong word - all I meant was it could have been planned ahead a bit better. Eg, you could turn newlines into a temporary marker (like DBMan does) and then convert the markers back into BR's for viewing.
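
Roughly what the temporary-marker idea could look like. The marker strings [[BR]] and [[P]] and the sub names are invented for this sketch (they are not what DBMan or the original module uses); any marker works as long as it contains no < or > and cannot occur in a normal post:

```perl
use strict;
use warnings;

# Hypothetical markers (assumptions, chosen only for illustration).
my $BR = '[[BR]]';
my $P  = '[[P]]';

# Run when a post is added or edited.
sub store_post {
    my ($in) = @_;
    $in =~ s/\n\n/$P/g;    # paragraph breaks first
    $in =~ s/\n/$BR/g;     # then single line breaks
    return $in;
}

# Run when a post is displayed.
sub render_post {
    my ($in) = @_;
    $in =~ s,<([^>]*)>,&lt;$1&gt;,g;   # markers have no angle brackets, so they survive
    $in =~ s/\Q$P\E/<p>/g;             # \Q...\E quotes the brackets in the marker
    $in =~ s/\Q$BR\E/<br>/g;
    return $in;
}

print render_post(store_post("one\ntwo\n\n<b>three</b>")), "\n";
# prints: one<br>two<p>&lt;b&gt;three&lt;/b&gt;
```

The trade-off is that a user who types the marker text literally gets a line break, so the marker should be something nobody would type by accident.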

Oh well, at least you fixed it :)

Cheers,

- Mark

Astro-Boy!!
http://www.zip.com.au/~astroboy/
Re: [AstroBoy] Different regexp question: HTML In reply to
Again, Andy, the problem is not with the process (I already replace strings with forum markup language, with the exception of line break and paragraph break codes).

Goodbye!
========================================
Buh Bye!

Cheers,
Me
Re: [Chewbaca] Different regexp question: HTML In reply to
Who's Andy? ;)

Peace

- Mark

Astro-Boy!!
http://www.zip.com.au/~astroboy/