Gossamer Forum

text to html conversion

After opening a text file, I need to use Perl to do the following:

Recognize a normal HTTP URL and turn it into a link to itself,

e.g. http://www.yahoo.com/

-> <a href="http://www.yahoo.com/">http://www.yahoo.com/</a>

Assume that the HTTP URL has the form http://hostname/path, where the path part is optional. The path part of a URL can consist of a variety of characters: [-a-zA-Z0-9_:@&?=+,.!/~*%$]. The hostname contains words that are separated by dots. The dots are strict separators, which means that there must be something between them to separate, e.g. yahoo..com is not a valid hostname. The possible alternatives for the last part of the hostname are: com|edu|gov|mil|net|org|biz|info|name or any 2 lowercase letters (a-z).



Can anyone help me? Thx
Re: [avdic] text to html conversion In reply to
You could try something like this. It doesn't do everything you need, but if you pick up a book on Perl programming, it should give you a good starting point for customizing it to what you want.

Code:

# First, read the whole file into a single string.
my $buf;
open my $fh, '<', '/path/to/file.txt' or die "Cannot open file: $!";
{
    local $/ = undef;    # slurp mode: read everything in one go
    $buf = <$fh>;
}
close $fh;

# Next, parse the data and wrap each possible URL in an <a href="..."> tag.
$buf =~ s,(
    http://\w+\.(?:\w+\.)*\w+   # first the domain name
    \S*                         # then the rest of it
),<a href="$1">$1</a>,gix;

print $buf;    # output the text with the <a href="..."> links added
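
For the stricter rules in the original question (dot-separated hostname words, a fixed list of endings, an optional path from the given character set), a tighter pattern could look roughly like the sketch below. The name linkify and the set of characters allowed in a hostname word are assumptions; the question does not define either.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap every URL that fits the question's rules in a link.
sub linkify {
    my ($text) = @_;

    # Assumption: a hostname word is made of letters, digits and hyphens.
    my $label = qr/[a-zA-Z0-9-]+/;
    # Allowed endings: the listed names or any two lowercase letters.
    my $tld   = qr/(?:com|edu|gov|mil|net|org|biz|info|name|[a-z]{2})/;
    # Path characters copied from the question; @ and $ are escaped for Perl.
    my $path  = qr{/[-a-zA-Z0-9_:\@&?=+,.!/~*%\$]*};

    $text =~ s{
        (                        # capture the whole URL
            http://
            (?:$label\.)+        # one or more words, each followed by a dot
            $tld                 # the required last part of the hostname
            (?![a-zA-Z0-9-])     # the last part must end here
            (?:$path)?           # the optional path
        )
    }{<a href="$1">$1</a>}gx;

    return $text;
}

print linkify("Try http://www.yahoo.com/ today\n");
```

Because each hostname word needs at least one character before its dot, yahoo..com is left alone. Note that the path character set includes the dot, so a full stop typed directly after a URL that ends in a slash would become part of the link.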

Last edited by:

Aki: Sep 20, 2002, 11:30 AM
Re: [Aki] text to html conversion In reply to
OK, using the above method, how do I keep the paragraphs, tabs and other formatting in the HTML version, so that the text file is printed as HTML with its original layout instead of as one large paragraph?

Right now I am wrapping the output in <pre></pre> tags, but one big problem is that long text lines sometimes extend far past the edge of the screen.

Any suggestions?

thanks

Last edited by:

socrates: Sep 24, 2002, 9:44 PM
Re: [socrates] text to html conversion In reply to
You could try running a quick regex on $buf before it prints, something like this:

Code:
$buf =~ s,\n,<br>,g;
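
If paragraphs and tabs should survive as well, a slightly fuller conversion might look like the sketch below. The name text_to_html and the choice of four non-breaking spaces per tab are assumptions, not anything from the posts above. Unlike <pre>, ordinary <p> text lets the browser wrap long lines.

```perl
use strict;
use warnings;

# Hypothetical helper: turn plain text into simple HTML paragraphs.
sub text_to_html {
    my ($text) = @_;

    # Escape characters that are special in HTML; & has to go first.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;

    $text =~ s/\r\n?/\n/g;                      # normalize line endings
    $text =~ s/\t/&nbsp;&nbsp;&nbsp;&nbsp;/g;   # assumption: one tab = four spaces

    # Blank lines separate paragraphs; single newlines become <br>.
    my @paragraphs = grep { /\S/ } split /\n\s*\n/, $text;
    for my $p (@paragraphs) {
        $p =~ s/\A\n+|\n+\z//g;    # trim stray newlines at either end
        $p =~ s/\n/<br>\n/g;
    }

    return join "\n", map { "<p>$_</p>" } @paragraphs;
}

print text_to_html("First line\nsecond line\n\nNew paragraph\n"), "\n";
```

The escaping step would have to run before any <a href> tags are added, otherwise the tags themselves get escaped. URLs that contain & would then need extra care, since the escaped form includes a semicolon.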
Re: [Aki] text to html conversion In reply to
Thanks, that works great. I thought it would be much more complicated. That solved a nasty issue.
Re: [socrates] text to html conversion In reply to
There's a program called txt2html that I set up a while back and that is still in use. It was pretty decent at converting text pages into HTML, and it handles all sorts of things, from adding links where certain keywords appear to bulleted lists, etc.

http://www.aigeek.com/txt2html/