Gossamer Forum

cgi-question: html->perl text input

Quote Reply
cgi-question: html->perl text input
when i send the content of a text input field from a form on an HTML page, the string that reaches my perl script contains '+' instead of ' ', and characters like exclamation and question marks come through garbled. How do i avoid that, or transform the special character codes back into what was originally entered?
Thanks a lot for your help.

b.
Quote Reply
Re: [bartb] cgi-question: html->perl text input In reply to
You should set the character set of the HTML page; that will likely solve your problem.

Best regards,
Webmaster33


Paid Support
from Webmaster33. Expert in Perl programming & Gossamer Threads applications. (click here for prices)
Webmaster33's products (upd.2004.09.26) | Private message | Contact me | Was my post helpful? Donate my help...
Quote Reply
Re: [webmaster33] cgi-question: html->perl text input In reply to
hmm, i tried that, and it does not seem to change things.

If you have time, i can show you what i did.

I input the text here:

http://retina.anatomy.upenn.edu/~bart/B_running/PTC_attendance/attendance_test.html

if you view source, you can see that i just added a meta tag to set the character set. the cgi script prints the query string on a new html site, which also contains the meta tag.

Am i making some trivial mistake here? Thanks very much for your help - i have programmed in C a lot, but am new to CGI scripting. I like it a lot, because i can see the possibilities, but it takes a while to get fluent. i really appreciate your help.
Quote Reply
Re: [bartb] cgi-question: html->perl text input In reply to
well, i guess this is just the way HTTP does it.
i will pass each input through a battery of replacing functions (like this for putting in the commas: $arg_value =~ s/\%2C/\,/g;).

that is my best solution for now. I thought there might be a standard function or script for this - i am sure i am not the first person to run into this. let he who knows teach me, for now, thanks for your reply.

b.
Quote Reply
Re: [bartb] cgi-question: html->perl text input In reply to
Oh, I think I know what your problem is.
You mean that the space is encoded as %20 or +.

All form input sent through CGI is URL-encoded, just like in URLs. If you want to display it, decode it first.
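
To make the encoding step concrete, here is a minimal sketch of roughly what the browser does to a form value before sending it. The helper name form_encode is hypothetical (it is not part of any module mentioned in this thread), and the sketch assumes plain ASCII input:

```perl
use strict;
use warnings;

# form_encode is a hypothetical helper, written only to illustrate the
# encoding; it assumes ASCII input.
sub form_encode {
    my ($text) = @_;
    # Percent-encode everything except letters, digits, - _ . * and space.
    $text =~ s/([^A-Za-z0-9\-_.* ])/sprintf('%%%02X', ord($1))/ge;
    # Spaces are sent as plus signs.
    $text =~ tr/ /+/;
    return $text;
}

print form_encode('Hello, world!?'), "\n";   # Hello%2C+world%21%3F
print form_encode('1+1'), "\n";              # 1%2B1
```

A literal plus sign is sent as %2B, which is why a '+' in the received string can safely be read as a space.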

BTW: your code would be helpful in helping you...

Best regards,
Webmaster33


Quote Reply
Re: [webmaster33] cgi-question: html->perl text input In reply to
'If you want to display it, you must decode it' -> i realized that is just a fact of life in the HTTP world. I put in this block, which takes care of most of it:

$descr =~ s/\+/\ /g;
$descr =~ s/\%2B/\+/g;
$descr =~ s/\%2C/\,/g;
$descr =~ s/\%3F/\?/g;
$descr =~ s/\%21/\!/g;
$descr =~ s/\%3A/\:/g;
$descr =~ s/\%3B/\;/g;
$descr =~ s/\%28/\(/g;
$descr =~ s/\%29/\)/g;

i am on my way. thanks very much again.
Quote Reply
Re: [bartb] cgi-question: html->perl text input In reply to
There is a shorter solution.
Just search for "url decode perl" in Google...


Code:
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/seg;

Best regards,
Webmaster33


Quote Reply
Re: [webmaster33] cgi-question: html->perl text input In reply to
aah! that's what i was after.
Quote Reply
Re: [webmaster33] cgi-question: html->perl text input In reply to
In Reply To:
There is a shorter solution.
Just search for "url decode perl" in Google...


Code:
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/seg;


There is a safer solution. Use the CGI module which will handle that stuff for you at only minimal cost.
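
For illustration, a minimal sketch of that approach, assuming CGI.pm is installed (it no longer ships with recent Perl releases) and that the form field is named descr, which is a made-up name. In a real CGI script you would call CGI->new with no arguments; the query string is passed in here only so the sketch runs from the command line:

```perl
use strict;
use warnings;
use CGI;

# Passing a query string to new() is only for demonstration;
# a real script would call CGI->new() and read the request itself.
my $q = CGI->new('descr=Hello%2C+world%21');

# param() returns the value already decoded ('descr' is an assumed field name).
my $descr = $q->param('descr');
print "$descr\n";   # Hello, world!
```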

Last edited by:

mkp: Feb 26, 2006, 6:42 AM
Quote Reply
Re: [mkp] cgi-question: html->perl text input In reply to
Why is it not safe?
Do you know how slow CGI.pm is?

Best regards,
Webmaster33


Quote Reply
Re: [webmaster33] cgi-question: html->perl text input In reply to
Yes I do know how slow CGI is. I also know that, if I were rewriting CGI, I'd write it very differently. But the loss is minimal compared with the benefits of using it. (Even though I don't like it very much, I still find myself lazing out and using it.)

Using CGI is safer (especially for new Perl users) because CGI already works. If your script does not work, it's not because of CGI but it might be because of whatever alternative you use.
Quote Reply
Re: [mkp] cgi-question: html->perl text input In reply to
This is the code of the simple_escape and unescape functions in the CGI::Util module:
Code:
sub simple_escape {
    return unless defined(my $toencode = shift);
    $toencode =~ s{&}{&amp;}gso;
    $toencode =~ s{<}{&lt;}gso;
    $toencode =~ s{>}{&gt;}gso;
    $toencode =~ s{\"}{&quot;}gso;
    # Doesn't work. Can't work. forget it.
    # $toencode =~ s{\x8b}{&#139;}gso;
    # $toencode =~ s{\x9b}{&#155;}gso;
    $toencode;
}

# unescape URL-encoded data
sub unescape {
    shift() if @_ > 1 and (ref($_[0]) || (defined $_[1] && $_[0] eq $CGI::DefaultClass));
    my $todecode = shift;
    return undef unless defined($todecode);
    $todecode =~ tr/+/ /; # pluses become spaces
    $EBCDIC = "\t" ne "\011";
    if ($EBCDIC) {
        $todecode =~ s/%([0-9a-fA-F]{2})/chr $A2E[hex($1)]/ge;
    } else {
        $todecode =~ s/%(?:([0-9a-fA-F]{2})|u([0-9a-fA-F]{4}))/
            defined($1)? chr hex($1) : utf8_chr(hex($2))/ge;
    }
    return $todecode;
}

As you can see, simple_escape() is just a series of plain substitutions, much like the block Bartb used (although it escapes HTML entities rather than decoding URL escapes).
The unescape() function is nothing special either. It additionally supports EBCDIC and %uXXXX (UTF-8) escapes, but that is all.

The only lines that need to be compared are:
Code:
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/seg;
and
$todecode =~ tr/+/ /; # pluses become spaces
$todecode =~ s/%(?:([0-9a-fA-F]{2})|u([0-9a-fA-F]{4}))/
defined($1)? chr hex($1) : utf8_chr(hex($2))/ge;

The only thing missing from the original decoding code is the translation of the + sign.
So the following code should be right:
Code:
$value =~ tr/+/ /; # pluses become spaces
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/seg;
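
Wrapped into subroutines, the two lines above might be used like this. This is only a sketch: url_decode and parse_query are hypothetical helper names, the field names are made up, and repeated field names simply keep the last value. The order matters, because a literal plus sign arrives as %2B and must not be turned into a space:

```perl
use strict;
use warnings;

# Decode one URL-encoded value (hypothetical helper).
sub url_decode {
    my ($value) = @_;
    return undef unless defined $value;
    $value =~ tr/+/ /;                               # pluses become spaces first
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # then each %XX becomes one byte
    return $value;
}

# Split a raw query string into a hash of decoded names and values
# (hypothetical helper; a repeated name keeps only its last value).
sub parse_query {
    my ($query) = @_;
    my %param;
    for my $pair (split /[&;]/, $query) {
        my ($name, $val) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $param{ url_decode($name) } = url_decode(defined $val ? $val : '');
    }
    return %param;
}

my %form = parse_query('descr=Hello%2C+world%21&sum=1%2B1');
print "$form{descr}\n";   # Hello, world!
print "$form{sum}\n";     # 1+1
```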


Again.
Why use the whole CGI.pm module, which gives you a lot of overhead, when you can do the same thing with a pretty short piece of code?
Using CGI::Util alone would already be better, since only that module is involved and not the whole CGI module.

But still unnecessary, when a regexp solves the problem...


EDIT: in general, I think we agree that the CGI module is good for beginners. It saves them a lot of programming. But when performance becomes important, the CGI module is no longer useful.
Additionally, IMHO, when you depend on a module like CGI and later want to drop it for performance reasons, it will likely take a lot of additional work to replace all CGI-related occurrences. I have already faced this situation, and I can tell you, it took a lot of development time to remove CGI completely.

Best regards,
Webmaster33



Last edited by:

webmaster33: Mar 2, 2006, 3:57 AM