Catalyst, utf8 in form element type text

 

 


mariusauto-catalyst at kjeldahl

May 4, 2008, 3:09 PM

Post #1 of 8
Catalyst, utf8 in form element type text

I'm having a small problem that I hope somebody has a simple solution
to. I'm using Catalyst with TT for the view, PostgreSQL and everything
set up using utf8 (in perl source "use utf8", in postgres using
"enable_utf8" and in the actual templates containing utf8 encoded
international characters). I've verified that the data stored in postgres
is actually stored correctly (international characters in the postgres
table display correctly in psql, and data pulled from both the database
and templates show international characters fine).

Everything seems to work fine, with one small exception. Whenever I have
an HTML form input type=text with an international character and the form
validation fails, so the default value of the input field contains the
international character, the rest of the HTML document no longer
displays international characters correctly. If I remove the
international character from the input field and resubmit, everything is
displayed correctly again.

I'm guessing the browser detects that the document contains some element
that is not proper utf8, and disables utf8 altogether before displaying
whenever the input field contains an international character.

The input field value is set in the template from the
$c->req->parameters passed in the stash.

So my question is what's the best way to handle this? Can an input value
in a form handle a utf8 encoded string at all, and if so how can I
convince it my string is utf8, and if I do does the browser detect it
automagically?

Any pointers?

Thanks,

Marius K.

_______________________________________________
List: Catalyst [at] lists
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst [at] lists/
Dev site: http://dev.catalyst.perl.org/


pagaltzis at gmx

May 5, 2008, 6:39 AM

Post #2 of 8
Re: Catalyst, utf8 in form element type text

* Marius Kjeldahl <mariusauto-catalyst [at] kjeldahl> [2008-05-05 00:20]:
> Everything seems to work fine

“Seems” being the operative word.

> with one small exception. Whenever I have an HTML form input
> type=text with an international character and the form
> validation fails, so the default value of the input field
> contains the international character, the rest of the HTML
> document no longer displays international characters
> correctly.

That is because all of that was not marked as character data to
begin with. When Perl tries to concatenate it with a Unicode
string, it sees byte strings so it decodes them as Latin-1. Then
all the UTF-8 multibyte characters turn into gremlins.
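
A minimal sketch of that mechanism, assuming a UTF-8 byte string (as from an undecoded source) gets concatenated with a decoded character string. The variable names and sample characters are illustrative only, not taken from the original application:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "e acute" as raw UTF-8 bytes: Perl sees two separate bytes, not one character.
my $bytes = "\xC3\xA9";

# A decoded character string containing the euro sign (U+20AC).
my $chars = "\x{20AC}";

# Concatenating the two upgrades $bytes as if it were Latin-1, so each
# byte becomes its own character (U+00C3 and U+00A9).
my $broken = encode('UTF-8', $chars . $bytes);

# Decoding the bytes first keeps "e acute" as the single character U+00E9.
my $fixed = encode('UTF-8', $chars . decode('UTF-8', $bytes));

printf "broken: %vX\n", $broken;
printf "fixed:  %vX\n", $fixed;
```

In the broken case the two bytes of the original character come out as four bytes, which is what shows up in the browser as gremlins.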

> I'm guessing the browser detects that the document contains
> some element that is not proper utf8, and disables utf8
> altogether before displaying whenever the input field contains
> an international character.

You’re probably wrong about that guess. What headers do you send?

Do you use `<meta http-equiv="Content-Type">`? (Bad idea, btw.)

> If I remove the international character from the input field
> and resubmit, everything is displayed correctly again. […] The
> input field value is set in the template from the
> $c->req->parameters passed in the stash.

Are you using Catalyst::Plugin::Unicode?

> So my question is what's the best way to handle this?

Did you tell Template Toolkit or whatever template engine you use
that the templates are in UTF-8?

> Can an input value in a form handle a utf8 encoded string at
> all

Yes.

> and if so how can I convince it my string is utf8, and if I do
> does the browser detect it automagically?

No, the headers must be set correctly.

> Any pointers?

In addition to the above? Check out encoding::warnings.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>



mariusauto-catalyst at kjeldahl

May 5, 2008, 7:12 AM

Post #3 of 8
Re: Re: Catalyst, utf8 in form element type text - Solved

Problem solved. In my "View" class, like:

package MyApp::View::TT;
use strict; use warnings;
use base 'Catalyst::View::TT';

replace the last line with:

use base 'Catalyst::View::TT::ForceUTF8';

and everything works fine. I guess there was some confusion between
Template Toolkit and non-utf8 stash strings or similar.

Thanks,

Marius K.




moseley at hank

May 5, 2008, 11:28 AM

Post #4 of 8
Re: Re: Catalyst, utf8 in form element type text - Solved

On Mon, May 05, 2008 at 04:12:53PM +0200, Marius Kjeldahl wrote:
> Problem solved. In my "View" class, like:
>
> package MyApp::View::TT;
> use strict; use warnings;
> use base 'Catalyst::View::TT';
>
> replace the last line with:
>
> use base 'Catalyst::View::TT::ForceUTF8';

That seems like the wrong approach.

Data should be decoded on input from the outside and encoded on
output. I'm not sure when it would be advisable to force utf8 flag
on items in the stash, but I have not looked at that module in a
while.
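
A rough sketch of that decode-on-input, encode-on-output pattern in plain Perl, outside of Catalyst. The helpers decode_params and encode_body are hypothetical names made up for illustration, and UTF-8 is assumed at both boundaries:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode every raw parameter value from UTF-8 bytes
# into a Perl character string (the "decode on input" half).
sub decode_params {
    my ($raw) = @_;
    return { map { $_ => decode('UTF-8', $raw->{$_}) } keys %$raw };
}

# Hypothetical helper: encode the finished page once, just before it is
# sent to the client (the "encode on output" half).
sub encode_body {
    my ($body) = @_;
    return encode('UTF-8', $body);
}

my $params = decode_params({ name => "Bj\xC3\xB8rn" });  # raw bytes in
my $page   = "Hello, $params->{name}";                   # work in characters
my $bytes  = encode_body($page);                         # raw bytes out
```

Everything between the two helpers only ever sees character strings, so no byte string gets mixed into the page by accident.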

<form> tags should have accept-charset

C::P::Unicode::Encoding should be used (I suggest with reservations).
That will decode parameters and encode output.

If your templates are UTF-8, then pass ENCODING => 'UTF-8' when
creating the TT object.

Do what's required for your database to handle utf-8.

--
Bill Moseley
moseley [at] hank




mariusauto-catalyst at kjeldahl

May 5, 2008, 12:22 PM

Post #5 of 8
Re: Re: Catalyst, utf8 in form element type text - Solved

Bill Moseley wrote:
>> use base 'Catalyst::View::TT::ForceUTF8';
>
> That seems like the wrong approach.
>
> Data should be decoded on input from the outside and encoded on
> output. I'm not sure when it would be advisable to force utf8 flag
> on items in the stash, but I have not looked at that module in a
> while.
>
> <form> tags should have accept-charset

I tried this but couldn't get it working correctly, which may be
entirely my fault of course.

> C::P::Unicode::Encoding should be used (I suggest with reservations).
> That will decode parameters and encode output.

I looked into this and related modules trying to figure out exactly
where to do what, which led me to the solution posted.

> If your templates are UTF8 then ENCODING => 'UTF-8' when creating TT
> object.

Tried this as well. Didn't work. As far as I managed to figure out, that
solution requires the plugin you mentioned, or a similar one (possibly
ending in Encode instead of Encoding - I'm taking this from memory while
googling for a solution to my problem).

> Do what's required for your database to handle utf-8.

In my case, everything is utf8: the source code (with embedded strings)
and the database. I see no reason to start juggling back and forth
between encodings unless there is a specific need. There may be one,
which I'm sure further testing will demonstrate, but for now I'm ok.

Actually, I found one place where it was needed already. I'm
using some of the Yahoo YUI "ajax" components which didn't work great
with utf8, and a simple "decode" (from utf8) before returning some
values in an ajax component seemed to solve it. There may be flags
that can be set in the YUI library which enable utf8 encoding also,
which would probably be a better solution.

Thanks,

Marius K.



moseley at hank

May 5, 2008, 12:46 PM

Post #6 of 8
Re: Re: Catalyst, utf8 in form element type text - Solved

On Mon, May 05, 2008 at 09:22:19PM +0200, Marius Kjeldahl wrote:
> ><form> tags should have accept-charset
>
> I tried this but couldn't get it working correctly, which may be
> entirely my fault of course.

What does "couldn't get it working" mean? You couldn't get an
accept-charset on your form tags?


> >C::P::Unicode::Encoding should be used (I suggest with reservations).
> >That will decode parameters and encode output.
>
> I looked into this and related modules trying to figure out exactly
> where to do what, which led me to the solution posted.

It's just a plugin. You add it to the use Catalyst list of
plugins. It only decodes $c->req->parameters (failing to decode
body_parameters, btw) and then encodes the $c->res->body in
finalize().


> >If your templates are UTF8 then ENCODING => 'UTF-8' when creating TT
> >object.
>
> Tried this as well. Didn't work. As far as I managed to figure out, that
> solution requires the plugin you mentioned, or a similar one (possibly
> ending in Encode instead of Encoding - I'm taking this from memory while
> googling for a solution to my problem).

Again, not sure what "didn't work" means, but it doesn't require any
other modules -- it just says your templates should be decoded as the
encoding you specify:

perldoc -m Template::Provider

search for ENCODING
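
As a sketch only, assuming Catalyst::View::TT hands its configuration through to Template Toolkit, the option could be set in the view class quoted earlier in the thread roughly like this (MyApp::View::TT is the class name from that post; nothing else here comes from the original application):

```perl
package MyApp::View::TT;
use strict;
use warnings;
use base 'Catalyst::View::TT';

# Assumption: this config is passed on when the TT object is created, so
# template files are decoded from UTF-8 into character strings on load.
__PACKAGE__->config(
    ENCODING => 'UTF-8',
);

1;
```

This only covers how the template files themselves are read. Request parameters and the response body still need to be decoded and encoded separately.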


--
Bill Moseley
moseley [at] hank




pagaltzis at gmx

May 6, 2008, 12:04 AM

Post #7 of 8
Re: Catalyst, utf8 in form element type text - Solved

* Bill Moseley <moseley [at] hank> [2008-05-05 21:40]:
> <form> tags should have accept-charset

Browsers tend to ignore that and send the form data in the same
encoding as the page that the form was on. Some browsers also do
other screwy things. Overall this is an area of much hatefulness.
For best results, <http://search.cpan.org/perldoc?Encode::HEBCI>
is the way to go. But most of the time it’s overkill, since once
you get your pages to be served as UTF-8 properly, you can pretty
much forget the issue.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>



moseley at hank

May 6, 2008, 9:43 AM

Post #8 of 8
Re: Re: Catalyst, utf8 in form element type text - Solved

On Tue, May 06, 2008 at 09:04:38AM +0200, Aristotle Pagaltzis wrote:
> * Bill Moseley <moseley [at] hank> [2008-05-05 21:40]:
> > <form> tags should have accept-charset
>
> Browsers tend to ignore that and send the form data in the same
> encoding as the page that the form was on.

"Browsers" is a bit general.

Yes, IE will use the HTTP Content-Type header over accept-charset in
the <form> tag (and over any <meta> tag as well).

Firefox 2 will use accept-charset (even if it's different from the
HTTP charset). So, it's good to have an accept-charset and make sure
it matches the page's Content-Type charset.

At least, that's how I remember it.


> other screwy things. Overall this is an area of much hatefulness.
> For best results, <http://search.cpan.org/perldoc?Encode::HEBCI>
> is the way to go. But most of the time it’s overkill, since once
> you get your pages to be served as UTF-8 properly, you can pretty
> much forget the issue.

That's what I do -- I set the Content-Type, meta http-equiv,
and accept-charset on the form all to utf-8. Any browser that screws
that up likely isn't supported in other ways, too.

--
Bill Moseley
moseley [at] hank

