
Mailing List Archive: Catalyst: Users

Unicode trouble with Catalyst::Engine::FastCGI

 

 



catalyst4 at augensalat

Nov 23, 2009, 3:57 AM

Post #1 of 14 (3315 views)
Unicode trouble with Catalyst::Engine::FastCGI

After I recently re-installed my development Perl (the one I use
separately from the production Perl installation), all pages that went
through Catalyst::Engine::FastCGI came out double-UTF-8-encoded.

When using the standalone HTTP server, everything is fine.

Both installations use Perl 5.10.1. I first took a snapshot of the old
modules and then re-installed everything on the new installation.

Googling around I found
http://www.mail-archive.com/catalyst [at] lists/msg00051.html ,
but I couldn't verify Jonathan's assumption in my case - even before the
buffer is written into the FastCGI pipe in
Catalyst::Engine::FastCGI::write the whole buffer (header and body)
looked fine right before calling STDOUT->syswrite.

Is there anyone here who has experienced similar effects and maybe
found a solution (besides kicking out the FastCGI engine)?

In the meantime I'm proxying from the main webserver to C:E:HTTP::Prefork,
which works fine, but I would prefer to keep FastCGI as an alternative.

Bernhard Graf

_______________________________________________
List: Catalyst [at] lists
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst [at] lists/
Dev site: http://dev.catalyst.perl.org/


catalyst4 at augensalat

Nov 23, 2009, 10:06 AM

Post #2 of 14 (3197 views)
Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

Good news, everyone!
I found the root of the problem and some places that should be fixed.

> Googling around I found
> http://www.mail-archive.com/catalyst [at] lists/msg00051.html ,
> but I couldn't verify Jonathan's assumption in my case - even before the
> buffer is written into the FastCGI pipe in
> Catalyst::Engine::FastCGI::write the whole buffer (header and body)
> looked fine right before calling STDOUT->syswrite.

Since then I have realized that the final output buffer (header + body)
actually /has/ the UTF-8 flag set. So it seems that Jonathan's idea
(link above) also applies to my case.
Everything seems fine again when I insert this line

utf8::downgrade($buffer) if utf8::is_utf8($buffer);

into Catalyst::Engine::FastCGI::write() before
*STDOUT->syswrite($buffer).

While this fixes the problem, it is still unclear why the utf8 flag is
set for the whole buffer.

Since the body has already been turned into a clean octet stream by
C:P:Unicode, the header must have been flagged as utf8 - and this was
actually the case. Digging further I found out that only the cookie
string had the utf8 flag. The cookie is built with CGI::Simple::Cookie
from a hash, and in this hash only one value had the utf8 flag:
$cookie->{domain}. And this comes from myapp.yml, which is a YAML file,
of course.
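
How a single flagged header value ends up flagging the whole response can
be reproduced without Catalyst. This is only a minimal sketch, not the
engine's actual code: the cookie header and the domain value are made up,
and utf8::upgrade merely simulates a value that arrives from the config
with the utf8 flag set.

```perl
use strict;
use warnings;
use Encode ();

# Body: already encoded to UTF-8 octets, as the Unicode plugin is expected to deliver it.
my $body = Encode::encode('UTF-8', "\x{e4}");    # a-umlaut as two octets, C3 A4

# Hypothetical cookie domain; utf8::upgrade simulates a config value
# that arrives with the utf8 flag set.
my $domain = 'example.org';
utf8::upgrade($domain);

my $header = "Set-Cookie: sid=1; domain=$domain\r\n\r\n";
my $buffer = $header . $body;

# One flagged piece is enough to flag the concatenated result.
my $flagged = utf8::is_utf8($buffer) ? 1 : 0;

# Seen as characters, the buffer has the expected length ...
my $chars = length $buffer;

# ... but internally every body octet >= 0x80 is now stored as two bytes.
# Code that writes the internal bytes without checking the flag would
# therefore emit the body double-encoded.
my $internal = do { use bytes; length $buffer };

printf "flagged=%d chars=%d internal_bytes=%d\n", $flagged, $chars, $internal;
```

The internal byte count exceeds the character count by one for each body
octet above 0x7F, which matches the double encoding seen on the wire.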

So, ladies and gentlemen, may I present the culprit?
It is YAML::XS! Everything read by YAML::XS has the utf8 flag set:

perl -MYAML::XS -E '
my $config = YAML::XS::LoadFile("myapp.yml");
say((utf8::is_utf8($config->{name}) ? "is" : "is not"), " utf8");
'
is utf8

On the old installation YAML::XS was not installed. Actually I installed
that module right after I re-installed everything from the snapshot.

perl -MYAML::Syck -E '
my $config = YAML::Syck::LoadFile("myapp.yml");
say((utf8::is_utf8($config->{name}) ? "is" : "is not"), " utf8");
'
is not utf8

So my first step is to kick YAML::XS. KICK, KICK, KICK!!!
Restart Cat-App - now it tells me:

Use of YAML::Syck or YAML to parse config files is DEPRECATED. Please
install YAML::XS for proper YAML support at
/opt/perl/5.10.1/lib/site_perl/Config/Any.pm line 198

But my output is fixed! WoohooOO!

So first of all YAML::XS must be fixed (I'll file a bug report now).

Also Catalyst::Engine(::FastCGI) should ensure that this implicit
utf8::upgrade does not happen when the body has already been encoded into
an octet stream. But I'm not sure if utf8::downgrade() is the right
approach to fix this, because other Cat engines are not affected by this
utf8-flag bug, and those don't explicitly utf8::downgrade the buffer either.


Bernhard Graf




pagaltzis at gmx

Nov 23, 2009, 10:25 AM

Post #3 of 14 (3195 views)
Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

* Bernhard Graf <catalyst4 [at] augensalat> [2009-11-23 19:10]:
> Meanwhile I realized, that the final output buffer (header
> + body) actually /has/ the UTF-8 flag set. So it seems, that
> Jonathan's idea (above link) also matches my case. Everything
> seems fine again, when I insert this line
>
> utf8::downgrade($buffer) if utf8::is_utf8($buffer);
>
> into Catalyst::Engine::FastCGI::write() before
> *STDOUT->syswrite($buffer).

The conditional is superfluous. If you downgrade a string that
doesn’t need downgrading, it’s a no-op.
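
A quick sketch to illustrate this, with arbitrary sample strings that have
nothing to do with the engine code:

```perl
use strict;
use warnings;

# Byte string with the utf8 flag off: nothing to downgrade.
my $plain    = "abc\xE4";
my $ok_plain = utf8::downgrade($plain);      # still returns true

# Same characters, but stored with the utf8 flag on.
my $flagged = "abc\xE4";
utf8::upgrade($flagged);
my $ok_flagged = utf8::downgrade($flagged);  # converted back in place

printf "plain:   ok=%d flag=%d\n", ($ok_plain   ? 1 : 0), (utf8::is_utf8($plain)   ? 1 : 0);
printf "flagged: ok=%d flag=%d\n", ($ok_flagged ? 1 : 0), (utf8::is_utf8($flagged) ? 1 : 0);
```

Both calls succeed and both strings end up unflagged, so the unconditional
call behaves the same as the guarded one.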

> While this fixes the problem, it is still unclear, why the utf8
> flag is set for the whole buffer.

It shouldn’t matter.

> Since the body has already be turned into a clean octet-stream
> by C:P:Unicode, the header must have been flagged as utf8 - and
> this was actually the case. Digging further I found out, that
> only the cookie string had the utf8 flag. The cookie is built
> with CGI::Simple::Cookie from a hash, and in this hash only one
> value had the utf8 flag: $cookie->{domain}. And this comes from
> myapp.yml, which is a YAML file, of course.
>
> So Ladies and Gentleman, may I present you the culprit? It is
> YAML::XS! Everything read by YAML::XS
>
> perl -MYAML::XS -E '
> my $config = YAML::XS::LoadFile("myapp.yml");
> say((utf8::is_utf8($config->{name}) ? "is" : "is not"), " utf8");
> '
> is utf8

No, that’s not the culprit.

The culprit is Catalyst::Engine::FastCGI, which does not pay
attention to the UTF8 flag.

> On the old installation YAML::XS was not installed. Actually
> I installed that module right after I re-installed everything
> from the snapshot.
>
> perl -MYAML::Syck -E '
> my $config = YAML::Syck::LoadFile("myapp.yml");
> say((utf8::is_utf8($config->{name}) ? "is" : "is not"), " utf8");
> '
> is not utf8
>
> So my first step is to kick YAML::XS. KICK, KICK, KICK!!!
> Restart Cat-App - now it tells me:
>
> Use of YAML::Syck or YAML to parse config files is DEPRECATED. Please
> install YAML::XS for proper YAML support at
> /opt/perl/5.10.1/lib/site_perl/Config/Any.pm line 198
>
> But my output is fixed! WoohooOO!
>
> So first of all YAML::XS must be fixed (I'll file a bug report now).

No. YAML::XS is completely correct.

> Also Catalyst::Engine(::FastCGI) should assure, that this
> implicit ut8::upgrade does not happen when the body has already
> been encoded into an octet-stream.

Nope.

> But I'm not sure if utf8::downgrade() is the right approach to
> fix this, because other Cat engines are not affected by this
> utf8-flag bug, and those don't explicitly utf8::downgrade the
> buffer either.

I’m not sure why ::FastCGI is affected specifically. Whether
a string is upgraded or not should make no difference, though. In
both cases the output should be the same. If it’s not, then
something is broken in ::FastCGI.
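
To illustrate what I mean by "no difference", here is a small sketch using
an ordinary byte-oriented perl filehandle (an in-memory one, opened with
:raw so that no encoding layer is involved). The sample string is
arbitrary, and this says nothing about what FCGI does internally.

```perl
use strict;
use warnings;

my $octets   = "\xC3\xA4";    # UTF-8 encoding of a-umlaut, utf8 flag off
my $upgraded = $octets;
utf8::upgrade($upgraded);     # same string value, utf8 flag on

# Print both copies through byte-oriented in-memory handles.
my ($out_plain, $out_upgraded) = ('', '');
open my $fh_a, '>:raw', \$out_plain    or die "open: $!";
open my $fh_b, '>:raw', \$out_upgraded or die "open: $!";
print {$fh_a} $octets;
print {$fh_b} $upgraded;
close $fh_a;
close $fh_b;

my $same_value  = $octets    eq $upgraded     ? 1 : 0;
my $same_output = $out_plain eq $out_upgraded ? 1 : 0;
print "same value: $same_value, same output: $same_output\n";
```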

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>



catalyst4 at augensalat

Nov 23, 2009, 10:53 AM

Post #4 of 14 (3203 views)
Re: Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

Aristotle Pagaltzis schrieb:

>> While this fixes the problem, it is still unclear, why the utf8
>> flag is set for the whole buffer.
>
> It shouldn’t matter.

But it does.

>> So Ladies and Gentleman, may I present you the culprit? It is
>> YAML::XS! Everything read by YAML::XS
>>
>> perl -MYAML::XS -E '
>> my $config = YAML::XS::LoadFile("myapp.yml");
>> say((utf8::is_utf8($config->{name}) ? "is" : "is not"), " utf8");
>> '
>> is utf8
>
> No, that’s not the culprit.
>
> The culprit is Catalyst::Engine::FastCGI, which does not pay
> attention to the UTF8 flag.

Obviously YAML::XS doesn't do that either.

But aside from that I agree with you that something is broken in
F(ast)CGI. It seems more likely that F(ast)CGI pays attention to the utf8
flag where it shouldn't, because it appears to automatically utf8::encode
the buffer when the utf8 flag is set.

> No. YAML::XS is completely correct.

No, it is not. Because it claims:

This module exports the functions Dump and Load. These functions are
intended to work exactly like YAML.pm's corresponding functions.

And that's not the case. Therefore you can't use it as a
drop-in replacement for other YAML modules.


Bernhard Graf



pagaltzis at gmx

Nov 23, 2009, 1:15 PM

Post #5 of 14 (3203 views)
Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

Hi Bernhard,

* Bernhard Graf <catalyst4 [at] augensalat> [2009-11-23 20:00]:
> Aristotle Pagaltzis schrieb:
>
>>> While this fixes the problem, it is still unclear, why the
>>> utf8 flag is set for the whole buffer.
>>
>> It shouldn’t matter.
>
> But it does.

Yes, because ::FastCGI is broken. :-) That is what I’m saying.

>>> So Ladies and Gentleman, may I present you the culprit? It
>>> is YAML::XS! Everything read by YAML::XS
>>>
>>> perl -MYAML::XS -E '
>>> my $config = YAML::XS::LoadFile("myapp.yml");
>>> say((utf8::is_utf8($config->{name}) ? "is" : "is not"), " utf8");
>>> '
>>> is utf8
>>
>> No, that’s not the culprit.
>>
>> The culprit is Catalyst::Engine::FastCGI, which does not pay
>> attention to the UTF8 flag.
>
> Obviously YAML::XS doesn't do that either.

It does. It sets the flag to correctly reflect the state of the
internal byte buffer of the scalar so that its string value will
mean the right thing.

> But aside from that I agree with you, that something ist broken
> in F(ast)CGI. It seems more that F(ast)CGI pays attention to
> the utf8 flag where it shouldn't, because it seems to
> automatically utf8::encode the buffer due to the set utf8 flag.

Ah, that may be, yeah, it would produce the result you are
seeing, and would clearly be broken. C::E::FastCGI should just
output the string as it is and let perl worry about its meaning,
rather than (if that’s what it does) checking the UTF8 flag
itself and doing something broken in response.

> > No. YAML::XS is completely correct.
>
> No, it is not. Because it claims:
>
> This module exports the functions Dump and Load. These
> functions are intended to work exactly like YAML.pm's
> corresponding functions.
>
> And that's not the case. And therefore you can't use it as
> a drop-in-replacement for other YAML modules.

Hmm. I guess it depends on whether you think of “drop-in” as
something that replicates behaviour down to internal
implementation details, or as something that just functions
identically on the semantic level. When I see a module advertise
itself like that I assume only the latter.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>



bobtfish at bobtfish

Nov 23, 2009, 2:23 PM

Post #6 of 14 (3191 views)
Re: Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

On 23 Nov 2009, at 21:15, Aristotle Pagaltzis wrote:
> * Bernhard Graf <catalyst4 [at] augensalat> [2009-11-23 20:00]:
>> Aristotle Pagaltzis schrieb:
>>
>>>> While this fixes the problem, it is still unclear, why the
>>>> utf8 flag is set for the whole buffer.
>>>
>>> It shouldn’t matter.
>>
>> But it does.
>
> yes, because ::FastCGI is broken. :-) Is what I’m saying.

The FastCGI engine doesn't do anything with encoding/utf8, and neither
does FCGI.pm.

This probably means (I'm guessing) that the xs part of FCGI doesn't
correctly handle buffers which are characters rather than bytes.

We can get this fixed, but not without test cases, and I'm struggling
to reproduce the issue here.

Help?

Cheers
t0m





catalyst4 at augensalat

Nov 24, 2009, 4:32 AM

Post #7 of 14 (3177 views)
Re: Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

Tomas Doran schrieb:

> The FastCGI engine doesn't do anything with encoding/uft8, and nor does
> FCGI.pm
>
> This probably means (I'm guessing) that the xs part of FCGI doesn't
> correctly handle buffers which are characters rather than bytes.

I wrote a test program to rule out any side effects.
It is at http://scsys.co.uk:8001/36569 and it shows (using option -u)
that FCGI indeed encodes all data as soon as the utf-8 flag is on.

This is very interesting, because the latest FCGI release is nearly
seven years old. Did the utf-8 flag exist at that time at all?
Is there something in Perl's guts that does such encoding automatically?

> We can get this fixed, but not without test cases, and I'm struggling to
> reproduce the issue here.
>
> Help?

I know hardly anything about Perl XS code. Maybe someone with a clue
could have a look at FCGI.xs^H^HXL?

As a quick fix, we could utf8::downgrade($buffer) in
Catalyst::Engine::FastCGI right before syswrite. Doesn't hurt, as far as
I understand.


Bernhard



pagaltzis at gmx

Nov 24, 2009, 8:41 AM

Post #8 of 14 (3164 views)
Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

* Tomas Doran <bobtfish [at] bobtfish> [2009-11-23 23:20]:
> This probably means (I'm guessing) that the xs part of FCGI
> doesn't correctly handle buffers which are characters rather
> than bytes.

That was my initial guess.

> We can get this fixed, but not without test cases, and I'm
> struggling to reproduce the issue here.
>
> Help?

Glad that Bernhard stepped up because I’ve never used FCGI. :-)


* Bernhard Graf <catalyst4 [at] augensalat> [2009-11-24 13:40]:
> As a quick fix, we could utf8::downgrade($buffer) in
> Catalyst::Engine::FastCGI right before syswrite. Doesn't hurt,
> as far as I understand.

It wouldn’t even necessarily be a quick fix in this case, IMO. It
would be preferable to fix it in FCGI, but if the maintainer is
gone and a new version is not likely to happen, then downgrading
in the FastCGI engine would be the correct and proper fix. Any
fix in the XS code would effectively imply a transient downgrade
anyway. If the body string represents an encoded octet sequence,
then downgrading will always succeed, as well.

(If it’s not downgradeable, then the engine should probably throw
an exception, as mentioned in the other thread.)
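
Roughly what I have in mind, as a sketch only. The helper name
octets_or_die and the error message are made up for illustration and are
not part of Catalyst; the only real API used is utf8::downgrade with its
FAIL_OK argument.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Catalyst: return the buffer as plain
# octets, or die if it contains characters above 0xFF.
sub octets_or_die {
    my ($buffer) = @_;    # work on a copy, leave the caller's string alone

    # With a true second argument (FAIL_OK), utf8::downgrade returns false
    # instead of dying, so we can raise our own error.
    utf8::downgrade($buffer, 1)
        or die "response buffer contains wide characters; encode it first\n";
    return $buffer;
}

# Encoded octets that merely carry the utf8 flag: downgrading succeeds.
my $encoded = "caf\xC3\xA9";
utf8::upgrade($encoded);
my $out = octets_or_die($encoded);

# A real character string with a code point above 0xFF: not downgradeable.
my $wide = "snowman \x{2603}";
my $err  = eval { octets_or_die($wide); 1 } ? '' : $@;

print "octets: ", length($out), "\n";
print "error: $err";
```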

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>



m.blackman at fairfx

Jan 5, 2010, 4:32 AM

Post #9 of 14 (2645 views)
Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

On 23 Nov 2009, at 11:57, Bernhard Graf wrote:

> After I recently re-installed my Development-Perl (that one, that I use
> apart from the production Perl installation), all pages that went
> through Catalyst::Engine::FastCGI got double-utf8-encoded.
>
> When using the standalone HTTP server, everything is fine.
>
> Both installations used Perl 5.10.1. I did a snapshot of the old modules
> at first and then re-installed everything on the new installation.
>
> Googling around I found
> http://www.mail-archive.com/catalyst [at] lists/msg00051.html ,
> but I couldn't verify Jonathan's assumption in my case - even before the
> buffer is written into the FastCGI pipe in
> Catalyst::Engine::FastCGI::write the whole buffer (header and body)
> looked fine right before calling STDOUT->syswrite.
>
> Is there anyone here, who has experienced similar effects and maybe
> found a solution (besides kicking out the FastCGI engine)?

In our case, we think the implicit concatenation in our templates
of both encoded (utf8 off) and decoded (utf8 on) data led to this
result with FastCGI and Catalyst::Plugin::Unicode (now discouraged, I see).

The "cure" was to use the DBD::Pg specific pg_enable_utf8 attribute
to persuade DBD::Pg to return decoded strings instead of the
default encoded ones. We are also running our Perl binary with
the '-CSD' flags, so that all inputs and outputs are regarded
as UTF-8 encoded, but I'm not convinced this step is particularly
necessary.
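
For illustration only, here is the kind of mixing we mean, reduced to a few
lines. The strings are made up and this is not our template code; the
"encoded" value stands in for what a database driver returns by default.

```perl
use strict;
use warnings;
use Encode ();

my $decoded = "Gr\x{fc}\x{df}e";                       # character string
my $encoded = Encode::encode('UTF-8', "M\x{fc}ller");  # UTF-8 octets

# Mixed: the octets are treated as Latin-1 characters when concatenated,
# so the final encode step encodes them a second time.
my $mixed = Encode::encode('UTF-8', "$decoded, $encoded");

# Consistent: decode at the boundary, encode once on output.
my $clean = Encode::encode('UTF-8',
    "$decoded, " . Encode::decode('UTF-8', $encoded));

printf "mixed: %d octets, clean: %d octets\n", length $mixed, length $clean;
```

The mixed result is longer because the two octets of the encoded u-umlaut
are each encoded again.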

- Mark



neo at gothic-chat

Jan 22, 2010, 5:38 AM

Post #10 of 14 (2344 views)
Re: Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

Thanks for this thread! I had the same problem with double encoding
while running Catalyst with FastCGI (even more strange - only on newer
applications on the same server).

Bernhard Graf's proposal for a quick fix in
http://www.mail-archive.com/catalyst [at] lists/msg08401.html
actually worked for me.


Regards,
Neo



bobtfish at bobtfish

Jan 22, 2010, 5:48 AM

Post #11 of 14 (2341 views)
Re: Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

Neo [GC] wrote:
> Thanks for this thread! I had the same problem with double encoding
> while running Catalyst with FastCGI (even more strange - only on newer
> applications on the same server).
>
> Bernhard Grafs proposal for a quick fix in
> http://www.mail-archive.com/catalyst [at] lists/msg08401.html
> actually worked for me.

You would be wanting this:

http://search.cpan.org/CPAN/authors/id/M/MS/MSTROUT/FCGI-0.68_02.tar.gz

which fixes the issue properly. :)

If people could test and shout up if it works for them, that'd be
appreciated!

Cheers
t0m



catalyst4 at augensalat

Jan 22, 2010, 6:01 AM

Post #12 of 14 (2347 views)
Re: Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

Tomas Doran schrieb:

> http://search.cpan.org/CPAN/authors/id/M/MS/MSTROUT/FCGI-0.68_02.tar.gz
>
> which fixes the issue properly. :)
>
> If people could test and shout up if it works for them, that'd be
> appreciated!

I did install it recently and everything seems to work correctly, though
ATM I'm using the proxy-HTTP::Prefork combo for my project.

Bernhard Graf




rod.taylor at gmail

Jan 25, 2010, 8:28 PM

Post #13 of 14 (2220 views)
Re: Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

0.6802 does not seem to work for me.


I have some Unicode text in a PostgreSQL database. It is extracted
properly with the utf8 flag on (checked) and renders properly into the
template (saving $c->response->body() to disk and serving the static
version with Apache displays perfectly). It also works as expected when
served via "Server" instead of "FastCGI".

The headers are slightly different (charset=utf-8 vs charset=UTF-8)
but Firefox takes both as being UTF-8.

With the new FastCGI module, the UTF-8 characters in the content are double-encoded.

Note, I'm stuck on Apache 1.3.41.


I'm also using Unicode::Encoding (encoding => 'UTF-8'), View::TT2
(ENCODING => 'UTF-8'), and FormFu ({ constructor => { tt_args => {
ENCODING => 'UTF-8' }, }, }).

PostgreSQL connects with pg_enable_utf8 => 1.




On Fri, Jan 22, 2010 at 08:48, Tomas Doran <bobtfish [at] bobtfish> wrote:
> Neo [GC] wrote:
>>
>> Thanks for this thread! I had the same problem with double encoding while
>> running Catalyst with FastCGI (even more strange - only on newer
>> applications on the same server).
>>
>> Bernhard Grafs proposal for a quick fix in
>> http://www.mail-archive.com/catalyst [at] lists/msg08401.html
>> actually worked for me.
>
> You would be wanting this:
>
> http://search.cpan.org/CPAN/authors/id/M/MS/MSTROUT/FCGI-0.68_02.tar.gz
>
> which fixes the issue properly. :)
>
> If people could test and shout up if it works for them, that'd be
> appreciated!
>
> Cheers
> t0m


bobtfish at bobtfish

May 21, 2010, 10:06 AM

Post #14 of 14 (1304 views)
Re: Re: Unicode trouble with Catalyst::Engine::FastCGI [In reply to]

On 26 Jan 2010, at 04:28, Rod Taylor wrote:

> 0.6802 does not seem to work for me.


Sorry for the very late reply, but...

Do you still have this issue? Can you retest using the latest
Plugin::Unicode::Encoding and the latest FCGI? Both have had
important fixes since then.

Cheers
t0m

