
Mailing List Archive: Catalyst: Users

Catalyst crashing hard with UTF-8 string


paulm at paulm

Jul 7, 2009, 7:07 PM

Post #1 of 7
Catalyst crashing hard with UTF-8 string

I'm uploading utf-8 news feed data through a web form, processing it
and reporting back to the user errors that are encountered during a
news feed parse. In those error reports I'm including snippets of the
input data via TT,

[% IF telluser.error %]
<p class = "error">
[% FOREACH error_msg = telluser.error %]
[% TRY %]
[% error_msg %]
[% CATCH %]
Error printing message
[% END%]
<br />
[% END %]
</p>
[% END %]


It turns out that if error_msg contains a UTF-8 string, the backend
request dies completely (I get a Proxy Error, "Reason: Error reading
from remote server"), even with use Catalyst qw/-Debug/ and that
[% TRY %] block. I know it's UTF-8 because Encode::is_utf8 says so, and
warn "$string" shows up correctly in the terminal. The string contains
a right single quote (E2 80 99). If I do [% error_msg | html_entities %]
it is successfully converted to &rsquo;. Unfortunately, so are the
HTML tags...
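
One possible way to keep the HTML tags intact while still escaping the quote is to convert only the non-ASCII characters to numeric entities before the string reaches the template. This is a minimal sketch, assuming the message is already a decoded character string; escape_non_ascii is a hypothetical helper name, not part of TT or Catalyst.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every non-ASCII character with a numeric
# character reference, leaving ASCII (including < and >) untouched.
sub escape_non_ascii {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord $1)/ge;
    return $text;
}

my $msg = "<b>It\x{2019}s crazy</b>";
print escape_non_ascii($msg), "\n";    # prints <b>It&#8217;s crazy</b>
```

The result is pure ASCII, so it passes through any output handle unchanged, whatever encoding layer is or is not set on it.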

Something has definitely changed with respect to UTF-8 behavior since
we did our big upgrade from Catalyst 5.7. Are there any 'known gotchas'
I could check?

At this point my debugging fu runs out--help appreciated on where to look next.

Paul

_______________________________________________
List: Catalyst [at] lists
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst [at] lists/
Dev site: http://dev.catalyst.perl.org/


bobtfish at bobtfish

Jul 8, 2009, 9:40 AM

Post #2 of 7
Re: Catalyst crashing hard with UTF-8 string

Paul Makepeace wrote:
> Something's definitely changed wrt to UTF-8 behavior since we did our
> big upgrade from Catalyst 5.7. Are there any 'known gotchas' I could
> check?
>
> At this point my debugging fu runs out--help appreciated on where to look next.

No gotchas that I can think of.

Are you sure this is a Catalyst issue, and not caused by a recent TT
upgrade?

You know it's dying inside TT, right? So you can Data::Dumper the input
which causes it to die to a file, write a program that instantiates
View::TT, calls ->render with the same input (and template), and that
should crap out in the same way?

Sound sane? And that should give you something self-contained which you
can test with different TT versions easily.

Cheers
t0m




paulm at paulm

Jul 8, 2009, 10:01 AM

Post #3 of 7
Re: Catalyst crashing hard with UTF-8 string

On Wed, Jul 8, 2009 at 11:40 AM, Tomas Doran<bobtfish [at] bobtfish> wrote:
> Paul Makepeace wrote:
>>
>> Something's definitely changed wrt to UTF-8 behavior since we did our
>> big upgrade from Catalyst 5.7. Are there any 'known gotchas' I could
>> check?
>>
>> At this point my debugging fu runs out--help appreciated on where to look
>> next.
>
> No gotchas that I can think of.
>
> Are you sure this is a Catalyst issue, and not caused by a recent TT
> upgrade?
>
> You know it's dying inside TT, right? So you can Data::Dumper the input
> which causes it to die to a file, write a program that instantiates
> View::TT, calls ->render with the same input (and template), and that should
> crap out in the same way?

Ya, I was kinda hoping you wouldn't say this, and that there was a way
to catch whatever was happening in Catalyst or trace the execution
path to get to the point where the thing's actually dying.

(FWIW, upgrading TT from 2.20 to 2.21 didn't appear to help.)

P

> Sound sane? And should give you something self contained which you can test
> with different TT versions easily..
>
> Cheers
> t0m
>
>


bobtfish at bobtfish

Jul 8, 2009, 2:29 PM

Post #4 of 7
Re: Catalyst crashing hard with UTF-8 string

On 8 Jul 2009, at 18:01, Paul Makepeace wrote:

>> You know it's dying inside TT, right? So you can Data::Dumper the
>> input which causes it to die to a file, write a program that
>> instantiates View::TT, calls ->render with the same input (and
>> template), and that should crap out in the same way?
>
> Ya, I was kinda hoping you wouldn't say this, and that there was a way
> to catch whatever was happening in Catalyst or trace the execution
> path to get to the point where the thing's actually dying.

Well, start by throwing Devel::SimpleTrace or Carp::Always at it to
get a stack trace.
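
As a rough illustration of what such a module buys you, here is a core-only sketch that turns a plain die into one carrying a full backtrace. It approximates the effect with a local $SIG{__DIE__} handler and Carp::confess; it is not how Carp::Always or Devel::SimpleTrace are necessarily implemented, and with_backtrace, inner and outer are hypothetical names.

```perl
use strict;
use warnings;
use Carp ();

# Run $code and return the error text (with a backtrace) if it dies,
# or an empty string if it succeeds.
sub with_backtrace {
    my ($code) = @_;
    # Promote every die inside $code to a confess, which appends the call stack.
    local $SIG{__DIE__} = sub { Carp::confess(@_) };
    my $ok = eval { $code->(); 1 };
    return $ok ? '' : $@;
}

sub inner { die "boom\n" }
sub outer { inner() }

my $trace = with_backtrace(\&outer);
print $trace;    # the message followed by one line per stack frame
```

With Carp::Always installed, the same effect is usually obtained without code changes by loading it from the command line, for example perl -MCarp::Always script/myapp_server.pl (the script name is an assumption based on the usual Catalyst layout).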

MyApp::View::TT->render is going to get called, so add something like
this to that class:

use Data::Dumper;
sub render {
    my ($self, $c, $template, $args) = @_;
    # Log the template and its arguments before delegating to the real render
    local $Data::Dumper::Maxdepth = 4;
    warn Dumper([$template, $args]);
    $self->next::method($c, $template, $args);
}

Increase Maxdepth if needed until it pukes.

You should literally be able to dump the Dumper output into a .t file,
and build a test around it:

use MyApp;
use MyApp::View::TT;

my $view = MyApp::View::TT->new(%config_your_app_gives_TT_view);
my $VAR1;
# Paste the Dumper output here (it assigns to $VAR1)

my ($template, $args) = @$VAR1;

$view->render('MyApp', $template, $args); # Should blow up, in the 'correct' way

Cutting template / data down to smallest replicable size and throwing
away Catalyst app and using raw TT should be easy from there forwards :)

Another thought - have you tried disabling the XS stash for TT? (IIRC
there is an env var to do this) and seeing if that affects the crash?

Cheers
t0m




paulm at paulm

Jul 13, 2009, 7:33 AM

Post #5 of 7
Re: Catalyst crashing hard with UTF-8 string

Summary: this turns out to be an issue with Catalyst::Engine::HTTP's
Remote handle not being set to UTF-8.

On Wed, Jul 8, 2009 at 4:29 PM, Tomas Doran<bobtfish [at] bobtfish> wrote:
>
> On 8 Jul 2009, at 18:01, Paul Makepeace wrote:
>
>>> You know it's dying inside TT, right? So you can Data::Dumper the input
>>> which causes it to die to a file, write a program that instantiates
>>> View::TT, calls ->render with the same input (and template), and that
>>> should
>>> crap out in the same way?
>>
>> Ya, I was kinda hoping you wouldn't say this, and that there was a way
>> to catch whatever was happening in Catalyst or trace the execution
>> path to get to the point where the thing's actually dying.
>
> Well, start by throwing Devel::SimpleTrace, or Carp::Always at it to get a
> stack trace.
>
> MyApp::View::TT->render is going to get called, so add something like this
> to that class:
>
> use Data::Dumper;
> sub render {
>    my ($self, $c, $template, $args) = @_;
>    local $Data::Dumper::Maxdepth = 4;
>    warn Dumper([$template, $args]);
>    $self->next::method($c, $template, $args);
> }
>
> increase Maxdepth if needed until it pukes..
>
> You should literally be able to dump the Dumper glob into a .t file, and
> build a test around it....
>
> use MyApp;
> use MyApp::View::TT;
>
> my $view = MyApp::View::TT->new(%config_your_app_gives_TT_view);
> my $VAR1; # paste the Dumper output here (it assigns to $VAR1)
>
> my ($template, $args) = @$VAR1;
>
> $view->render('MyApp', $template, $args); # Should blow up, in the 'correct'
> way..

Thanks for this. Since what I ended up with is a little different from
what's here, I'll copy it for anyone else going down this path:

use MyApp;
use MyApp::View::TT;

my $c = MyApp->new;
my $view = MyApp::View::TT->new($c, {});

my $error = "It\x{2019}s crazy";
my $template = \'[% foo %]'; # or 'path/to/template';

my $output = $view->render($c, $template, {foo => $error});
binmode(*STDOUT, ':utf8');
print "Output: $output\n";
__END__

It turns out that it's not the rendering in TT at all(!) as evidenced
by this test script working.

**

So I can now produce this hard crash by replacing
MyApp::View::TT::render with sub render { "\x{2639}" }. Digging even
further back, I found that this change makes it work:

--- /usr/local/share/perl/5.10.0/Catalyst/Engine/HTTP.pm.orig 2009-07-13 15:27:27.000000000 +0100
+++ /usr/local/share/perl/5.10.0/Catalyst/Engine/HTTP.pm 2009-07-13 15:25:58.000000000 +0100
@@ -263,6 +263,7 @@
             select Remote;

             Remote->blocking(1);
+            Remote->binmode(':utf8');

             # Read until we see all headers
             $self->{inputbuf} = '';

I don't know enough about how to make this a general solution
(presumably not all Catalyst::Engine::HTTP output will be UTF-8).

I'm still not sure why it's dying or why Apache is serving this
Proxy Error / thinking the output is unexpected. Maybe a 'wide print'
error is ending up in stderr or something. Any tips on this
appreciated.
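
For reference, the 'wide print' behaviour can be reproduced outside Catalyst. The sketch below prints the same character to an in-memory handle with and without a :utf8 layer and collects any warnings; print_with_layer is a hypothetical helper, and this only illustrates the perl-level warning, not what the engine or Apache actually does with it.

```perl
use strict;
use warnings;

# Print $string to an in-memory handle opened with $layer and return
# the bytes written, followed by any warnings perl raised.
sub print_with_layer {
    my ($layer, $string) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my $buffer = '';
    open my $fh, ">$layer", \$buffer or die "open failed: $!";
    print {$fh} $string;
    close $fh;
    return ($buffer, @warnings);
}

my ($plain, @plain_warnings) = print_with_layer('',      "\x{2639}");
my ($utf8,  @utf8_warnings)  = print_with_layer(':utf8', "\x{2639}");

printf "no layer: %d bytes, %d warning(s)\n", length $plain, scalar @plain_warnings;
printf ":utf8:    %d bytes, %d warning(s)\n", length $utf8,  scalar @utf8_warnings;
```

Without a layer perl still writes the UTF-8 bytes but complains with "Wide character in print"; with the layer the same bytes are written silently.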

> Cutting template / data down to smallest replicable size and throwing away
> Catalyst app and using raw TT should be easy from there forwards :)
>
> Another thought - have you tried disabling the XS stash for TT? (IIRC there
> is an env var to do this) and seeing if that affects the crash?

As it turns out from the above, I didn't really need to pursue this,
but again for the record, the way to do this, as I understand it, is:

use Template::Config;
$Template::Config::STASH = 'Template::Stash';
__END__

If there's an env var I didn't see it.

(While exploring this, one of my templates turned up a bug in
Template::Stash, though; reported to the TT list.)

Paul



paulm at paulm

Jul 13, 2009, 8:37 AM

Post #6 of 7
Re: Catalyst crashing hard with UTF-8 string

On Mon, Jul 13, 2009 at 9:33 AM, Paul Makepeace<paulm [at] paulm> wrote:
> Summary: this turns out to be an issue with Catalyst::Engine::HTTP's
> Remote handle not being set to UTF-8.
>
> So I can now produce this hard crash by replacing
> MyApp::View::TT::render with sub render { "\x{2639}" }. Digging even
> further back, I found that this change makes it work:
>
> --- /usr/local/share/perl/5.10.0/Catalyst/Engine/HTTP.pm.orig 2009-07-13 15:27:27.000000000 +0100
> +++ /usr/local/share/perl/5.10.0/Catalyst/Engine/HTTP.pm 2009-07-13 15:25:58.000000000 +0100
> @@ -263,6 +263,7 @@
>             select Remote;
>
>             Remote->blocking(1);
> +            Remote->binmode(':utf8');
>
>             # Read until we see all headers
>             $self->{inputbuf} = '';

Ah, awesome, this breaks multipart upload.

If the utf8 encoding is set just before the handler it seems to work,

--- /usr/local/share/perl/5.10.0/Catalyst/Engine/HTTP.pm.orig 2009-07-13 15:27:27.000000000 +0100
+++ /usr/local/share/perl/5.10.0/Catalyst/Engine/HTTP.pm 2009-07-13 16:14:47.000000000 +0100
@@ -288,6 +288,7 @@
                 }
             }

+            Remote->binmode(':utf8');
             $self->_handler( $class, $port, $method, $uri, $protocol );

             if ( $self->_has_write_error ) {
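
One plausible reason the earlier placement broke uploads (an assumption, not something confirmed in this thread) is that a :utf8 layer also changes what a handle returns on reads, so byte counts no longer match what the client sent. The sketch below shows the effect on an in-memory handle; read_length is a hypothetical helper.

```perl
use strict;
use warnings;

# Read everything from an in-memory handle opened with $layer and
# return the length of what perl hands back.
sub read_length {
    my ($layer, $bytes) = @_;
    open my $fh, "<$layer", \$bytes or die "open failed: $!";
    local $/;                 # slurp mode
    my $data = <$fh>;
    close $fh;
    return length $data;
}

my $upload = "\xE2\x80\x99";  # three raw bytes, as they arrive on the wire

printf "raw:   %d\n", read_length(':raw',  $upload);
printf ":utf8: %d\n", read_length(':utf8', $upload);
```

The raw read sees three bytes while the :utf8 read sees one character, which is the kind of mismatch that would upset code reading a request body by Content-Length.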

P




bobtfish at bobtfish

Jul 15, 2009, 2:23 PM

Post #7 of 7
Re: Catalyst crashing hard with UTF-8 string

On 13 Jul 2009, at 15:33, Paul Makepeace wrote:

> Summary: this turns out to be an issue with Catalyst::Engine::HTTP's
> Remote handle not being set to UTF-8.


Setting the remote handle to utf8 makes perl automagically encode
everything from a character string into a UTF-8 byte string for you.

This isn't the engine's job.

It's the rendering pipeline's job to ensure that you have a document
whose body is a string of bytes (rather than perl's internal characters).

Catalyst::Plugin::Unicode does this for you by calling utf8::encode
(the perl internal character string is encoded into a byte string in
UTF-8).
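
The character/byte distinction can be seen in a few lines of standalone Perl. This sketch uses the core Encode module's encode function rather than utf8::encode, purely for illustration; body_bytes is a hypothetical helper and is not how Catalyst::Plugin::Unicode is implemented.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Turn a character string into the UTF-8 bytes that should go on the wire.
sub body_bytes {
    my ($characters) = @_;
    return encode('UTF-8', $characters);
}

my $chars = "It\x{2019}s crazy";
my $bytes = body_bytes($chars);

printf "%d characters, %d bytes\n", length $chars, length $bytes;
```

Once the body is a byte string like $bytes, it can be written to a handle that has no encoding layer without triggering a wide character warning.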

>
> my $error = "It\x{2019}s crazy";
> my $template = \'[% foo %]'; # or 'path/to/template';
>
> my $output = $view->render($c, $template, {foo => $error});
> binmode(*STDOUT, ':utf8');
> print "Output: $output\n";
> __END__
>
> It turns out that it's not the rendering in TT at all(!) as evidenced
> by this test script working.
>
> **
>
> So I can now produce this hard crash by replacing
> MyApp::View::TT::render with sub render { "\x{2639}" }. Digging even
> further back, I found that this change makes it work:
>

I can't see a crash at all here, that's shitty :/.

Cheers
t0m

