Mailing List Archive: Catalyst: Users

charset not needed for Catalyst::Action::REST?


moseley at hank

Feb 24, 2012, 7:50 AM

Post #1 of 3
charset not needed for Catalyst::Action::REST?

When using Catalyst::Action::REST the content-type response never includes
a charset. JSON seems to be handled correctly in code -- JSON strings are
always UTF-8. Does that mean there is no need to specify a charset on
responses?

And what if a JSON request comes in with a non-UTF-8 charset? Should that
be ignored? It's application/json, not text/json, so maybe there are no
encoding issues?

What about other serializations? YAML is UTF-8 or UTF-16. Does that mean
the charset needs to be included in the response? And again, if a request
comes in as UTF-16, does it need to be decoded, or does that happen in
YAML::Syck?

Even text/html doesn't include a charset in the "serialized" response.


Does there need to be an additional decoding and encoding layer when using
Catalyst::Action::REST? Should I force a charset on all responses?


BTW -- it doesn't seem like YAML survives a round trip like JSON does:

As expected:

$ perl -MEncode -wle '$x = "\x{263A}"; print length( $x )'
1

$ perl -MEncode -wle '$x = Encode::encode_utf8("\x{263A}"); print length( $x )'
3


And also as expected:


$ perl -MJSON -MEncode -wle 'print length(JSON::decode_json(JSON::encode_json( ["\x{263A}"]) )->[0])'
1


But YAML drops the utf8 flag:

$ perl -MYAML::Syck -MEncode -wle 'print length(YAML::Syck::Load(YAML::Syck::Dump( ["\x{263A}"]) )->[0])'
3
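
A length of 3 is what you would see if the loaded value were still the three UTF-8 octets rather than a decoded character string. Here is a minimal sketch of that reading, using only the core Encode module. It simulates the undecoded value instead of calling YAML::Syck, so the `$loaded` stand-in is an assumption about what the loader returns, not a trace of its internals:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

my $chars  = "\x{263A}";            # one character: WHITE SMILING FACE
my $octets = encode_utf8($chars);   # the same character as three UTF-8 octets

# Stand-in for a loader that hands back undecoded octets (assumption: this is
# what the YAML::Syck result above amounts to).
my $loaded = $octets;
printf "as loaded:         length %d\n", length $loaded;

# Decoding the octets restores the original one-character string.
my $decoded = decode_utf8($loaded);
printf "after decode_utf8: length %d, round trip %s\n",
    length $decoded, ($decoded eq $chars ? 'ok' : 'broken');
```

If that reading is right, an explicit decode_utf8 on each loaded string would be a workaround, at the cost of walking the whole data structure.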




--
Bill Moseley
moseley [at] hank


moseley at hank

Feb 25, 2012, 12:38 AM

Post #2 of 3
Re: charset not needed for Catalyst::Action::REST?

Ok, while trying to dig a bit more on Content-Type in RESTful services, I
came across this blog entry, although it is mostly about versioning.

http://thereisnorightway.blogspot.com/2011/02/versioning-and-types-in-resthttp-api.html

He's suggesting a vendor-specific Content-Type that includes a version
number. Frankly, that sounds like a nightmare to maintain
(and certainly not in the same Catalyst app). The argument for the
vendor-specific Content-Type is that application/json could mean any
JSON at all. But I think the fact that they are connecting to my web
service is enough of a contract to know that the JSON returned MUST be
specific to my service.

Anyone using vendor-specific Content-Type headers as he suggests?

Anyone have suggestions on the charset question below?


I've asked a few times about API versioning (including how to have
Controller actions inherit from each other). I'm now of the opinion
that it's just a bad idea, and I've decided to use the common three-digit
version and return it in an X-Version response header. If the first
"major" digit changes, it means the API is backwards incompatible and I'll
use a new hostname or header for the load balancer to dispatch on.




On Fri, Feb 24, 2012 at 10:50 PM, Bill Moseley <moseley [at] hank> wrote:

> When using Catalyst::Action::REST the content-type response never includes
> a charset. JSON seems to be handled correctly in code -- JSON strings are
> always UTF-8. Does that mean there is no need to specify a charset on
> responses?
>
> And what if a JSON request comes in with a non-UTF-8 charset? Should that
> be ignored? It's application/json, not text/json, so maybe there are no
> encoding issues?
>
> What about other serializations? YAML is UTF-8 or UTF-16. Does that mean
> the charset needs to be included in the response? And again, if a request
> comes in as UTF-16, does it need to be decoded, or does that happen in
> YAML::Syck?
>
> Even text/html doesn't include a charset in the "serialized" response.
>
>
> Does there need to be an additional decoding and encoding layer when using
> Catalyst::Action::REST? Should I force a charset on all responses?
>
>
> BTW -- it doesn't seem like YAML survives a round trip like JSON does:
>
> As expected:
>
> $ perl -MEncode -wle '$x = "\x{263A}"; print length( $x )'
> 1
>
> $ perl -MEncode -wle '$x = Encode::encode_utf8("\x{263A}"); print length( $x )'
> 3
>
>
> And also as expected:
>
>
> $ perl -MJSON -MEncode -wle 'print length(JSON::decode_json(JSON::encode_json( ["\x{263A}"]) )->[0])'
> 1
>
>
> But YAML drops the utf8 flag:
>
> $ perl -MYAML::Syck -MEncode -wle 'print length(YAML::Syck::Load(YAML::Syck::Dump( ["\x{263A}"]) )->[0])'
> 3
>
>
>
>
> --
> Bill Moseley
> moseley [at] hank
>



--
Bill Moseley
moseley [at] hank


bobtfish at bobtfish

Feb 25, 2012, 1:05 AM

Post #3 of 3
Re: charset not needed for Catalyst::Action::REST?

On 24 Feb 2012, at 15:50, Bill Moseley wrote:

> When using Catalyst::Action::REST the content-type response never includes a charset. JSON seems to be handled correctly in code -- JSON strings are always UTF-8. Does that mean there is no need to specify a charset on responses?


Theoretically, you don't need to, but I think we should. Specifically, I've heard reports of encoding issues when talking to some other stacks, which were fixed by us setting the charset explicitly.

> And what if a JSON request comes in with a non-UTF-8 charset? Should that be ignored? It's application/json, not text/json, so maybe there are no encoding issues?

I thought that JSON was always UTF-8, but I read the spec recently, and whilst it's always Unicode, it can also be encoded as UTF-16 or UTF-32:

JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.

Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.
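
The null-pattern check described in that passage can be sketched in a few lines of Perl. This is only an illustration using the core Encode module: `detect_json_encoding` is a hypothetical helper name, not something provided by JSON.pm or Catalyst::Action::REST, and it relies on the spec's premise that the first two characters of the text are ASCII:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Guess the Unicode encoding of a JSON octet stream from the pattern of
# NUL octets in its first four bytes.
# detect_json_encoding() is a hypothetical helper, not part of any module.
sub detect_json_encoding {
    my ($octets) = @_;
    return 'UTF-8' if length($octets) < 4;   # too short to test; assume the default

    # Build a string such as "1010": 1 where an octet is NUL, 0 otherwise.
    my $pattern = join '', map { $_ == 0 ? 1 : 0 } unpack 'C4', $octets;

    return 'UTF-32BE' if $pattern eq '1110';   # 00 00 00 xx
    return 'UTF-16BE' if $pattern eq '1010';   # 00 xx 00 xx
    return 'UTF-32LE' if $pattern eq '0111';   # xx 00 00 00
    return 'UTF-16LE' if $pattern eq '0101';   # xx 00 xx 00
    return 'UTF-8';                            # xx xx xx xx
}

# Encode the same small JSON text five ways and check each guess.
for my $enc (qw(UTF-8 UTF-16BE UTF-16LE UTF-32BE UTF-32LE)) {
    my $octets = encode($enc, '["x"]');
    printf "%-8s detected as %s\n", $enc, detect_json_encoding($octets);
}
```

A deserializer could use a guess like this to decode the request body before parsing, rather than trusting a charset parameter that may be missing.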


> What about other serializations? YAML is UTF-8 or UTF-16. Does that mean the charset needs to be included in the response? And again, if a request comes in as UTF-16, does it need to be decoded, or does that happen in YAML::Syck?
>

The latter, but yes I think the charset should also be included.

> Even text/html doesn't include a charset in the "serialized" response.

I would think that text/html should still be handled by Catalyst::Plugin::Unicode::Encoding, if that is loaded?

> Does there need to be an additional decoding and encoding layer when using Catalyst::Action::REST?

I'm of the opinion there shouldn't need to be.

> Should I force a charset on all responses

I think we should fix this, at least for JSON and YAML, where the right thing to do is entirely clear.
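
Until that is fixed, an application could add the charset itself. The following is a rough sketch, not the eventual Catalyst::Action::REST change: `with_charset` is a hypothetical helper, and the media types in the demo calls are only example inputs.

```perl
use strict;
use warnings;

# Append a charset parameter to a Content-Type value unless one is already
# present. with_charset() is a hypothetical helper, not a Catalyst API.
sub with_charset {
    my ($content_type, $charset) = @_;
    $charset = 'utf-8' unless defined $charset;

    # Leave the value alone if it already carries a charset parameter.
    return $content_type if $content_type =~ /;\s*charset\s*=/i;
    return "$content_type; charset=$charset";
}

print with_charset('application/json'), "\n";               # application/json; charset=utf-8
print with_charset('text/x-yaml'), "\n";                    # text/x-yaml; charset=utf-8
print with_charset('text/html; charset=ISO-8859-1'), "\n";  # left unchanged
```

In a Catalyst action or end hook, the result could be passed to `$c->response->content_type(...)` once the serializer has set its type.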

> BTW -- it doesn't seem like YAML survives a round trip like JSON does:
> <snip>
> But YAML drops the utf8 flag:
>
> $ perl -MYAML::Syck -MEncode -wle 'print length(YAML::Syck::Load(YAML::Syck::Dump( ["\x{263A}"]) )->[0])'
> 3

Eugh. This works as expected with YAML and YAML::XS. I vote that we stop using YAML::Syck, as it's less maintained (and clearly has encoding issues).

Anyone have strong reasons for not doing this?

Cheers
t0m


_______________________________________________
List: Catalyst [at] lists
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst [at] lists/
Dev site: http://dev.catalyst.perl.org/
