
Mailing List Archive: Catalyst: Users

Language selection in URLs

 

 



moseley at hank

Nov 15, 2009, 7:06 AM

Post #1 of 15 (2061 views)
Permalink
Language selection in URLs

What's your preferred approach to specifying a language tag in a URL? Is
there a strong argument for one over the other?

http://example.com/en_us/path/to/some/index.html # language prefix

http://example.com/path/to/some/index.html?lang=en_us

Are pages in different languages different resources or different versions
of the same resource?

Obviously, the prefix is easier if you use relative URLs, but uri_for makes
adding the query parameter easy. Still, one could probably argue that the
prefix approach is more efficient than wrapping uri_for for every generated
link.
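
To make the two styles concrete, here is a rough sketch of how a server might read the language from each form. It is Python rather than Perl, and the SUPPORTED set and both function names are made up for illustration; it is not Catalyst code.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical set of languages the site supports.
SUPPORTED = {"en_us", "pt_br", "ro"}

def lang_from_prefix(url):
    """Return (lang, remaining_path) for URLs like /en_us/path/to/page."""
    path = urlsplit(url).path
    first, _, rest = path.lstrip("/").partition("/")
    if first.lower() in SUPPORTED:
        return first.lower(), "/" + rest
    # No recognised prefix: leave the path untouched.
    return None, path

def lang_from_query(url):
    """Return (lang, path) for URLs like /path/to/page?lang=en_us."""
    parts = urlsplit(url)
    values = parse_qs(parts.query).get("lang", [])
    lang = None
    if values and values[0].lower() in SUPPORTED:
        lang = values[0].lower()
    return lang, parts.path
```

Either way the application ends up with the same pair (language, path); the difference is only in how links are generated and how the URL looks to the user.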

--
Bill Moseley
moseley [at] hank


joel at fysh

Nov 15, 2009, 7:23 AM

Post #2 of 15 (2002 views)
Permalink
Re: Language selection in URLs [In reply to]

On 15 Nov 2009, at 15:06, Bill Moseley wrote:

>
> What's your preferred approach to specifying a language tag in a URL? Is there strong argument for one over the other?
>
> http://example.com/en_us/path/to/some/index.html # language prefix
>
> http://example.com/path/to/some/index.html?lang=en_us

No no no! Allow the client and server to negotiate what content to serve for the resource identified. As a URI to a resource which may vary according to many dimensions, /path/to/some/content is fine.

GET /path/to/content HTTP/1.1
Accept-Language: en
Accept: text/html

> Are pages in different languages different resources or different versions of the same resource?

They are different *representations* of the same *resource*. The dimension of variation in this case is the Accept-Language: request header.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4

and you might reasonably vary the content type returned according to the client's Accept: header too.
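
As an illustration of server-driven negotiation on Accept-Language, a minimal sketch in Python follows. The function names are invented for this example, and falling back from a regional tag to its primary subtag ("en-gb" to "en") is a simplifying assumption, not something the RFC requires.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]

def negotiate(header, available, default):
    """Pick the first acceptable language the site actually offers."""
    offered = {lang.lower() for lang in available}
    for tag in parse_accept_language(header):
        if tag == "*":
            return default
        # Try the full tag first, then its primary subtag ("pt-br" -> "pt").
        for candidate in (tag, tag.split("-")[0]):
            if candidate in offered:
                return candidate
    return default
```

For example, negotiate("da, en-gb;q=0.8, en;q=0.7", ["en", "ro"], "ro") picks "en", because the site offers no Danish and "en-gb" falls back to "en".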

> Obviously, the prefix is easier if you use relative URLs, but uri_for makes adding the query parameter easy. Although, probably could argue that the prefix approach is more efficient than wrapping uri_for for every generated link.

If you really must stick it in the URL, I'd go for something like:
/path/to/content/en
/path/to/content/pt_BR
etc

A better question is: what kind of problems are you solving where server-driven or agent-driven content negotiation as described in the HTTP 1.1 RFC (an excellent and very readable document, honestly) is insufficient?

/joel
_______________________________________________
List: Catalyst [at] lists
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst [at] lists/
Dev site: http://dev.catalyst.perl.org/


orasnita at gmail

Nov 15, 2009, 8:00 AM

Post #3 of 15 (1997 views)
Permalink
Re: Language selection in URLs [In reply to]

From: "Bill Moseley" <moseley [at] hank>
> What's your preferred approach to specifying a language tag in a URL? Is
> there strong argument for one over the other?
>
> http://example.com/en_us/path/to/some/index.html # language prefix
>
> http://example.com/path/to/some/index.html?lang=en_us

I prefer the former way because the URL looks nicer.
(Not a very "strong" argument:)

> Are pages in different languages different resources or different versions
> of the same resource?

In most cases I think it is the same content with a different presentation
style, language...

> Obviously, the prefix is easier if you use relative URLs, but uri_for
> makes
> adding the query parameter easy. Although, probably could argue that the
> prefix approach is more efficient than wrapping uri_for for every
> generated
> link.

There is an example on the Catalyst wiki for overriding prepare_path() in
order to use urls like /en/path/to/another/file.html without needing to
change all controllers.
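
The wiki example itself is not reproduced here. The same idea, stripping the language segment before dispatch so that routes stay unchanged, might look like this as WSGI middleware in Python; the class name, the SUPPORTED set and the "app.lang" environ key are all hypothetical.

```python
SUPPORTED = {"en", "ro"}  # hypothetical language list

class LanguagePrefixMiddleware:
    """Strip a leading /<lang> segment before the wrapped WSGI app sees the path."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        first, _, rest = path.lstrip("/").partition("/")
        if first in SUPPORTED:
            # Remember the language and move the prefix into SCRIPT_NAME so
            # that URLs built from SCRIPT_NAME keep the prefix.
            environ["app.lang"] = first
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + "/" + first
            environ["PATH_INFO"] = "/" + rest
        return self.app(environ, start_response)
```

The wrapped application only ever sees /path/to/another/file.html, which is why no controller has to change.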
Octavian




simonw at digitalcraftsmen

Nov 15, 2009, 8:08 AM

Post #4 of 15 (2003 views)
Permalink
Re: Language selection in URLs [In reply to]

On 15/11/09 15:23, Joel Bernstein wrote:
> A better question is: what kind of problems are you solving where
> server-driven or agent-driven content negotiation as described in the
> HTTP 1.1 RFC (an excellent and very readable document, honestly) are
> insufficient?

How do you easily change the Accept-Language header in $modern_browser[1] ?

If I'm using my friend's laptop (he's a Finn), letting it choose to show
me a page in Finnish isn't much use to me as I speak not a word.

OK, so that's a slightly contrived example but equally important is
making content in multiple languages accessible to search engine
spiders. In this instance you need to be able to disambiguate the
content via the URI since that is all you have. You need to provide them
with a link to different language versions in the form of a URI since
they don't spider the web multiple times with different Accept: headers
to see what they get. As a content provider one has to be a bit more
explicit about it.

S.
[1] I know how to do it in Firefox, no idea in other browsers and in the
real world, clicking on a link that says "English" is easier than trying
to work out the vagaries of each browser UI.



orasnita at gmail

Nov 15, 2009, 8:15 AM

Post #5 of 15 (2005 views)
Permalink
Re: Language selection in URLs [In reply to]

From: "Joel Bernstein" <joel [at] fysh>

> On 15 Nov 2009, at 15:06, Bill Moseley wrote:
>
>> What's your preferred approach to specifying a language tag in a URL? Is
>> there strong argument for one over the other?
>>
>> http://example.com/en_us/path/to/some/index.html # language prefix
>>
>> http://example.com/path/to/some/index.html?lang=en_us
>
> No no no! Allow the client and server to negotiate what content to serve for
> the resource identified. As a URI to a resource which may vary according to
> many dimensions, /path/to/some/content is fine.
>
> GET /path/to/content HTTP/1.1
> Accept-Language: en
> Accept: text/html
>
> A better question is: what kind of problems are you solving where
> server-driven or agent-driven content negotiation as described in the HTTP
> 1.1 RFC (an excellent and very readable document, honestly) are
> insufficient?
>
> /joel

The most important reason I needed URLs like /en/dir/file and /ro/dir/file
is that very many users who don't know English run their browser with its
default configuration, so they see the pages in English, don't like it, and
want to change it.

So I use the following rules (in order) for choosing the current language:
1. The language chosen by the user by clicking the wanted flag;
2. The language which is specified in the URL like /ro/dir/file;
3. The language preferred by the browser;
4. The default language (if the site doesn't offer translation for the
browser-preferred language, or if there is no browser-preferred language).
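
The four rules amount to a simple precedence chain. A minimal Python sketch, with parameter names chosen for this example only:

```python
def choose_language(clicked, url_lang, browser_langs, available, default):
    """Apply the four rules in order and return the first usable language.

    clicked       -- language the user picked explicitly (rule 1), or None
    url_lang      -- language found in the URL (rule 2), or None
    browser_langs -- browser preferences, best first (rule 3)
    available     -- languages the site has translations for
    default       -- fallback language (rule 4)
    """
    candidates = [clicked, url_lang, *browser_langs]
    for lang in candidates:
        if lang and lang in available:
            return lang
    return default
```

With available = {"en", "ro"}, a request for /ro/dir/file from a browser that prefers English still gets Romanian, because the URL outranks the browser setting.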

Using different URLS for different pages might help search engines to index
the site, because otherwise the search engines might not try to access the
site with all possible languages in order to see if the web site offers
content in those languages.

(There may be other solutions for this, like specifying the alternate
versions of the page as meta tags or something like that, but I don't know
how to do that or if it is possible.)
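
HTML does define a way to declare alternate language versions: a link element with rel="alternate" and an hreflang attribute in the page head. Whether a particular search engine acts on it is not something this thread establishes. A small Python sketch that generates such elements, assuming the /lang/path URL layout described above (the function name is made up):

```python
from html import escape

def alternate_links(base_url, path, languages):
    """Build one <link rel="alternate"> element per language version."""
    links = []
    for lang in languages:
        # Assumes the language is the first path segment, e.g. /ro/dir/file.
        href = f"{base_url}/{lang}{path}"
        links.append(
            f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(href)}">'
        )
    return links
```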

Octavian









stephan at stejau

Nov 15, 2009, 8:31 AM

Post #6 of 15 (1993 views)
Permalink
Re: Language selection in URLs [In reply to]

Hi,

I would suggest:
http://search.cpan.org/~stephanj/Catalyst-TraitFor-Request-PerLanguageDomains-0.01/lib/Catalyst/TraitFor/Request/PerLanguageDomains.pm







moseley at hank

Nov 15, 2009, 8:43 AM

Post #7 of 15 (1994 views)
Permalink
Re: Language selection in URLs [In reply to]

On Sun, Nov 15, 2009 at 7:23 AM, Joel Bernstein <joel [at] fysh> wrote:

> On 15 Nov 2009, at 15:06, Bill Moseley wrote:
>
> >
> > What's your preferred approach to specifying a language tag in a URL? Is
> > there strong argument for one over the other?
> >
> > http://example.com/en_us/path/to/some/index.html # language prefix
> >
> > http://example.com/path/to/some/index.html?lang=en_us
>
> No no no! Allow the client and server to negotiate what content to serve
> for the resource identified. As a URI to a resource which may vary according
> to many dimensions, /path/to/some/content is fine.
>

I use that as a fallback, but in practice auto-detection doesn't work as
well as one would hope. In fact I think I disabled that code because of
all the headaches. Turns out the browser is not very good at understanding
the user running the browser at that moment, or the user's preference for a
given page. We also had problems with caching of pages in the wrong
language. At one point we would set a flag in the session, too, but that
had problems (with proxy caching IIRC).



> If you really must stick it in the URL, I'd go for something like:
> /path/to/content/en
> /path/to/content/pt_BR
> etc
>

I prefer the prefix to that. It makes the language transparent to the
application, and relative URLs work.



>
> A better question is: what kind of problems are you solving where
> server-driven or agent-driven content negotiation as described in the HTTP
> 1.1 RFC (an excellent and very readable document, honestly) are
> insufficient?
>


The question was just between using a query parameter vs. a path segment.



--
Bill Moseley
moseley [at] hank


orasnita at gmail

Nov 15, 2009, 9:43 AM

Post #8 of 15 (1993 views)
Permalink
Re: Language selection in URLs [In reply to]

From: "Stephan Jauernick" <stephan [at] stejau>

> Hi,
>
> I would suggest you:
> http://search.cpan.org/~stephanj/Catalyst-TraitFor-Request-PerLanguageDomains-0.01/lib/Catalyst/TraitFor/Request/PerLanguageDomains.pm


I have seen it, but it recognizes the language from the domain name, and
with that kind of URL naming it would be harder to use an SSL key. I know
that there are SSL keys that can cover multiple subdomains, but I don't
have one like that.

Octavian




melo at simplicidade

Nov 16, 2009, 4:24 AM

Post #9 of 15 (1940 views)
Permalink
Re: Language selection in URLs [In reply to]

Hi,

On 2009/11/15, at 15:23, Joel Bernstein wrote:

> On 15 Nov 2009, at 15:06, Bill Moseley wrote:
>> What's your preferred approach to specifying a language tag in a
>> URL? Is there strong argument for one over the other?
>>
>> http://example.com/en_us/path/to/some/index.html # language prefix
>>
>> http://example.com/path/to/some/index.html?lang=en_us
>
> No no no! Allow the client and server to negotiate what content to
> serve for the resource identified. As a URI to a resource which may
> vary according to many dimensions, /path/to/some/content is fine.
>
> GET /path/to/content HTTP/1.1
> Accept-Language: en
> Accept: text/html

This is the proper solution, in a perfect world.

The resource /path/to/some/index.html has multiple representations and
ideally you would use Accept* headers to choose between them, but it
might make sense to provide several ways to access your data:

/path/to/some/index.html => give me HTML and pick the language based
on content negotiation
/path/to/some/index.en.html => give me HTML in English
/path/to/some/index.pl.pdf => give me a PDF in Polish
/path/to/some/index.bork.json => give me a JSON file for Swedish chefs

The reason to support this is for use with external web services. If I
want to translate to Russian, and I know that this cool web service
has a very good Polish-to-Russian translator, but only accepts PDF
files, I have a URL for that: /path/to/some/index.pl.pdf
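
A sketch of how such suffixes could be split back into resource, language and format, in Python. The LANGUAGES and FORMATS sets and the function name are assumptions made for this example.

```python
import posixpath

LANGUAGES = {"en", "pl", "bork"}    # hypothetical
FORMATS = {"html", "pdf", "json"}   # hypothetical

def split_representation(path):
    """Split /dir/name[.lang].format into (resource, lang, format)."""
    directory, filename = posixpath.split(path)
    parts = filename.split(".")
    # Peel the format off the end, then an optional language before it.
    fmt = parts.pop() if len(parts) > 1 and parts[-1] in FORMATS else None
    lang = parts.pop() if len(parts) > 1 and parts[-1] in LANGUAGES else None
    resource = posixpath.join(directory, ".".join(parts))
    return resource, lang, fmt
```

When the language comes back as None, as for /path/to/some/index.html, the server falls back to content negotiation.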

So yes, please use content negotiation by default, but leave the door
open to explicit addressing to specific representations of a resource
to be used when content negotiation is not an option.

IMHO, it is not a mistake to add such alternative addresses.

Bye,



joel at fysh

Nov 16, 2009, 4:35 AM

Post #10 of 15 (1936 views)
Permalink
Re: Language selection in URLs [In reply to]

2009/11/16 Pedro Melo <melo [at] simplicidade>:
> So yes, please use content negotiation by default, but leave the door open
> to explicit addressing to specific representations of a resource to be used
> when content negotiation is not an option.

Absolutely. I agree with this entirely. But these are mechanisms to
express in the application layer that the representation that the UA
requests of the resource differs from those the UA claims to accept.
Where the browser is able to express what it wants, we should listen
to it. We should not default to expressing in the application layer
what is ignored in the protocol layer.

> IMHO, it is not a mistake to add such alternative addresses.

I agree. The operative word here is "alternative" -- these are ways to
address a resource other than the canonical URI + request headers, and
one ought to strive to remove these where practical.

/joel



fayland at gmail

Nov 17, 2009, 5:44 PM

Post #11 of 15 (1881 views)
Permalink
Re: Language selection in URLs [In reply to]

Why not make the language part of the domain, like
en.example.com, cn.example.com and so on?

Thanks.





orasnita at gmail

Nov 17, 2009, 10:14 PM

Post #12 of 15 (1868 views)
Permalink
Re: Re: Language selection in URLs [In reply to]

From: "Fayland Lam" <fayland [at] gmail>

> why shouldn't you use domain as the part of the language? like
> en.example.com, cn.example.com and something like that?
>
> Thanks.
>

Because each sub-domain would require another SSL key (or a special group
SSL key that can be used with more subdomains).

Octavian




mb at cattlegrid

Nov 18, 2009, 12:40 AM

Post #13 of 15 (1869 views)
Permalink
Re: Re: Language selection in URLs [In reply to]

Hello!

>> why shouldn't you use domain as the part of the language? like
>> en.example.com, cn.example.com and something like that?
>>
>> Thanks.
>>
> Because each sub-domain would require another SSL key (or a special
> group SSL key that can be used with more subdomains).

Moreover, I don't see any great advantage (not even in elegance) in using:

http://en-gb.mysite.com/resource/list

instead of:

http://mysite.com/en-gb/resource/list

If you're using Chained dispatching, it's very straightforward to
map the second URL to a language.

Michele.

--
Michele Beltrame
http://www.italpro.net/ - mb [at] italpro
Skype: arthas77 - Twitter: _arthas



pagaltzis at gmx

Nov 18, 2009, 9:55 AM

Post #14 of 15 (1856 views)
Permalink
Re: Language selection in URLs [In reply to]

* Joel Bernstein <joel [at] fysh> [2009-11-15 16:30]:
> No no no! Allow the client and server to negotiate what content
> to serve for the resource identified. As a URI to a resource
> which may vary according to many dimensions,
> /path/to/some/content is fine.
>
> GET /path/to/content HTTP/1.1
> Accept-Language: en
> Accept: text/html

Conneg sucks. It’s a good idea for non-human-readable content
served in a variety of formats, but for variants of anything
that’s like a “page” you should have separate URIs, so that
people can reliably bookmark one of them, or send someone else
a link to talk about it and not have the other person see
a completely different page (or file or whatever), etc.

It’s OK to accept conneg on neutral URIs and then *redirect* to
specific URIs based on the Accept-* headers. But don’t make
conneg the *only* way to pick a specific version of a resource.
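
A minimal sketch of that redirect in Python, assuming a path-prefix layout for the language-specific URIs. The function name is invented, and for brevity it takes the Accept-Language entries in header order and ignores q-values.

```python
def redirect_for(path, accept_language, available, default):
    """Return (status, headers) sending a neutral URI to a language-specific one."""
    chosen = default
    for item in accept_language.split(","):
        tag = item.split(";")[0].strip().lower()
        primary = tag.split("-")[0]  # "en-gb" -> "en"
        if primary in available:
            chosen = primary
            break
    headers = [
        ("Location", f"/{chosen}{path}"),
        # Tell caches that the redirect target depends on this request header.
        ("Vary", "Accept-Language"),
    ]
    return "302 Found", headers
```

The neutral URI stays usable, while anything bookmarked or shared afterwards is the specific /en/... or /ro/... address.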

> They are different *representations* of the same *resource*.

They are almost never *exact* equivalents. There are almost always
slight differences in content depending on language. _The medium
is the message._

And there is no reason not to have more than one URI for the same
resource anyway. Yes, you should pick one of them as canonical,
and unless you have good reason for doing otherwise, all the
non-canonical URIs should be redirects. But these are merely good
ideas, not hard and fast rules.

> If you really must stick it in the URL, I'd go for something like:
> /path/to/content/en
> /path/to/content/pt_BR
> etc

Worst of all worlds, IMO. The query parameter is easiest to
implement for the server, while the path prefix allows the user
to hack the URI conveniently (so the latter is what I would do).
Your suggestion is harder to implement than both and makes URIs
annoying to hack.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>



moseley at hank

Nov 18, 2009, 11:19 AM

Post #15 of 15 (1855 views)
Permalink
Re: Re: Language selection in URLs [In reply to]

On Wed, Nov 18, 2009 at 9:55 AM, Aristotle Pagaltzis <pagaltzis [at] gmx> wrote:

> * Joel Bernstein <joel [at] fysh> [2009-11-15 16:30]:
> > No no no! Allow the client and server to negotiate what content
> > to serve for the resource identified. As a URI to a resource
> > which may vary according to many dimensions,
> > /path/to/some/content is fine.
> >
> > GET /path/to/content HTTP/1.1
> > Accept-Language: en
> > Accept: text/html
>
> Conneg sucks. It’s a good idea for non-human-readable content
> served in a variety of formats, but for variants of anything
> that’s like a “page” you should have separate URIs, so that
> people can reliably bookmark one of them, or send someone else
> a link to talk about it and not have the other person see
> a completely different page (or file or whatever), etc.
>
> It’s OK to accept conneg on neutral URIs and then *redirect* to
> specific URIs based on the Accept-* headers. But don’t make
> conneg the *only* way to pick a specific version of a resource.
>

I think this is very good advice.



>
> > If you really must stick it in the URL, I'd go for something like:
> > /path/to/content/en
> > /path/to/content/pt_BR
> > etc
>
> Worst of all worlds, IMO. The query parameter is easiest to
> implement for the server, while the path prefix allows the user
> to hack the URI conveniently (so the latter is what I would do).
> Your suggestion is harder to implement than both and makes URIs
> annoying to hack.
>

I think Catalyst makes the path prefix the easiest. With the query
parameter (which I'm doing now to be compliant with a legacy app) it's
trivial to do by using "around uri_for", but I'd much rather do something once
(modify $c->req->base) than override every uri_for.
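
The trade-off can be sketched outside Catalyst. The Python below is only an analogy: uri_for here is a tiny stand-in written for this example, not Catalyst's method, and the other two function names are hypothetical.

```python
from urllib.parse import urlencode, urljoin

def uri_for(base, path, **query):
    """Minimal stand-in for a framework URL builder."""
    url = urljoin(base, path.lstrip("/"))
    return f"{url}?{urlencode(query)}" if query else url

def uri_for_with_lang(base, path, lang, **query):
    """Query-parameter style: every generated link has to carry lang."""
    query.setdefault("lang", lang)
    return uri_for(base, path, **query)

def base_with_lang(base, lang):
    """Prefix style: adjust the base once per request, then use plain uri_for."""
    return urljoin(base, f"{lang}/")
```

With the prefix style, base_with_lang("http://example.com/", "en_us") is computed once and every later uri_for call picks the language up for free; with the query style, each link goes through the wrapper.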

I do have a slight fear of the query parameter messing with caching. I
doubt it's much of an issue these days, though.


--
Bill Moseley
moseley [at] hank
