
Mailing List Archive: Gentoo: Dev

UTF-8 locale by default

 

 



Sascha-ML at babbelbox

Jul 19, 2012, 2:39 PM

Post #1 of 49 (845 views)
UTF-8 locale by default

I recently discovered that, for some reason, I had overlooked the warning
about setting the locale to UTF-8 in the Gentoo handbook for several years;
thus I was still running all my systems in a POSIX locale, since I never
cared much about it.

However, since I noticed, I have talked to several people about it; all of
them gave the same first response: "Not shipping with a utf-8 locale turned
on by default nowadays probably is a bug in your distro".

While thinking about this and recognizing that recent distributions indeed
ship with some UTF-8 locale by default, I tend to agree with that statement.

Though Google brings up a lot of good documentation about how to change the
locale, I couldn't find anything that explains why stage3 is still delivered
with the POSIX locale set.

Is there a reason for not using at least en_US.UTF-8 as a "sane" default
value?

BR,
SaCu


chithanh at gentoo

Jul 19, 2012, 3:23 PM

Post #2 of 49 (847 views)
Re: UTF-8 locale by default

Sascha Cunz wrote:
> Is there a reason for not using at least en_US.UTF-8 as a "sane" default
> value?

It has been discussed some time ago already. Setting LANG="en_US.UTF-8"
would mess with collation rules, measurement and paper units etc., which has
the potential to make users outside the USA unhappy.

It might make sense to set LC_CTYPE="en_US.UTF8" but even so,
transliteration may give you unexpected results.

To illustrate this, try running

echo äå | LC_CTYPE=en_US.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=da_DK.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=de_DE.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8

and compare the output.
For the previous discussion, see this thread:
http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml
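
The comparison above can also be wrapped in a small script. This is only a
sketch: it assumes glibc's iconv and locale tools, and that "locale -a"
lists codesets in the lowercase, hyphen-free spelling (utf8). The function
names normalize_locale and compare_translit are made up for illustration.
Locales that are not installed are skipped, because a missing locale would
typically fall back to C behaviour and hide the difference.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not taken from the thread.

# Rewrite the codeset part of a locale name (the part after the dot) in
# the lowercase, hyphen-free spelling that glibc's `locale -a` usually
# prints, e.g. en_US.UTF-8 -> en_US.utf8. Names without a codeset pass
# through unchanged. Assumption: no @modifier suffix is present.
normalize_locale() {
    case $1 in
        *.*)
            # '\055' is the octal code for '-'
            codeset=$(printf '%s' "${1#*.}" | tr -d '\055' | tr '[:upper:]' '[:lower:]')
            printf '%s.%s\n' "${1%%.*}" "$codeset"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

# Run the transliteration once per locale, skipping missing locales.
compare_translit() {
    installed=$(locale -a 2>/dev/null)
    for loc in en_US.UTF-8 da_DK.UTF-8 de_DE.UTF-8; do
        if printf '%s\n' "$installed" | grep -qxF "$(normalize_locale "$loc")"; then
            printf '%s: ' "$loc"
            # LC_ALL is emptied so that it cannot override LC_CTYPE.
            printf 'äå\n' | LC_ALL= LC_CTYPE=$loc iconv -f UTF-8 -t ASCII//TRANSLIT
        else
            printf '%s: not installed, skipped\n' "$loc"
        fi
    done
}
```

Calling compare_translit prints one line per locale, which makes the
differing transliterations easy to see side by side.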


Best regards,
Chí-Thanh Christopher Nguyễn


ulm at gentoo

Jul 19, 2012, 3:28 PM

Post #3 of 49 (845 views)
Re: UTF-8 locale by default

>>>>> On Thu, 19 Jul 2012, Sascha Cunz wrote:

> Is there a reason for not using at least en_US.UTF-8 as a "sane"
> default value?

Because there's no one-size-fits-all locale, but it is specific to
every system so the user must configure it?

The matter was recently discussed in this mailing list [1] and also in
the March 2012 council meeting [2], and as a result the docs team has
amended the respective section [3] of the handbook.

Ulrich

[1] <http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml>
[2] <http://www.gentoo.org/proj/en/council/meeting-logs/20120313.txt>
[3] <http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=8>


yngwin at gentoo

Jul 26, 2012, 11:42 PM

Post #4 of 49 (832 views)
Re: UTF-8 locale by default

On 20 July 2012 06:28, Ulrich Mueller <ulm [at] gentoo> wrote:
>>>>>> On Thu, 19 Jul 2012, Sascha Cunz wrote:
>
>> Is there a reason for not using at least en_US.UTF-8 as a "sane"
>> default value?
>
> Because there's no one-size-fits-all locale, but it is specific to
> every system so the user must configure it?

While this is understandable, the fact remains that not having a
UTF-8 locale by default in our stage3 environment is sub-optimal.

I understand why the council rejected Debian's C.UTF-8 option,
but is there really no better default that we can use?

Without any default locale set, in practically all cases that means
that the user is presented with English, and mostly the American
variant. So, in practice, we are defaulting to en_US, just not in a
unicode environment. Correct me if I'm wrong.

Also, in most other places (such as our website, GLEPs, ebuilds)
we default to en_US.UTF-8.

So let's upgrade to en_US.UTF-8, which is for most users more
desirable than the current situation. Of course we will still advise
them to set their desired locales in /etc/locale.gen. But at least
they will start with a unicode environment, as expected anno 2012.


> The matter was recently discussed in this mailing list [1] and also in
> the March 2012 council meeting [2], and as a result the docs team has
> amended the respective section [3] of the handbook.
>
> Ulrich
>
> [1] <http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml>
> [2] <http://www.gentoo.org/proj/en/council/meeting-logs/20120313.txt>
> [3] <http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=8>
>

--
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin


ulm at gentoo

Jul 27, 2012, 12:08 AM

Post #5 of 49 (826 views)
Re: UTF-8 locale by default

>>>>> On Fri, 27 Jul 2012, Ben de Groot wrote:

> I understand why the council rejected Debian's C.UTF-8 option,
> but is there really no better default that we can use?

> Without any default locale set, in practically all cases that means
> that the user is presented with English, and mostly the American
> variant. So, in practice, we are defaulting to en_US, just not in a
> unicode environment. Correct me if I'm wrong.

See below. We're not defaulting to en_US for things like the number
format.

> Also, in most other places (such as our website, GLEPs, ebuilds)
> we default to en_US.UTF-8.

> So let's upgrade to en_US.UTF-8, which is for most users more
> desirable than the current situation. Of course we will still advise
> them to set their desired locales in /etc/locale.gen. But at least
> they will start with a unicode environment, as expected anno 2012.

As I had pointed out before [1], changing from POSIX to an en_US
locale will have undesirable side effects, like commas as thousands
separators in numbers (because of LC_NUMERIC). Also the defaults of
en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

So if we change the default (but I still don't see the need), we
should go for a less intrusive setting like:

LANG="POSIX"
LC_CTYPE="en_US.utf8"
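
To see why this pair of settings is less intrusive, it helps to recall the
lookup order for each locale category: LC_ALL wins if it is set, then the
category's own LC_* variable, then LANG, and otherwise the implementation
default (assumed to be POSIX here). The helper below is only a sketch of
that order in plain sh; resolve_locale is a hypothetical name and it works
on the strings passed to it rather than asking libc.

```shell
#!/bin/sh
# Sketch: which locale name applies to one category?
# Usage: resolve_locale LC_ALL-value LC_category-value LANG-value
# Empty strings stand for "not set". Illustrative helper, not a real tool.
resolve_locale() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"        # LC_ALL overrides everything
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"        # then the category's own variable
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"        # then LANG
    else
        printf 'POSIX\n'          # assumed default when nothing is set
    fi
}

# The setting proposed above: LANG=POSIX, LC_CTYPE=en_US.utf8, rest unset.
printf 'LC_CTYPE   -> %s\n' "$(resolve_locale "" "en_US.utf8" "POSIX")"
printf 'LC_NUMERIC -> %s\n' "$(resolve_locale "" "" "POSIX")"
printf 'LC_PAPER   -> %s\n' "$(resolve_locale "" "" "POSIX")"
```

Only LC_CTYPE resolves to the UTF-8 locale; number format and paper size
keep following LANG, which stays POSIX.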

Ulrich

[1] <http://archives.gentoo.org/gentoo-dev/msg_56a438adde8efebd467ada5f858048ba.xml>


zerochaos at gentoo

Jul 27, 2012, 12:19 AM

Post #6 of 49 (828 views)
Re: UTF-8 locale by default

On 07/27/2012 03:08 AM, Ulrich Mueller wrote:
>
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
>
> So if we change the default (but I still don't see the need), we
> should go for a less intrusive setting like:
>
> LANG="POSIX"
> LC_CTYPE="en_US.utf8"

I would love to see a utf8 default, if the above is agreeable then I say +1

-Zero


ormaaj at gmail

Jul 27, 2012, 1:06 AM

Post #7 of 49 (830 views)
Re: UTF-8 locale by default

On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
> >>>>> On Fri, 27 Jul 2012, Ben de Groot wrote:
>
> > I understand why the council rejected Debian's C.UTF-8 option,
> > but is there really no better default that we can use?
>
> > Without any default locale set, in practically all cases that means
> > that the user is presented with English, and mostly the American
> > variant. So, in practice, we are defaulting to en_US, just not in a
> > unicode environment. Correct me if I'm wrong.
>
> See below. We're not defaulting to en_US for things like the number
> format.
>
> > Also, in most other places (such as our website, GLEPs, ebuilds)
> > we default to en_US.UTF-8.
>
> > So let's upgrade to en_US.UTF-8, which is for most users more
> > desirable than the current situation. Of course we will still advise
> > them to set their desired locales in /etc/locale.gen. But at least
> > they will start with a unicode environment, as expected anno 2012.
>
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
>
> So if we change the default (but I still don't see the need), we
> should go for a less intrusive setting like:
>
> LANG="POSIX"
> LC_CTYPE="en_US.utf8"
>
> Ulrich
>

You're concerned about the commas breaking things? Given that you usually need
to specifically ask for them (i.e., printf ' flag), and that kind of output is
usually going to be for human consumption only that seems unlikely. If
anything does rely upon the format, can't tolerate different locales, and fails
to specify LC_NUMERIC then it's broken anyway.
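
For reference, a sketch of what "asking for them" looks like from the
shell. It assumes the printf from GNU coreutils, called through env so that
a shell builtin without support for the ' flag is bypassed; group_number is
a made-up helper name. The C locale defines no grouping, so the digits come
out unchanged there. Under a locale such as en_US.UTF-8, if it is
installed, the same format string should insert that locale's separator.

```shell
#!/bin/sh
# Sketch: format NUMBER with the ' (grouping) flag under LOCALE.
# Usage: group_number LOCALE NUMBER
# Assumes an external printf that understands %'d (GNU coreutils does).
group_number() {
    env LC_ALL="$1" printf "%'d\n" "$2"
}

group_number C 1234567             # prints 1234567 (no grouping in C)
group_number en_US.UTF-8 1234567   # grouped only if that locale is installed
```

A plain %d without the flag is never grouped, whatever LC_NUMERIC says.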

LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more annoying
defaults for some people. What do users of other distros think? Is this really
a serious problem for anyone?

LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is getting utf8
by default. I can live with LANG=POSIX.
--
Dan Douglas


yngwin at gentoo

Jul 27, 2012, 1:34 AM

Post #8 of 49 (835 views)
Re: UTF-8 locale by default

On 27 July 2012 16:06, Dan Douglas <ormaaj [at] gmail> wrote:
> On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
>> >>>>> On Fri, 27 Jul 2012, Ben de Groot wrote:
>>
>> > I understand why the council rejected Debian's C.UTF-8 option,
>> > but is there really no better default that we can use?
>>
>> > Without any default locale set, in practically all cases that means
>> > that the user is presented with English, and mostly the American
>> > variant. So, in practice, we are defaulting to en_US, just not in a
>> > unicode environment. Correct me if I'm wrong.
>>
>> See below. We're not defaulting to en_US for things like the number
>> format.
>>
>> > Also, in most other places (such as our website, GLEPs, ebuilds)
>> > we default to en_US.UTF-8.
>>
>> > So let's upgrade to en_US.UTF-8, which is for most users more
>> > desirable than the current situation. Of course we will still advise
>> > them to set their desired locales in /etc/locale.gen. But at least
>> > they will start with a unicode environment, as expected anno 2012.
>>
>> As I had pointed out before [1], changing from POSIX to an en_US
>> locale will have undesirable side effects, like commas as thousands
>> separators in numbers (because of LC_NUMERIC). Also the defaults of
>> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
>>
>> So if we change the default (but I still don't see the need), we
>> should go for a less intrusive setting like:
>>
>> LANG="POSIX"
>> LC_CTYPE="en_US.utf8"
>>
>> Ulrich
>>
>
> You're concerned about the commas breaking things? Given that you usually need
> to specifically ask for them (i.e., printf ' flag), and that kind of output is
> usually going to be for human consumption only that seems unlikely. If
> anything does rely upon the format, can't tolerate different locales, and fails
> to specify LC_NUMERIC then it's broken anyway.
>
> LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more annoying
> defaults for some people. What do users of other distros think? Is this really
> a serious problem for anyone?
>
> LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is getting utf8
> by default. I can live with LANG=POSIX.
> --
> Dan Douglas

How about the below?

LANG=en_GB.utf8
LC_COLLATE=C
LC_CTYPE=en_GB.utf8

That will give us A4 paper size and the metric system. If LC_NUMERIC is
really a problem, we can set it to something more desirable.
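
One way to check what a candidate locale actually provides for paper size
and units, as a rough sketch. It relies on the glibc-specific keywords
height and width (LC_PAPER, in millimetres) and measurement
(LC_MEASUREMENT, where 1 means metric and 2 means US customary) of the
locale command. The helper names are invented here, and the locale passed
to report_units has to be installed for the answer to mean anything.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative.

# Map LC_PAPER dimensions (mm) to a familiar name.
paper_name() {
    case "$1x$2" in
        297x210) echo "A4" ;;
        279x216) echo "US Letter" ;;
        *)       echo "other ($1 x $2 mm)" ;;
    esac
}

# Map the LC_MEASUREMENT code to a name.
measurement_name() {
    case $1 in
        1) echo "metric" ;;
        2) echo "US customary" ;;
        *) echo "unknown" ;;
    esac
}

# Usage: report_units LOCALE
# Queries glibc's `locale` for the given locale and prints a summary.
report_units() {
    height=$(LC_ALL=$1 locale height)
    width=$(LC_ALL=$1 locale width)
    system=$(LC_ALL=$1 locale measurement)
    printf '%s: paper=%s, units=%s\n' "$1" \
        "$(paper_name "$height" "$width")" "$(measurement_name "$system")"
}
```

For example, running report_units en_GB.utf8 and report_units en_US.utf8
on a machine with both locales generated shows whether the two differ in
the way described above.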
--
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin


c.nicolas at gmail

Jul 27, 2012, 1:38 AM

Post #9 of 49 (827 views)
Re: UTF-8 locale by default

Ulrich Mueller wrote:
>> On Fri, 27 Jul 2012, Ben de Groot wrote:
>>
>> So let's upgrade to en_US.UTF-8, which is for most users more
>> desirable than the current situation. Of course we will still advise
>> them to set their desired locales in /etc/locale.gen. But at least
>> they will start with a unicode environment, as expected anno 2012.
>
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

For this very reason my system locale is en_IE.UTF-8. Still English but
using Euro Monetary, Metric units, A4 paper, etc.

It might suit needs for most European installs, but not for everyone.

--
Cyprien / Fulax
Gentoo Lisp Project contributor


mgorny at gentoo

Jul 27, 2012, 1:47 AM

Post #10 of 49 (827 views)
Re: UTF-8 locale by default

On Fri, 27 Jul 2012 10:38:30 +0200
Cyprien Nicolas <c.nicolas [at] gmail> wrote:

> Ulrich Mueller wrote:
> >> On Fri, 27 Jul 2012, Ben de Groot wrote:
> >>
> >> So let's upgrade to en_US.UTF-8, which is for most users more
> >> desirable than the current situation. Of course we will still
> >> advise them to set their desired locales in /etc/locale.gen. But
> >> at least they will start with a unicode environment, as expected
> >> anno 2012.
> >
> > As I had pointed out before [1], changing from POSIX to an en_US
> > locale will have undesirable side effects, like commas as thousands
> > separators in numbers (because of LC_NUMERIC). Also the defaults of
> > en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
>
> For this very reason my system locale is en_IE.UTF-8. Still English
> but using Euro Monetary, Metric units, A4 paper, etc.
>
> It might suit needs for most European installs, but not for everyone.

Still uses ',' for thousands sep.

--
Best regards,
Michał Górny


mgorny at gentoo

Jul 27, 2012, 1:49 AM

Post #11 of 49 (825 views)
Re: UTF-8 locale by default

On Fri, 27 Jul 2012 16:34:01 +0800
Ben de Groot <yngwin [at] gentoo> wrote:

> On 27 July 2012 16:06, Dan Douglas <ormaaj [at] gmail> wrote:
> > On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
> >> >>>>> On Fri, 27 Jul 2012, Ben de Groot wrote:
> >>
> >> > I understand why the council rejected Debian's C.UTF-8 option,
> >> > but is there really no better default that we can use?
> >>
> >> > Without any default locale set, in practically all cases that
> >> > means that the user is presented with English, and mostly the
> >> > American variant. So, in practice, we are defaulting to en_US,
> >> > just not in a unicode environment. Correct me if I'm wrong.
> >>
> >> See below. We're not defaulting to en_US for things like the number
> >> format.
> >>
> >> > Also, in most other places (such as our website, GLEPs, ebuilds)
> >> > we default to en_US.UTF-8.
> >>
> >> > So let's upgrade to en_US.UTF-8, which is for most users more
> >> > desirable than the current situation. Of course we will still
> >> > advise them to set their desired locales in /etc/locale.gen. But
> >> > at least they will start with a unicode environment, as expected
> >> > anno 2012.
> >>
> >> As I had pointed out before [1], changing from POSIX to an en_US
> >> locale will have undesirable side effects, like commas as thousands
> >> separators in numbers (because of LC_NUMERIC). Also the defaults of
> >> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> >>
> >> So if we change the default (but I still don't see the need), we
> >> should go for a less intrusive setting like:
> >>
> >> LANG="POSIX"
> >> LC_CTYPE="en_US.utf8"
> >>
> >> Ulrich
> >>
> >
> > You're concerned about the commas breaking things? Given that you
> > usually need to specifically ask for them (i.e., printf ' flag),
> > and that kind of output is usually going to be for human
> > consumption only that seems unlikely. If anything does rely upon
> > the format, can't tolerate different locales, and fails to specify
> > LC_NUMERIC then it's broken anyway.
> >
> > LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more
> > annoying defaults for some people. What do users of other distros
> > think? Is this really a serious problem for anyone?
> >
> > LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is
> > getting utf8 by default. I can live with LANG=POSIX.
> > --
> > Dan Douglas
>
> How about the below?
>
> LANG=en_GB.utf8
> LC_COLLATE=C
> LC_CTYPE=en_GB.utf8
>
> That will give us A4 paper size and the metric system. If LC_NUMERIC
> is really a problem, we can set it to something more desirable.

LC_NUMERIC=pl_PL.utf8

--
Best regards,
Michał Górny


chithanh at gentoo

Jul 27, 2012, 5:13 AM

Post #12 of 49 (809 views)
Re: UTF-8 locale by default

Ulrich Mueller wrote:
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
>
> So if we change the default (but I still don't see the need), we
> should go for a less intrusive setting like:
>
> LANG="POSIX"
> LC_CTYPE="en_US.utf8"

This would be better than LANG="en_US.utf8" but I would still prefer not
to have any country/region attached to the locale. The C.UTF-8 locale
which Debian uses for this purpose (a UTF-8 locale without side effects)
appears more suitable to me.


Best regards,
Chí-Thanh Christopher Nguyễn


vapier at gentoo

Jul 27, 2012, 10:24 AM

Post #13 of 49 (813 views)
Re: UTF-8 locale by default

On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn wrote:
> Ulrich Mueller wrote:
> > As I had pointed out before [1], changing from POSIX to an en_US
> > locale will have undesirable side effects, like commas as thousands
> > separators in numbers (because of LC_NUMERIC). Also the defaults of
> > en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> >
> > So if we change the default (but I still don't see the need), we
> >
> > should go for a less intrusive setting like:
> > LANG="POSIX"
> > LC_CTYPE="en_US.utf8"
>
> This would be better than LANG="en_US.utf8" but I would still prefer not
> to have any country/region attached to the locale. The C.UTF-8 locale
> which Debian uses for this purpose (a UTF-8 locale without side effects)
> appears more suitable to me.

yes, and i'm waiting on the POSIX group to formalize C.UTF-8. that's the only
real option in my mind for making unicode the default. any other
amalgamations of various locales is ugly as sin.
-mike


pacho at gentoo

Jul 27, 2012, 11:29 AM

Post #14 of 49 (812 views)
Re: UTF-8 locale by default

On Fri, 27-07-2012 at 13:24 -0400, Mike Frysinger wrote:
> On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn wrote:
> > Ulrich Mueller wrote:
> > > As I had pointed out before [1], changing from POSIX to an en_US
> > > locale will have undesirable side effects, like commas as thousands
> > > separators in numbers (because of LC_NUMERIC). Also the defaults of
> > > en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> > >
> > > So if we change the default (but I still don't see the need), we
> > >
> > > should go for a less intrusive setting like:
> > > LANG="POSIX"
> > > LC_CTYPE="en_US.utf8"
> >
> > This would be better than LANG="en_US.utf8" but I would still prefer not
> > to have any country/region attached to the locale. The C.UTF-8 locale
> > which Debian uses for this purpose (a UTF-8 locale without side effects)
> > appears more suitable to me.
>
> yes, and i'm waiting on the POSIX group to formalize C.UTF-8. that's the only
> real option in my mind for making unicode the default. any other
> amalgamations of various locales is ugly as sin.
> -mike

Do you have any idea how long that formalization could take?
If it will take a long time, maybe we could go with those amalgamations :-/


titanofold at gentoo

Jul 27, 2012, 1:16 PM

Post #15 of 49 (811 views)
Re: UTF-8 locale by default

On 07/27/2012 02:29 PM, Pacho Ramos wrote:
> On Fri, 27-07-2012 at 13:24 -0400, Mike Frysinger wrote:
>> On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn
>> wrote:
>>> Ulrich Mueller wrote:
>>>> As I had pointed out before [1], changing from POSIX to an
>>>> en_US locale will have undesirable side effects, like commas
>>>> as thousands separators in numbers (because of LC_NUMERIC).
>>>> Also the defaults of en_US for LC_MEASUREMENT and LC_PAPER
>>>> are only useful in the U.S.
>>>>
>>>> So if we change the default (but I still don't see the need),
>>>> we
>>>>
>>>> should go for a less intrusive setting like: LANG="POSIX"
>>>> LC_CTYPE="en_US.utf8"
>>>
>>> This would be better than LANG="en_US.utf8" but I would still
>>> prefer not to have any country/region attached to the locale.
>>> The C.UTF-8 locale which Debian uses for this purpose (a UTF-8
>>> locale without side effects) appears more suitable to me.
>>
>> yes, and i'm waiting on the POSIX group to formalize C.UTF-8.
>> that's the only real option in my mind for making unicode the
>> default. any other amalgamations of various locales is ugly as
>> sin. -mike
>
> Do you have any idea how long that formalization could take?
> If it will take a long time, maybe we could go with those
> amalgamations :-/
>

Really, how much of an inconvenience is it that we don't use UTF-8 as
a default?

In my mind, it is sufficient that we instruct users how to set the
locale in the handbook.

No user will be happy with whatever we decide to use as a default. I
will be especially upset if we use the metric system instead of the
*STANDARD* system. It has 'standard' in the name for a reason people.
(^_^)

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email : titanofold [at] gentoo
GnuPG FP : 2C00 7719 4F85 FB07 A49C 0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0


flameeyes at flameeyes

Jul 27, 2012, 1:55 PM

Post #16 of 49 (811 views)
Re: UTF-8 locale by default

On 27/07/2012 13:16, Aaron W. Swenson wrote:
> Really, how much of an inconvenience is it that we don't use UTF-8 as
> a default?

Given that there are a ton and a half of Python packages that do not
work with a non-utf8 locale, I'd say it's quite a thing.

So either we go with a UTF-8 default or somebody has to fix the
packages not working without it....
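
A quick way to tell which situation a given environment is in, sketched
under the assumption that the locale command supports "locale charmap" (it
does on glibc systems); the helper names are hypothetical. With LANG and
the LC_* variables unset, glibc typically reports ANSI_X3.4-1968, that is,
plain ASCII, which is presumably the case such packages trip over.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative.

# Succeeds if the given charmap name is one of the usual spellings of UTF-8.
is_utf8_charmap() {
    case $1 in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the current environment selects a UTF-8 character set.
describe_current_locale() {
    charmap=$(locale charmap 2>/dev/null)
    if is_utf8_charmap "$charmap"; then
        printf 'UTF-8 locale active (%s)\n' "$charmap"
    else
        printf 'non-UTF-8 locale active (%s)\n' "${charmap:-unknown}"
    fi
}

describe_current_locale
```

Run inside a fresh stage3 chroot and again after setting LC_CTYPE, the
function shows whether the change took effect.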

--
Diego Elio Pettenò — Flameeyes
flameeyes [at] flameeyes — http://blog.flameeyes.eu/


michael at orlitzky

Jul 30, 2012, 7:35 AM

Post #17 of 49 (796 views)
Re: UTF-8 locale by default

On 07/27/12 16:16, Aaron W. Swenson wrote:
>
> No user will be happy with whatever we decide to use as a default.

The defaults should be what's best for the most people, with a bias
towards safety. Why don't we just take a survey and choose the most
common utf8 response?


mgorny at gentoo

Jul 30, 2012, 7:41 AM

Post #18 of 49 (798 views)
Re: UTF-8 locale by default

On Mon, 30 Jul 2012 10:35:36 -0400
Michael Orlitzky <michael [at] orlitzky> wrote:

> On 07/27/12 16:16, Aaron W. Swenson wrote:
> >
> > No user will be happy with whatever we decide to use as a default.
>
> The defaults should be what's best for the most people, with a bias
> towards safety. Why don't we just take a survey and choose the most
> common utf8 response?

How can you take a survey like that? How will you ensure it actually
hits the majority? How will you define the majority?

--
Best regards,
Michał Górny


mikemol at gmail

Jul 30, 2012, 7:42 AM

Post #19 of 49 (798 views)
Re: UTF-8 locale by default

On Mon, Jul 30, 2012 at 10:35 AM, Michael Orlitzky <michael [at] orlitzky> wrote:
> On 07/27/12 16:16, Aaron W. Swenson wrote:
>>
>> No user will be happy with whatever we decide to use as a default.
>
> The defaults should be what's best for the most people, with a bias
> towards safety. Why don't we just take a survey and choose the most
> common utf8 response?

You'd really want to do a "which do you prefer, which can you use"
survey, then; you don't really want to choose the result preferred by
the most people, rather you want the result which is usable by the
most people.

--
:wq


michael at orlitzky

Jul 30, 2012, 7:50 AM

Post #20 of 49 (800 views)
Re: UTF-8 locale by default

On 07/30/12 10:41, Michał Górny wrote:
> On Mon, 30 Jul 2012 10:35:36 -0400
> Michael Orlitzky <michael [at] orlitzky> wrote:
>
>> On 07/27/12 16:16, Aaron W. Swenson wrote:
>>>
>>> No user will be happy with whatever we decide to use as a default.
>>
>> The defaults should be what's best for the most people, with a bias
>> towards safety. Why don't we just take a survey and choose the most
>> common utf8 response?
>
> How can you take a survey like that? How will you ensure it actually
> hits the majority? How will you define the majority?
>

Considering that the alternative is to force everyone to change it
manually, you can do it however you want and it'll be an improvement.

1) Create a webpage with a bunch of options, count the results

2) Ask the g.o mailing lists, count responses manually

3) Use google docs like the website survey that went out a few days
ago

It won't hit everyone, but no survey ever does. As long as you get a
large enough unbiased sample, it doesn't matter. And anything would be
an improvement, so it doesn't matter anyway.


mikemol at gmail

Jul 30, 2012, 8:04 AM

Post #21 of 49 (800 views)
Re: UTF-8 locale by default

On Mon, Jul 30, 2012 at 10:41 AM, Michał Górny <mgorny [at] gentoo> wrote:
> On Mon, 30 Jul 2012 10:35:36 -0400
> Michael Orlitzky <michael [at] orlitzky> wrote:
>
>> On 07/27/12 16:16, Aaron W. Swenson wrote:
>> >
>> > No user will be happy with whatever we decide to use as a default.
>>
>> The defaults should be what's best for the most people, with a bias
>> towards safety. Why don't we just take a survey and choose the most
>> common utf8 response?
>
> How can you take a survey like that? How will you ensure it actually
> hits the majority? How will you define the majority?

Serverside script on gentoo.org. Push out a news item with the URL and
a last-call date. Tabulate the results, using browser fingerprints to
weed out the bulk of duplicates.

--
:wq


rich0 at gentoo

Jul 30, 2012, 8:29 AM

Post #22 of 49 (801 views)
Re: UTF-8 locale by default

On Mon, Jul 30, 2012 at 10:42 AM, Michael Mol <mikemol [at] gmail> wrote:
>
> You'd really want to do a "which do you prefer, which can you use"
> survey, then; you don't really want to choose the result preferred by
> the most people, rather you want the result which is usable by the
> most people.

I tend to agree. Donnie said something in his manifesto which I think
applies here: any of the proposed solutions is probably better than
doing nothing.

If I forget to tweak my locale and I end up with a comma as a decimal
mark it isn't the end of the world, and neither is some output in
metric units. I've ended up working on many a global system where
times get reported in GMT and people put up with the inconvenience
because they realize that any standard is better than no standard.

What is the real end-user impact of any of this stuff anyway? During
the install the thing that matters is being able to partition disks
and compile kernels and such. I doubt that too many users will be
dependent on installer locale settings for displaying weather reports
or such. If they don't set locale, then it is like not setting
localtime - you just get to live with some default. I would imagine
that at least by having a UTF-8 locale users would be able to do
things like set full names of users using unicode, etc.

Rich


titanofold at gentoo

Jul 30, 2012, 8:51 AM

Post #23 of 49 (798 views)
Re: UTF-8 locale by default

On 07/30/2012 11:04 AM, Michael Mol wrote:
> On Mon, Jul 30, 2012 at 10:41 AM, Michał Górny <mgorny [at] gentoo>
> wrote:
>> On Mon, 30 Jul 2012 10:35:36 -0400 Michael Orlitzky
>> <michael [at] orlitzky> wrote:
>>
>>> On 07/27/12 16:16, Aaron W. Swenson wrote:
>>>>
>>>> No user will be happy with whatever we decide to use as a
>>>> default.
>>>
>>> The defaults should be what's best for the most people, with a
>>> bias towards safety. Why don't we just take a survey and choose
>>> the most common utf8 response?
>>
>> How can you take a survey like that? How will you ensure it
>> actually hits the majority? How will you define the majority?
>
> Serverside script on gentoo.org. Push out a news item with the URL
> and a last-call date. Tabulate the results, using browser
> fingerprints to weed out the bulk of duplicates.
>

I still advocate continuing how we have been.

However, the survey should be one question: What is the output of
`locale' on your workstation/desktop/laptop?

The less painful we make the survey, the more respondents we'll get,
and the less biased the results will be. Additionally, it makes the
responses easy to parse with a script.
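
As a sketch of how simple that parsing could be, the function below counts
the LANG values found in collected outputs of the locale command. The name
tally_lang and the idea of concatenating all responses into files or
standard input are assumptions for illustration; nothing like this was set
up in the thread. The final sort runs under LC_ALL=C so that the ordering
of ties does not itself depend on the locale of whoever runs it.

```shell
#!/bin/sh
# Sketch: count LANG values in concatenated `locale` outputs.
# Usage: tally_lang [FILE...]   (reads standard input when no file is given)
# Prints "COUNT VALUE" lines, most frequent first. Illustrative helper only.
tally_lang() {
    awk -F= '
        $1 == "LANG" {
            value = $2
            gsub(/"/, "", value)               # drop quotes, if any
            if (value == "") value = "(unset)"  # LANG= means no LANG set
            count[value]++
        }
        END { for (v in count) printf "%d %s\n", count[v], v }
    ' "$@" | LC_ALL=C sort -k1,1nr -k2,2
}
```

The same approach extends to LC_CTYPE or any other line of the output by
changing the variable name that is matched.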

Servers are excluded because special things take place there that may
not actually line up with what the user prefers.

If it turns out that C or POSIX is the most common response, we should
then default the locale to en_US.UTF-8 if we really want to default to
a UTF-8 setting. The reason being it makes sense to have the default
locale set to the country of origin, which in our case is the United
States.

Yes, it may irk those whose native locale is not en_US.UTF-8, but like
I said, no one will be happy. Except for those whose native locale
happens to be the default.

Start at a default, doesn't really matter which as long as the default
is the lingua franca of international business, and instruct the user,
as we already do, how to change it during the setup.

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email : titanofold [at] gentoo
GnuPG FP : 2C00 7719 4F85 FB07 A49C 0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0


mgorny at gentoo

Jul 30, 2012, 9:28 AM

Post #24 of 49 (800 views)
Permalink
Re: UTF-8 locale by default [In reply to]

On Mon, 30 Jul 2012 10:50:29 -0400
Michael Orlitzky <michael [at] orlitzky> wrote:

> On 07/30/12 10:41, Michał Górny wrote:
> > On Mon, 30 Jul 2012 10:35:36 -0400
> > Michael Orlitzky <michael [at] orlitzky> wrote:
> >
> >> On 07/27/12 16:16, Aaron W. Swenson wrote:
> >>>
> >>> No user will be happy with whatever we decide to use as a default.
> >>
> >> The defaults should be what's best for the most people, with a bias
> >> towards safety. Why don't we just take a survey and choose the most
> >> common utf8 response?
> >
> > How can you take a survey like that? How will you ensure it actually
> > hits the majority? How will you define the majority?
> >
>
> Considering that the alternative is to force everyone to change it
> manually, you can do it however you want and it'll be an improvement.

My point here is that you want this to change, so you first tried to
convince people here. We practically did a small survey here, and as a
result we didn't agree on making the change.

So you're saying we should do another survey on another group, hoping
that this time the result will be on your side.

> 1) Create a webpage with a bunch of options, count the results
>
> 2) Ask the g.o mailing lists, count responses manually
>
> 3) Use google docs like the website survey that went out a few days
> ago
>
> It won't hit everyone, but no survey ever does. As long as you get a
> large enough unbiased sample, it doesn't matter. And anything would be
> an improvement, so it doesn't matter anyway.

It depends on who the 'unbiased sample' is. Are you interested only in
the opinion of Gentoo users who visit the website? Who sync once a day?
Once a week? Who follow Gentoo Planet? Who participate in the forums?

We can create the survey and announce it everywhere. But it still won't
catch many old-time Gentoo users who may actually have something
contrary to say. It won't be unbiased.

--
Best regards,
Michał Górny


mikemol at gmail

Jul 30, 2012, 9:57 AM

Post #25 of 49 (804 views)
Permalink
Re: UTF-8 locale by default [In reply to]

On Mon, Jul 30, 2012 at 12:28 PM, Michał Górny <mgorny [at] gentoo> wrote:
> On Mon, 30 Jul 2012 10:50:29 -0400
> Michael Orlitzky <michael [at] orlitzky> wrote:
>
>> On 07/30/12 10:41, Michał Górny wrote:
>> > On Mon, 30 Jul 2012 10:35:36 -0400
>> > Michael Orlitzky <michael [at] orlitzky> wrote:
>> >
>> >> On 07/27/12 16:16, Aaron W. Swenson wrote:
>> >>>
>> >>> No user will be happy with whatever we decide to use as a default.
>> >>
>> >> The defaults should be what's best for the most people, with a bias
>> >> towards safety. Why don't we just take a survey and choose the most
>> >> common utf8 response?
>> >
>> > How can you take a survey like that? How will you ensure it actually
>> > hits the majority? How will you define the majority?
>> >
>>
>> Considering that the alternative is to force everyone to change it
>> manually, you can do it however you want and it'll be an improvement.
>
> My point here is that you want this to change, so you first tried to
> convince people here. We practically did a small survey here, and as a
> result we didn't agree on making the change.
>
> So you're saying we should do another survey on another group, hoping
> that this time the result will be on your side.
>
>> 1) Create a webpage with a bunch of options, count the results
>>
>> 2) Ask the g.o mailing lists, count responses manually
>>
>> 3) Use google docs like the website survey that went out a few days
>> ago
>>
>> It won't hit everyone, but no survey ever does. As long as you get a
>> large enough unbiased sample, it doesn't matter. And anything would be
>> an improvement, so it doesn't matter anyway.
>
> It depends on who the 'unbiased sample' is. Are you interested only in
> the opinion of Gentoo users who visit the website? Who sync once a day?
> Once a week? Who follow Gentoo Planet? Who participate in the forums?
>
> We can create the survey and announce it everywhere. But it still won't
> catch many old-time Gentoo users who may actually have something
> contrary to say. It won't be unbiased.

I was thinking about this, and I suspect that a survey period of 1-2
months is likely fine. It should also be enough to scoop up people who
run servers and monitor those servers for security updates.

--
:wq
