Bug 89452 – remove ability to login in C locale

Summary:	remove ability to login in C locale


Status:	RESOLVED FIXED

Product:	gdm
Classification:	Core
Component:	general
Version:	2.4.0.x
Hardware:	Other Linux

Importance:	Normal minor
Target Milestone:	---
Assigned To:	GDM maintainers
QA Contact:	GDM maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2002-07-30 19:57 UTC by Havoc Pennington
Modified:	2007-05-09 04:36 UTC

See Also:
GNOME target:	---
GNOME version:	2.0

Description Havoc Pennington 2002-07-30 19:57:39 UTC

I don't know what this does or why ;-) Owen probably does.

--- gdm-2.4.0.0/gui/gdmlanguages.c.clocale      Tue Jul 16 17:06:07 2002
+++ gdm-2.4.0.0/gui/gdmlanguages.c      Tue Jul 16 17:06:28 2002
@@ -397,9 +397,6 @@
        g_hash_table_foreach (dupcheck, (GHFunc) g_free, NULL);
        g_hash_table_destroy (dupcheck);
 
-       if ( ! got_C)
-               langs = g_list_prepend (langs, g_strdup ("C"));
-
        curlocale = setlocale (LC_MESSAGES, NULL);
        if (curlocale != NULL &&
            strcmp (curlocale, "C") != 0 &&

Comment 1 George Lebl 2002-07-30 20:13:13 UTC

Hmmm, this is an evil evil patch (borrowing some Bush terminology).
That patch is in there so that one can login with the 'C' locale. 
This is a failsafe really in case the locale alias file is screwed up,
so that you can still log in and not be utterly confused by some
strange language.

Comment 2 Owen Taylor 2002-07-30 20:15:55 UTC

POSIX/C English is not a valid locale for any practical purposes,
and offering it allows people to select a locale where
many things (fonts, sorting, etc.) will work incorrectly.

Please consider adding en_US instead, if you think a fallback
locale is necessary. If the system is so screwed up as not
to have en_US, then it will fall back to the "C" locale
anyway.

Comment 3 Luis Villa 2002-07-30 20:27:14 UTC

Retitling (this was discussed on a list lately and I'd like it to be
easily findable). I'd add 'PATCH' except I get the impression this
doesn't really do what George wants :)

Comment 4 George Lebl 2002-07-30 20:47:33 UTC

Hmmm ... I suppose a fallback to en_US is a good idea .. I'll go
change that right now

Comment 5 George Lebl 2002-07-30 21:06:08 UTC

Just fixed in CVS

Comment 6 Loïc Minier 2007-05-08 10:03:56 UTC

(In reply to comment #2)
> POSIX/C English is not a valid locale for any practical purposes,
> and offering it allows people to select a locale where
> many things (fonts, sorting, etc.) will work incorrectly.

Could you clarify why C isn't a valid locale?  I certainly think it is, and it's the only locale which we can assume to always be there.

Yes, it will cause problems, and there might be some 8-bit issues in shells for example, but this only matters when your system is seriously broken and you still want to log in to fix things.

> Please consider adding en_US instead; if you think a fallback 
> locale is necessary. If the system is so screwed up not
> to have en_US, then it will fallback to the "C" locale
> anyways.

en_US is certainly not available on all systems. In fact, many correctly installed systems will only have a single local locale (sigh), plus support for C/POSIX obviously; and on Debian you have to pick one of the three en_US variants.

I think setting en_US if it isn't available can cause serious problems such as Debian bug http://bugs.debian.org/51846 or http://bugs.debian.org/52321.

I'm not sure whether gdm correctly checks whether en_US is truly available, but I'd prefer simply defaulting to C properly.
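
To make the "is en_US truly available" question concrete, here is a minimal sketch of one way to probe for a usable locale before exporting it. It is not GDM's code: `pick_available_locale` is a hypothetical helper, and it relies only on the standard behaviour that `setlocale` returns NULL when it cannot honor the requested name. The process locale is saved and restored around the probe.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of GDM): return the first candidate that
 * setlocale() accepts, or "C" if none of them is usable.  The caller's
 * locale is restored before returning. */
static const char *
pick_available_locale (const char *const *candidates, size_t n_candidates)
{
    const char *current = setlocale (LC_ALL, NULL);
    const char *chosen = "C";   /* always-present last resort */
    char *saved = NULL;
    size_t i;

    /* Copy the current locale string: a later setlocale() call may
     * overwrite the buffer that `current` points to. */
    if (current != NULL) {
        saved = malloc (strlen (current) + 1);
        if (saved != NULL)
            strcpy (saved, current);
    }

    for (i = 0; i < n_candidates; i++) {
        if (setlocale (LC_ALL, candidates[i]) != NULL) {
            chosen = candidates[i];
            break;
        }
    }

    /* Put the process locale back the way we found it. */
    if (saved != NULL) {
        setlocale (LC_ALL, saved);
        free (saved);
    }

    return chosen;
}
```

A caller could pass a list such as "en_US.UTF-8", "en_US" and only export the returned name as LANG, so that an unavailable en_US degrades to C rather than to a bogus value.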

Comment 7 Owen Taylor 2007-05-08 11:03:50 UTC

- Clearly, setting LANG to the literal string "(null)" is wrong, 
  which is what your bugs seem to be about. That has little or
  nothing to do with this bug. Leaving it unset would be identical
  to setting it to "C".

- Recovery support for a screwed-up system can't be allowed to compromise
  the user interface for the normal case:

  There's a difference between always offering people the choice of
  the C locale, which is what GDM used to do (if I remember correctly, five
  years later), and falling back to it on a screwed-up system. Maybe it
  would be reasonable to make the C locale the only choice in the list 
  if there were *no* valid locales found, but I'm not sure that situation
  is worth much effort.

- I don't want to enumerate all the ways that C is a broken locale,
  but two major ones:

   - The charset for the C locale is ASCII. The libc printf will 
     actually turn characters > 128 into ? in some cases in the C
     locale, turning, say "Loïc" into garbage. en_US will typically
     be ISO-8859-1, so will pass characters > 128 cleanly, even 
     if the text is actually in UTF-8.

   - Collation in the C locale is in codepoint order, so even 
     raw ASCII text doesn't sort correctly. ("Zebra" sorts before
     "apple".) Nautilus will list files in the wrong order,
     "make check" for GLib fails, etc.

- I'm not sure what you mean by "the three en_US". en_US.UTF-8 is 
  generally a much better locale than en_US for modern purposes if
  there is a choice, but in a fallback situation, just setting
  to "en_US" and letting the system pick the encoding available
  should be fine.
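
The collation point above can be checked with a few lines of C. This is only an illustrative sketch: `sorts_before` is a hypothetical helper, and it changes the process-wide LC_COLLATE setting as a side effect. In the C locale `strcoll` compares by byte value, the same as `strcmp`, so an uppercase ASCII letter always sorts ahead of a lowercase one.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `a` sorts strictly before `b` under
 * the named collation locale, 0 if it does not, and -1 if that locale is
 * not available.  Note: this changes the process-wide LC_COLLATE. */
static int
sorts_before (const char *locale_name, const char *a, const char *b)
{
    if (setlocale (LC_COLLATE, locale_name) == NULL)
        return -1;

    return strcoll (a, b) < 0;
}
```

With this, `sorts_before ("C", "Zebra", "apple")` returns 1, because 'Z' (0x5A) is less than 'a' (0x61). Under a locale such as en_US, where one is installed, the comparison is expected to follow dictionary order instead.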

Comment 8 Loïc Minier 2007-05-08 12:42:52 UTC

(In reply to comment #7)
> - Clearly, setting LANG to the literal string "(null)" is wrong, 
>   which is what your bugs seem to be about. That has little or
>   nothing to do with this bug. Leaving it unset would be identical
>   to setting it to "C".

(The reason I'm commenting here is that I'm triaging very old changes found in the Debian package, one of them being to change "en_US" into "C" which seems to have been justified by the Debian bugs I pointed at.)

> - Recovery support for screwed up system can't be allowed to compromise
>   the user interface for the normal case:
> 
>   There's a difference between always offering the people the choice of
>   the C locale, which was GDM used to do (if I remember correctly, five
>   years later), and falling back to it on a screwed up system. Maybe it 
>   would be reasonable to make the C locale the only choice in the list 
>   if there were *no* valid locales found, but I'm not sure that situation
>   is worth much effort.

Sounds good.

> - I don't want to enumerate all the ways that C is a broken locale,
>   but two major ones:
> 
>    - The charset for the C locale is ASCII. The libc printf will 
>      actually turn characters > 128 into ? in some cases in the C
>      locale, turning, say "Loïc" into garbage. en_US will typically
>      be ISO-8859-1, so will pass characters > 128 cleanly, even 
>      if the text is actually in UTF-8.
> 
>    - Collation in the C locale is in codepoint order, so even 
>      raw ASCII text doesn't sort correctly. ("Zebra" sorts before
>      "apple".) Nautilus will list files in the wrong order,
>      "make check" for GLib fails, etc.

Sure, I'm aware of these, but we're talking about falling back on a broken system, and en_US is no more guaranteed to be available than fr_FR.

> - I'm not sure what you mean by "the three en_US". en_US.UTF-8 is 
>   generally a much better locale than en_US for modern purposes if
>   there is a choice, but in a fallback situation, just setting
>   to "en_US" and letting the system pick the encoding available
>   should be fine.

I mentioned the three en_US because I have all three available on one of my Debian systems:
en_US ISO-8859-1
en_US.ISO-8859-15 ISO-8859-15
en_US.UTF-8 UTF-8

But sure, "en_US" will work and pick one of these, but we might come back to encoding problems with some of them too.


So, to sum up:
- I still think en_US should not be special-cased, and that C should be used instead
- it's a nice idea to offer C as a choice if nothing else is available, but as you said: it's probably used by default anyway

(And it's probably quite hard to tell how many of the bugs I mentioned remain valid 7 years later.)

Comment 9 Brian Cameron 2007-05-09 04:36:05 UTC

Note related bug 436811.