GNOME Bugzilla – Bug 618108
explode_locale and the awesome tt_RU@iqtelif.UTF-8 identifier
Last modified: 2010-05-09 10:40:16 UTC
Created attachment 160573
suggested replacement

An identifier like the awesome

export LANG=tt_RU@iqtelif.UTF-8

confuses the current explode_locale implementation into...

language = "tt"
territory = "_RU@iqtelif"
codeset = ".UTF-8"
modifier = NULL

but this should be split as...

language = "tt"
territory = "_RU"
codeset = NULL
modifier = "@iqtelif.UTF-8"

i.e. we can't just assume that the first . or _ will always denote the beginning of the codeset or territory.
> export LANG=tt_RU@iqtelif.UTF-8

This may be awesome, but it is not what the function parses. From 'info libc':

  There is one more or less standardized form, originally from the X/Open specification: `language[_territory[.codeset]][@modifier]'

And in fact, tt_RU.UTF-8@iqtelif works much better.
The issue is not whether tt_RU.UTF-8@iqtelif is a better identifier; it clearly is (see https://bugzilla.redhat.com/show_bug.cgi?id=589138 for the attempt to get that fixed).

The issue is that if the optional [_territory] or [.codeset] is not present in the identifier, but the optional [@modifier] is present, and that effectively free-form modifier happens to contain a . or _, then the parse runs on into the modifier and treats part of it as a territory or codeset. Forget about the actual "UTF-8" chunk of text and take something like ga_IE@roman.4thcentury instead: the same mis-split happens, with ".4thcentury" taken for the codeset.

As an aside, POSIX:2008 gives the subtly different

language[_territory][.codeset][@modifier]

rather than

language[_territory[.codeset]][@modifier]

i.e. a codeset may be present even if a territory is not, but that doesn't materially change much.
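For illustration only, here is a minimal sketch of one way to get the split asked for in this report: locate the first @ before anything else, and only look for . and _ in the text in front of it. This is not the attached patch and not the GLib code. The names split_locale, locale_parts and PART_MAX are hypothetical, the fixed-size buffers are a simplification, and an absent component is reported as an empty string rather than NULL. As in the report, each component keeps its leading delimiter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PART_MAX 64  /* hypothetical limit; longer components are truncated */

/* Hypothetical result type. An empty string means "component absent". */
typedef struct {
    char language[PART_MAX];
    char territory[PART_MAX];  /* includes the leading '_' */
    char codeset[PART_MAX];    /* includes the leading '.' */
    char modifier[PART_MAX];   /* includes the leading '@' */
} locale_parts;

/* Copy [begin, end) into dst, truncating to PART_MAX - 1 characters. */
static void copy_span(char *dst, const char *begin, const char *end)
{
    size_t len = (size_t)(end - begin);
    if (len >= PART_MAX)
        len = PART_MAX - 1;
    memcpy(dst, begin, len);
    dst[len] = '\0';
}

/* Find c in [begin, end) only; returns NULL if it is not there. */
static const char *find_before(const char *begin, const char *end, char c)
{
    return memchr(begin, c, (size_t)(end - begin));
}

void split_locale(const char *locale, locale_parts *out)
{
    assert(locale != NULL && out != NULL);

    const char *end = locale + strlen(locale);

    /* The modifier is free-form, so cut it off first. */
    const char *at = strchr(locale, '@');
    const char *head_end = at ? at : end;

    /* Only look for '.' in front of the modifier. */
    const char *dot = find_before(locale, head_end, '.');
    const char *terr_end = dot ? dot : head_end;

    /* Only look for '_' in front of the codeset, which may itself contain '_'. */
    const char *uscore = find_before(locale, terr_end, '_');
    const char *lang_end = uscore ? uscore : terr_end;

    copy_span(out->language, locale, lang_end);
    copy_span(out->territory, lang_end, terr_end);
    copy_span(out->codeset, terr_end, head_end);
    copy_span(out->modifier, head_end, end);
}
```

Because the . is searched independently of the _, this sketch also accepts the POSIX:2008 shape where a codeset follows the language directly, e.g. de.UTF-8.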