Bug 618108 – explode_locale and the awesome tt_RU@iqtelif.UTF-8 identifier




Summary:	explode_locale and the awesome tt_RU@iqtelif.UTF-8 identifier


Status:	RESOLVED NOTABUG

Product:	glib
Classification:	Platform
Component:	general
Version:	2.25.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gtkdev
QA Contact:	gtkdev

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2010-05-08 13:25 UTC by Caolan McNamara
Modified:	2010-05-09 10:40 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
suggested replacement (1.91 KB, patch) 2010-05-08 13:25 UTC, Caolan McNamara

Description Caolan McNamara 2010-05-08 13:25:14 UTC

Created attachment 160573
suggested replacement

an identifier like the awesome

export LANG=tt_RU@iqtelif.UTF-8

confuses the current explode_locale implementation, which splits it into...

language = "tt" 
territory = "_RU@iqtelif" 
codeset = ".UTF-8"
modifier = NULL

but this should be split as...

language = "tt" 
territory = "_RU"
codeset = NULL
modifier = "@iqtelif.UTF-8"

i.e. the parser can't just assume that the first . or _ will always denote the beginning of the codeset or territory; either character may also occur inside the modifier.
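
A minimal sketch of the split requested above, assuming the modifier is cut off at the first @ before looking for . or _. This is not the attached patch and not GLib's code; locale_parts, copy_span, split_locale_modifier_first and free_locale_parts are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the four parts; NULL means "absent".
 * Each part keeps its leading delimiter, as in the report. */
typedef struct {
    char *language;
    char *territory;  /* e.g. "_RU" */
    char *codeset;    /* e.g. ".UTF-8" */
    char *modifier;   /* e.g. "@iqtelif.UTF-8" */
} locale_parts;

/* Copy n bytes of s into a freshly allocated NUL-terminated string. */
char *copy_span(const char *s, size_t n)
{
    char *out = malloc(n + 1);
    assert(out != NULL);
    memcpy(out, s, n);
    out[n] = '\0';
    return out;
}

/* Split off the modifier first, so that any '.' or '_' inside it is
 * never mistaken for a codeset or territory delimiter. */
locale_parts split_locale_modifier_first(const char *locale)
{
    locale_parts p = { NULL, NULL, NULL, NULL };
    assert(locale != NULL);

    const char *end = locale + strlen(locale);

    const char *at = strchr(locale, '@');
    if (at != NULL) {
        p.modifier = copy_span(at, (size_t)(end - at));
        end = at;  /* from here on, only scan what precedes '@' */
    }

    /* memchr bounds the search to [locale, end). */
    const char *dot = memchr(locale, '.', (size_t)(end - locale));
    if (dot != NULL) {
        p.codeset = copy_span(dot, (size_t)(end - dot));
        end = dot;
    }

    const char *uscore = memchr(locale, '_', (size_t)(end - locale));
    if (uscore != NULL) {
        p.territory = copy_span(uscore, (size_t)(end - uscore));
        end = uscore;
    }

    p.language = copy_span(locale, (size_t)(end - locale));
    return p;
}

/* Release everything allocated by split_locale_modifier_first. */
void free_locale_parts(locale_parts *p)
{
    free(p->language);
    free(p->territory);
    free(p->codeset);
    free(p->modifier);
}
```

Under these assumptions, tt_RU@iqtelif.UTF-8 comes out as language "tt", territory "_RU", no codeset and modifier "@iqtelif.UTF-8", while tt_RU.UTF-8@iqtelif comes out with codeset ".UTF-8" and modifier "@iqtelif".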

Comment 1 Matthias Clasen 2010-05-09 00:56:58 UTC

export LANG=tt_RU@iqtelif.UTF-8

this may be awesome, but it is not what the function parses.
From 'info libc':

There is one more or less standardized form, originally from the X/Open
specification:

   `language[_territory[.codeset]][@modifier]'


and in fact, tt_RU.UTF-8@iqtelif works much better.

Comment 2 Caolan McNamara 2010-05-09 10:40:16 UTC

The issue is not whether tt_RU.UTF-8@iqtelif is a better identifier; it clearly is (see https://bugzilla.redhat.com/show_bug.cgi?id=589138 for an attempt to get that fixed). The issue is that if the optional [_territory] or [.codeset] is not present in the identifier, but the optional [@modifier] is present, and that effectively free-form modifier happens to contain a . or _, then the parse will consume into the modifier. That is, forget about the actual "UTF-8" chunk of text and take instead something like ga_IE@roman.4thcentury, where no codeset is present at all but the modifier contains a .
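
To make the failure concrete, here is a hypothetical forward scan of the kind described in this report: each delimiter search starts where the previous delimiter was found. It is a sketch under that assumption, not a copy of GLib's explode_locale; span, naive_parts, naive_forward_split and span_equals are illustrative names.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type: a slice of the input string.
 * start == NULL means the component is absent. */
typedef struct {
    const char *start;
    size_t len;
} span;

typedef struct {
    span language, territory, codeset, modifier;
} naive_parts;

/* Forward scan: look for '_', then '.' from there, then '@' from there.
 * A '.' that really belongs to the modifier is found before the '@'
 * search begins, so the '@' is never seen. */
naive_parts naive_forward_split(const char *locale)
{
    naive_parts p = { {NULL, 0}, {NULL, 0}, {NULL, 0}, {NULL, 0} };
    assert(locale != NULL);

    const char *end = locale + strlen(locale);
    const char *uscore = strchr(locale, '_');
    const char *dot = strchr(uscore ? uscore : locale, '.');
    const char *at = strchr(dot ? dot : (uscore ? uscore : locale), '@');

    if (at != NULL) {
        p.modifier = (span){ at, (size_t)(end - at) };
        end = at;
    }
    if (dot != NULL) {
        p.codeset = (span){ dot, (size_t)(end - dot) };
        end = dot;
    }
    if (uscore != NULL) {
        p.territory = (span){ uscore, (size_t)(end - uscore) };
        end = uscore;
    }
    p.language = (span){ locale, (size_t)(end - locale) };
    return p;
}

/* Return 1 if the span is present and equal to text, otherwise 0. */
int span_equals(span s, const char *text)
{
    return s.start != NULL
        && s.len == strlen(text)
        && memcmp(s.start, text, s.len) == 0;
}
```

With this scan, ga_IE@roman.4thcentury is reported as territory "_IE@roman" and codeset ".4thcentury" with no modifier, even though the identifier contains no codeset at all.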

As an aside, POSIX:2008 gives a subtly different language[_territory][.codeset][@modifier] rather than language[_territory[.codeset]][@modifier], i.e. a codeset may be present even if a territory is not, but that doesn't materially change much.