Bug 610081 – gdm could sort language names to separate latin names from non-latin names.



Summary:	gdm could sort language names to separate latin names from non-latin names.


Status:	RESOLVED FIXED

Product:	gdm
Classification:	Core
Component:	general
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	GDM maintainers
QA Contact:	GDM maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2010-02-16 09:54 UTC by Takao Fujiwara
Modified:	2010-09-16 00:59 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Patch for gdm-chooser-widget.c (3.88 KB, patch) 2010-02-16 09:58 UTC, Takao Fujiwara	none
Patch for gdm-chooser-widget.c (5.17 KB, patch) 2010-02-18 10:05 UTC, Takao Fujiwara	none
Patch for gdm-chooser-widget.c (5.48 KB, patch) 2010-02-24 05:50 UTC, Takao Fujiwara	committed

Description Takao Fujiwara 2010-02-16 09:54:56 UTC

Currently gdm uses g_utf8_collate() to sort language names.
g_utf8_collate() uses wcscoll() or strcoll() internally.

Using wcscoll() or strcoll() directly on multilingual strings causes a problem.

E.g. collation on ja_JP.UTF-8.
U+D55C (Korean) < U+0041 (Albanian) < U+65E5 (Japanese) < U+00CD (Icelandic).

So the current collation shows 'non-Latin < ASCII < non-Latin < Latin' on ja_JP.UTF-8.
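The ordering can be checked outside gdm with a small sketch like the one below. It sorts a few language names with strcoll(), which the description says g_utf8_collate() relies on. The function names `compare_with_strcoll` and `sort_names_by_collation` and the sample names are illustrative and are not taken from gdm; the resulting order depends entirely on the active LC_COLLATE locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() callback: compare two strings with the collation rules of the
 * current LC_COLLATE locale. */
static int compare_with_strcoll(const void *a, const void *b)
{
    const char *const *lhs = a;
    const char *const *rhs = b;
    return strcoll(*lhs, *rhs);
}

/* Hypothetical helper (not part of gdm): sort an array of UTF-8 names
 * in place using strcoll(). */
void sort_names_by_collation(const char **names, size_t count)
{
    assert(names != NULL || count == 0);
    if (count > 0)
        qsort(names, count, sizeof names[0], compare_with_strcoll);
}
```

Calling setlocale(LC_COLLATE, "") before sorting makes strcoll() follow the locale from the environment; without that call the program stays in the "C" locale, where strcoll() behaves like strcmp().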

At first I tried to fix this in glibc, and I found that ja_JP.UTF-8 defines the collation itself:
http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=localedata/locales/ja_JP;hb=HEAD
U+00CD is included in the ja_JP collation, and the defined order is U+0041 < U+65E5 < U+00CD.

The ja_JP collation is defined in terms of character sets rather than human-readable groupings.
I think this probably will not be changed in glibc because it follows the Japanese specification:
http://www.linux.or.jp/JF/JFdocs/Japanese-Locale-Policy.txt

I think the 'non-Latin < ASCII < non-Latin < Latin' ordering does not look good in the GDM GUI.
I'd like to sort Latin names separately from non-Latin names so that the multilingual list stays human-readable on CJK locales.

I think we still want to use wcscoll() or strcoll() to sort Latin characters, e.g. 'D' and the precomposed 'Á' (U+00C1).

Pre-sorting the language names as 'Latin < non-Latin' should probably cause no problems on any locale.
I'm attaching a patch.
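A minimal sketch of the 'Latin < non-Latin' idea follows. It is not the attached patch: the names `next_code_point`, `is_latin_only` and `latin_first_compare` are hypothetical, and treating every code point below U+0250 (Basic Latin through Latin Extended-B) as "Latin" is an assumption made for illustration. Names in the same group are still compared with strcoll(), so locale-specific ordering of Latin characters is kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one UTF-8 sequence at *s and advance *s past it.
 * Malformed input yields U+FFFD, which counts as non-Latin below. */
static uint32_t next_code_point(const unsigned char **s)
{
    const unsigned char *p = *s;
    uint32_t cp;
    int extra;

    if (p[0] < 0x80)                { cp = p[0];        extra = 0; }
    else if ((p[0] & 0xE0) == 0xC0) { cp = p[0] & 0x1F; extra = 1; }
    else if ((p[0] & 0xF0) == 0xE0) { cp = p[0] & 0x0F; extra = 2; }
    else if ((p[0] & 0xF8) == 0xF0) { cp = p[0] & 0x07; extra = 3; }
    else { *s = p + 1; return 0xFFFD; }

    p++;
    for (int i = 0; i < extra; i++) {
        /* Stops safely at a NUL or any non-continuation byte. */
        if ((*p & 0xC0) != 0x80) { *s = p; return 0xFFFD; }
        cp = (cp << 6) | (uint32_t) (*p & 0x3F);
        p++;
    }
    *s = p;
    return cp;
}

/* Assumption: "Latin" means every code point is below U+0250. */
static bool is_latin_only(const char *name)
{
    const unsigned char *p = (const unsigned char *) name;

    while (*p != '\0') {
        if (next_code_point(&p) >= 0x0250)
            return false;
    }
    return true;
}

/* Latin-only names sort before all other names; names in the same
 * group fall back to the locale's collation via strcoll(). */
int latin_first_compare(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    bool a_latin = is_latin_only(a);
    bool b_latin = is_latin_only(b);

    if (a_latin != b_latin)
        return a_latin ? -1 : 1;
    return strcoll(a, b);
}
```

With this comparator, "Albanian" and "Íslenska" both land ahead of "日本語" and "한국어" regardless of how the active locale interleaves them.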

Comment 1 Takao Fujiwara 2010-02-16 09:58:57 UTC

Created attachment 153905
Patch for gdm-chooser-widget.c

The patch separates Latin-only strings from the others.

Comment 2 Takao Fujiwara 2010-02-18 10:05:07 UTC

Created attachment 154115
Patch for gdm-chooser-widget.c

Revised the patch.
This sorts the language names as Latin < non-Latin (multi-byte) while keeping a human-readable collation.

Another idea might be to use en_US.UTF-8 collation for all locales.
However, I think some European locales collate Latin characters differently.

Comment 3 Takao Fujiwara 2010-02-24 05:50:54 UTC

Created attachment 154570
Patch for gdm-chooser-widget.c

Revised the patch to use g_strdup() for the setlocale() result.
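The reason for copying is that the string returned by setlocale() may be overwritten by a later setlocale() call, so it has to be duplicated before the locale is switched. The sketch below shows that save, switch and restore pattern in plain C; `collate_in_locale` is a hypothetical helper, not code from the patch, and it uses malloc()/strcpy() where GLib code would use g_strdup() and g_free().

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compare a and b with the collation rules of
 * collate_locale, then restore the previous LC_COLLATE setting.
 * If collate_locale is unavailable, the current locale is used. */
int collate_in_locale(const char *collate_locale, const char *a, const char *b)
{
    assert(collate_locale != NULL && a != NULL && b != NULL);

    /* Query the current setting and copy it: the returned buffer may
     * be overwritten by the next setlocale() call. */
    const char *current = setlocale(LC_COLLATE, NULL);
    char *saved = NULL;

    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    /* A failed switch returns NULL and leaves the locale unchanged. */
    setlocale(LC_COLLATE, collate_locale);
    int result = strcoll(a, b);

    if (saved != NULL) {
        setlocale(LC_COLLATE, saved);
        free(saved);
    }
    return result;
}
```

Note that setlocale() changes process-wide state, so a helper like this is only safe when no other thread depends on the locale during the call.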

Comment 4 Ray Strode [halfline] 2010-09-15 19:15:43 UTC

Comment on attachment 154570
Patch for gdm-chooser-widget.c

thanks.

Comment 5 Takao Fujiwara 2010-09-16 00:59:11 UTC

Thanks for your review.