GNOME Bugzilla – Bug 638874
sorting as locale foo is broken
Last modified: 2011-01-09 21:57:12 UTC
I am trying to sort words written in Turkish, with the sort dialog's Locale option set to Turkey (tr_TR). However, Turkish characters are sorted after all English characters. My system's locale is en_US.UTF-8.
To reproduce, sort these characters: ç a b c d e
Expected outcome: a b c ç d e
Actual outcome: a b c d e ç
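For reference, the reported difference can be checked outside the spreadsheet with plain libc collation. The following is a minimal sketch, not code from Gnumeric or GOffice; the helper names `cmp_collate` and `sort_in_locale` are made up for illustration, and the sketch assumes the requested locale (for example "tr_TR.UTF-8") is installed on the system.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: compare two C strings with the current LC_COLLATE rules. */
static int cmp_collate(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    return strcoll(sa, sb);
}

/*
 * Hypothetical helper: sort `words` in place using the collation rules of
 * `locale_name`. Returns 0 on success, or -1 if the locale is not available
 * (the array is then left untouched). The previous LC_COLLATE setting is
 * restored before returning.
 */
static int sort_in_locale(const char **words, size_t n, const char *locale_name)
{
    const char *current;
    char *saved;

    assert(words != NULL && locale_name != NULL);

    /* Copy the current setting: later setlocale calls may overwrite it. */
    current = setlocale(LC_COLLATE, NULL);
    if (current == NULL)
        return -1;
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, current);

    if (setlocale(LC_COLLATE, locale_name) == NULL) {
        free(saved);
        return -1;
    }
    qsort(words, n, sizeof *words, cmp_collate);

    setlocale(LC_COLLATE, saved);
    free(saved);
    return 0;
}
```

Called with "C", this places the UTF-8 bytes of ç after e, which matches the "actual outcome" above. Called with an installed Turkish locale, it should use that locale's collation rules instead.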
This looks like a caching problem to me. I sort the three letters a, U+00E7 (ç) and e, once in C and once in en_CA. Whichever locale I use first sorts correctly, but the second one does not. en_CA should result in a, U+00E7, e and C should result in a, e, U+00E7.
go_string_cmp* as called by value_cmp will cache the collate key.
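To illustrate why a cached collate key goes stale, here is a toy model. It is only a sketch under loud assumptions: `ToyLocale`, `ToyString` and `toy_string_cmp` are invented names, not the GOString API, and the two "locales" are stand-ins (byte order versus case-folded order) rather than real C and en_CA collation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the process locale: only two collation rules exist. */
typedef enum { TOY_LOCALE_C, TOY_LOCALE_NOCASE } ToyLocale;

static ToyLocale toy_current_locale = TOY_LOCALE_C;

/* Hypothetical string object that remembers its collation key. */
typedef struct {
    const char *str;
    char key[32];
    int has_key;   /* nonzero once key[] has been filled in */
} ToyString;

/* Build the collation key for s under the locale that is active right now. */
static void toy_make_key(const char *s, char *key, size_t size)
{
    size_t i;

    assert(size > 0);
    for (i = 0; s[i] != '\0' && i + 1 < size; i++) {
        if (toy_current_locale == TOY_LOCALE_NOCASE)
            key[i] = (char)tolower((unsigned char)s[i]);
        else
            key[i] = s[i];
    }
    key[i] = '\0';
}

/* Compare via cached keys; a key is computed once and never refreshed. */
static int toy_string_cmp(ToyString *a, ToyString *b)
{
    if (!a->has_key) {
        toy_make_key(a->str, a->key, sizeof a->key);
        a->has_key = 1;
    }
    if (!b->has_key) {
        toy_make_key(b->str, b->key, sizeof b->key);
        b->has_key = 1;
    }
    return strcmp(a->key, b->key);
}
```

In this model, comparing "a" and "B" first under TOY_LOCALE_C fixes their keys. Switching `toy_current_locale` to TOY_LOCALE_NOCASE afterwards has no effect on those two objects, so they keep comparing in the old order, which mirrors the "whatever I use first works" behaviour described above.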
Might be a duplicate of #631504. I can't actually reproduce with current HEAD. I obtain the anticipated order both in fr and en locales.
Jean, sorting is the same in all fr locales and in at least some en locales, so you would not see a caching effect between them. For me the caching effect is visible in en_CA versus C.
It's not a duplicate of bug 631504. It's related in a weak sense, though.
So currently if I sort a range using a non-default locale then all comparisons of strings in that range may be incorrect for the remainder of the session. This sounds like the potential for a nightmare. If this is not fixed by the next release we need to disable the locale selector in the sort dialog.
I see two solutions: 1. use the cache only for the default locale (the one active when go_string_init() is called); 2. break the GOString API and implement a per-locale cache mechanism.
3. Have go_setlocale clear all locale-dependent caches. (That might make it quite a heavy operation; I'm not sure I like that.) 4. Don't use go_string_cmp/go_string_cmp_ignorecase in value_compare when a locale has been set. That would require an extra argument.
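As a sketch of how option 3 could avoid walking every cached string, one variant is a generation counter: the locale setter bumps a counter and each string lazily recomputes its key when its stored generation is out of date. This is a swapped-in technique for illustration, not what go_setlocale does; all names below (`toy_setlocale`, `ToyString`, and so on) are hypothetical and the two "locales" are toy stand-ins.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the process locale: only two collation rules exist. */
typedef enum { TOY_LOCALE_C, TOY_LOCALE_NOCASE } ToyLocale;

static ToyLocale toy_current_locale = TOY_LOCALE_C;
static unsigned toy_locale_generation = 1;   /* bumped on every locale switch */

/* Hypothetical string object with a generation-stamped cached key. */
typedef struct {
    const char *str;
    char key[32];
    unsigned key_generation;   /* 0 means "no key cached yet" */
} ToyString;

/* Hypothetical locale setter: cheap, it only invalidates keys implicitly. */
static void toy_setlocale(ToyLocale locale)
{
    toy_current_locale = locale;
    toy_locale_generation++;
}

/* Return the key for s, rebuilding it if it predates the last locale switch. */
static const char *toy_key(ToyString *s)
{
    size_t i;

    assert(s != NULL && s->str != NULL);
    if (s->key_generation != toy_locale_generation) {
        for (i = 0; s->str[i] != '\0' && i + 1 < sizeof s->key; i++) {
            if (toy_current_locale == TOY_LOCALE_NOCASE)
                s->key[i] = (char)tolower((unsigned char)s->str[i]);
            else
                s->key[i] = s->str[i];
        }
        s->key[i] = '\0';
        s->key_generation = toy_locale_generation;
    }
    return s->key;
}

/* Compare two strings under the current toy locale using fresh-enough keys. */
static int toy_string_cmp(ToyString *a, ToyString *b)
{
    return strcmp(toy_key(a), toy_key(b));
}
```

The trade-off in this sketch is that switching locale is O(1), but every string touched after a switch pays for one key rebuild, including after switching back to the default locale.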
I like #4 the best. #1 would create an unnecessary performance hit for most sorts (in the default locale), since it would require locale checks. #2 is unnecessarily complicated, since most sorts will be in the default locale. #3 appears to be overkill.
Created attachment 177861: proposed patch
Proposed patch to fix the caching issue. For locales other than the current locale, the patch does not cache the collation keys.
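The behaviour described for the patch can be sketched roughly as follows. This is not the attached patch itself; it is a toy model with invented names (`toy_string_cmp_in`, `toy_default_locale`, `ToyString`) and two stand-in "locales", assuming only what the comment says: keys are cached for the current (default) locale and computed on the fly for any other.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a locale: only two collation rules exist. */
typedef enum { TOY_LOCALE_C, TOY_LOCALE_NOCASE } ToyLocale;

/* Assumption: the default locale is fixed when the program starts. */
static const ToyLocale toy_default_locale = TOY_LOCALE_C;

/* Hypothetical string object; its cache only ever holds default-locale keys. */
typedef struct {
    const char *str;
    char key[32];
    int has_key;   /* nonzero once key[] holds the default-locale key */
} ToyString;

/* Build the collation key for s under an explicitly given locale. */
static void toy_make_key(const char *s, ToyLocale locale, char *key, size_t size)
{
    size_t i;

    assert(size > 0);
    for (i = 0; s[i] != '\0' && i + 1 < size; i++) {
        if (locale == TOY_LOCALE_NOCASE)
            key[i] = (char)tolower((unsigned char)s[i]);
        else
            key[i] = s[i];
    }
    key[i] = '\0';
}

/* Return the cached default-locale key, computing it on first use. */
static const char *toy_cached_key(ToyString *s)
{
    if (!s->has_key) {
        toy_make_key(s->str, toy_default_locale, s->key, sizeof s->key);
        s->has_key = 1;
    }
    return s->key;
}

/* Compare a and b under `locale`; only the default locale uses the cache. */
static int toy_string_cmp_in(ToyString *a, ToyString *b, ToyLocale locale)
{
    char ka[32], kb[32];

    if (locale == toy_default_locale)
        return strcmp(toy_cached_key(a), toy_cached_key(b));

    /* Non-default locale: temporary keys, nothing is stored in a or b. */
    toy_make_key(a->str, locale, ka, sizeof ka);
    toy_make_key(b->str, locale, kb, sizeof kb);
    return strcmp(ka, kb);
}
```

In this sketch a sort in a non-default locale is slower because keys are rebuilt for every comparison, but it can no longer poison later comparisons in the default locale.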
Note that with this patch the sorting in tr_TR works fine for me.
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.