GNOME Bugzilla – Bug 670403
g_utf8_collate returns 0 on U+C5D0 vs U+CD94
Last modified: 2018-05-24 13:47:50 UTC
According to g_utf8_collate, U+C5D0 "에" and U+CD94 "추" are identical: it returns 0 when comparing them. They don't look the same, so I don't think that is correct.
glib just defers to libc (after translating from utf8 to locale encoding if necessary). So either (a) NOTGNOME, it's a libc bug, or (b) NOTABUG, this behavior is correct. Not sure which...
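To make the "defers to libc" point concrete, here is a minimal sketch that calls strcoll() directly on the UTF-8 encodings of the two characters. The helper name collate_in_locale is hypothetical and not part of glib. It assumes the strings are already in the locale's encoding, so the locale passed in must be a UTF-8 one for the result to be meaningful. It changes the process-wide LC_COLLATE setting and does not restore it.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* UTF-8 encodings of the two characters from the report. */
#define HANGUL_C5D0 "\xEC\x97\x90"   /* U+C5D0 */
#define HANGUL_CD94 "\xEC\xB6\x94"   /* U+CD94 */

/* Hypothetical helper (not a glib function): switch LC_COLLATE to the
 * named locale and compare a and b with strcoll().
 * Returns 0 and stores the comparison in *out, or returns -1 if the
 * locale is not available on this system.
 * Note: this changes the process-wide collation locale. */
static int collate_in_locale(const char *locale_name,
                             const char *a, const char *b, int *out)
{
    assert(locale_name != NULL && a != NULL && b != NULL && out != NULL);

    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;              /* locale not installed */

    *out = strcoll(a, b);
    return 0;
}
```

Calling collate_in_locale("C", HANGUL_C5D0, HANGUL_CD94, &r) gives a byte-wise comparison, since strcoll() behaves like strcmp() in the "C" locale. Calling it with an installed UTF-8 locale name shows what libc itself reports for this pair, which would help decide between (a) and (b).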
That's a bit too easy. If (b), you should worry about how much glib/gtk code is safe in the presence of an inconsistent comparison. qsort, for example, has undefined behaviour when given an inconsistent comparison function and can crash.
It's not inconsistent (in any way qsort would care about). The two characters are equal to each other regardless of which order you compare them in.
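The symmetry claim can be checked mechanically. The sketch below uses a hypothetical helper, collation_is_symmetric, that tests whether strcoll() gives mirrored results when its arguments are swapped. It only covers this one property for a single pair; it says nothing about transitivity across three or more strings, which a sort also relies on.

```c
#include <assert.h>
#include <string.h>

/* Reduce a comparison result to -1, 0 or 1. */
static int sign_of(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical check: returns 1 if strcoll(a, b) and strcoll(b, a)
 * have opposite signs (or are both 0), and 0 otherwise. Uses whatever
 * LC_COLLATE locale is currently set. */
static int collation_is_symmetric(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return sign_of(strcoll(a, b)) == -sign_of(strcoll(b, a));
}
```

Two strings that compare equal in both directions pass this check, which matches the argument above.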
Dan, you forgot (c): g_utf8_collate incorrectly ignores that strcoll may set errno. Quoting the documentation: "The strcoll() function may fail if: [EINVAL] The s1 or s2 arguments contain characters outside the domain of the collating sequence."
Same issue in the wcscoll branch.
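For illustration, a minimal sketch of what checking errno around strcoll() could look like. The helper checked_strcoll is hypothetical and is not how glib is written. Falling back to strcmp() on error is an assumption made here, not something proposed in this bug. Since strcoll() has no reserved error return value, errno has to be cleared before the call and inspected afterwards. A wcscoll() variant would presumably follow the same pattern.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not part of glib): compare two strings that are
 * already in the locale encoding and report whether strcoll() failed.
 * Returns 0 on success with the comparison in *result.
 * Returns -1 if strcoll() set errno; in that case *result holds a
 * byte-wise strcmp() fallback (an assumption of this sketch) and errno
 * is left set for the caller to inspect. */
static int checked_strcoll(const char *s1, const char *s2, int *result)
{
    assert(s1 != NULL && s2 != NULL && result != NULL);

    errno = 0;                      /* no return value signals an error */
    int cmp = strcoll(s1, s2);

    if (errno != 0) {
        int saved_errno = errno;    /* e.g. EINVAL */
        *result = strcmp(s1, s2);   /* assumed fallback ordering */
        errno = saved_errno;
        return -1;
    }

    *result = cmp;
    return 0;
}
```

A caller can then distinguish "the strings collate as equal" (return 0, *result == 0) from "the comparison was not valid" (return -1).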
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/517.