GNOME Bugzilla – Bug 114681
Unicode 4.0 Turkic and Lithuanian special casing
Last modified: 2011-02-18 16:07:18 UTC
Part of the Unicode 4.0 update is some new rules for Turkic and Lithuanian casing, which are hard-coded in glib (they pretty much have to be). I pasted some of SpecialCasing.txt in bug #107974.
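For readers unfamiliar with why these rules have to be hard-coded, here is a minimal sketch of the Turkic dotted/dotless I mappings. It is not the glib implementation: the function names and the `is_turkic` flag are hypothetical, only ASCII and the four I-like code points are handled, and the context conditions listed in SpecialCasing.txt (such as a following combining dot above) are ignored.

```c
#include <stdint.h>

/* Hypothetical helpers, not glib API. is_turkic is nonzero for tr/az locales.
   Code points other than ASCII letters, U+0130 and U+0131 are returned as-is. */

uint32_t sketch_tolower(uint32_t c, int is_turkic)
{
  if (is_turkic) {
    if (c == 0x0049) return 0x0131;  /* I -> dotless i */
    if (c == 0x0130) return 0x0069;  /* I with dot above -> i */
  }
  if (c >= 'A' && c <= 'Z')
    return c + ('a' - 'A');          /* default ASCII mapping */
  return c;
}

uint32_t sketch_toupper(uint32_t c, int is_turkic)
{
  if (is_turkic && c == 0x0069)
    return 0x0130;                   /* i -> I with dot above */
  if (c == 0x0131)
    return 0x0049;                   /* dotless i -> I in every locale */
  if (c >= 'a' && c <= 'z')
    return c - ('a' - 'A');          /* default ASCII mapping */
  return c;
}
```

The same input code point maps differently depending on the locale, which is why a locale-independent data table alone cannot express the rule.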
Created attachment 17306: proposed patch
Adding PATCH keyword and setting priority to high because of the patch.
Don't really have time to look at this in detail; if you feel comfortable, just go ahead and commit. Otherwise, you might want to find someone else who knows something about Unicode to review. A couple of superficial comments:
- I don't really like the extern declaration of the function, though I've done it a couple of times myself. You might want to just add a gunicodeprivate.h.
- We tend to deviate from the GNU standards and put && (and other operators) at line breaks on the end of the previous line rather than the start of the new line.
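To illustrate the operator-placement convention mentioned above, here is a small sketch. The function `sketch_is_ascii_alpha` is a made-up example, not code from the patch.

```c
/* Hypothetical example function; only the line-break style matters here. */
int sketch_is_ascii_alpha(unsigned int c)
{
  /* Style described in the comment: the operator ends the line that is
     being continued. */
  return (c >= 'A' && c <= 'Z') ||
         (c >= 'a' && c <= 'z');

  /* GNU style would instead start the continuation line with the operator:
       return (c >= 'A' && c <= 'Z')
              || (c >= 'a' && c <= 'z');  */
}
```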
Created attachment 19792: updated patch
The updated patch applies to current cvs head, and does the gunicodeprivate.h thing. It also has each of the Lithuanian and Turkic tests in gen-casemap-txt.pl try both xx_YY and xx_YY.UTF-8 (I have the UTF-8 locales installed on my system, but not the others, and it got to be a pain to manually edit the file repeatedly). I'm pretty comfortable with this patch, but I'll wait a few days to apply in case Roozbeh or Jungshik or anybody wants to give it a look, which would be much appreciated. Simply checking that the test cases I added are correct would help a lot to verify that the patch is right.
Created attachment 19794: fix a couple mistakes in the last patch
I'd say go ahead and commit this; one more superficial comment is that to avoid prototype mismatches, you should include gunicodeprivate.h in .c files where you define the functions, not just in .c files where you use them.
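The point about prototype mismatches can be sketched in a single file. The name `sketch_combining_class` and its signature are hypothetical stand-ins, not the real glib declarations, and the toy lookup covers only two code points.

```c
/* --- Contents that would live in a private header (hypothetical name:
       sketchprivate.h), included by every .c file that uses the function
       AND by the .c file that defines it. --- */
int sketch_combining_class(unsigned int uc);

/* --- Contents of the defining .c file, after including that header.
       Because the prototype above is visible here, the compiler checks it
       against the definition. If the definition were changed to, say,
       'long sketch_combining_class(unsigned long uc)', compilation would
       fail with a conflicting-types error instead of silently diverging. --- */
int sketch_combining_class(unsigned int uc)
{
  /* Toy lookup: U+0301 and U+0307 both have canonical combining class 230. */
  if (uc == 0x0301 || uc == 0x0307)
    return 230;
  return 0;  /* treat everything else as a starter in this sketch */
}
```

With a bare extern declaration repeated in each caller, nothing ties the callers' idea of the signature to the definition, so a mismatch surfaces only as a runtime bug.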
Right you are with that superficial comment. :)

2003-09-10  Noah Levitt  <nlevitt@columbia.edu>

        * glib/gunicodeprivate.h:
        * glib/gunicollate.c:
        * glib/gunidecomp.c:
        * glib/guniprop.c:
        * tests/casemap.txt:
        * tests/gen-casemap-txt.pl: Unicode 4.0 special casing. (#114681)

        * glib/gunicodeprivate.h: Use a private header instead of extern
        function declarations (_g_utf8_normalize_wc,
        _g_unichar_combining_class).