GNOME Bugzilla – Bug 313583
Update Unicode tables to Unicode 4.1
Last modified: 2011-02-18 15:49:11 UTC
I'm attaching a patch to update GLib's data and test tables to Unicode 4.1. Pango 1.10 contains Unicode 4.1 data, so it would be good to have GLib with 4.1 too.
Created attachment 50753: Unicode 4.1 patch. The patch adds 5 entries to the line-breaking enum and updates the data files. There are no code changes other than the enum.
Note that this patch breaks Pango:
Pango-ERROR **: file break.c: line 780 (pango_default_break): assertion failed: (IN_BREAK_TABLE (break_type))
I'm working on a patch for Pango, but the fact that it breaks older Pangos is a bit, well...
Behdad, the Pango patch should make sure that it prevents similar problems in the future. And the GLib patch should probably add a warning to the docs that the enumerations might grow due to additions in future Unicode versions.
Thanks Matthias. Safety patch for pango: bug #313857
Created attachment 50933: requested warning patch. Tiny patch to add a warning to the docs about future additions, recommending that unknown values be treated as G_UNICODE_BREAK_UNKNOWN.
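The recommendation to treat unknown values as G_UNICODE_BREAK_UNKNOWN can be illustrated with a minimal sketch. The enum, the names table, and the helper functions below are hypothetical stand-ins written for this example. They are not GLib's real GUnicodeBreakType values and not the actual Pango fix. The idea is only that any table indexed by a break type should be guarded by a range check, so that values added by a newer library fall back to "unknown" instead of tripping an assertion.

```c
#include <stddef.h>

/* Hypothetical stand-in for a line-break enum; NOT GLib's real values. */
typedef enum {
    BREAK_MANDATORY,
    BREAK_SPACE,
    BREAK_ALPHABETIC,
    BREAK_UNKNOWN,
    BREAK_KNOWN_COUNT   /* number of types this code was compiled against */
} BreakType;

/* Map any value this code was not written for to BREAK_UNKNOWN. */
static BreakType
sanitize_break_type (int raw)
{
    if (raw < 0 || raw >= BREAK_KNOWN_COUNT)
        return BREAK_UNKNOWN;
    return (BreakType) raw;
}

/* A table indexed by break type, sized for the known types only. */
static const char *const break_names[BREAK_KNOWN_COUNT] = {
    "mandatory", "space", "alphabetic", "unknown"
};

/* Safe lookup: never indexes past the end of the table. */
static const char *
break_type_name (int raw)
{
    return break_names[sanitize_break_type (raw)];
}
```

With this shape, a caller that receives a value from a newer library (say 42) gets the "unknown" entry rather than an out-of-bounds read or an abort.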
Thank you very much for your bug report! However, I was not able to reproduce this bug. I applied the Unicode 4.1 patch to GLib HEAD and it didn't break Pango as you describe in your comments. Can you please tell us which program triggers the issue? Also, could you please specify the GLib version that is affected?
All GLib versions are affected. The bug triggers whenever Pango tries to render a Korean Hangul character, since those are the characters that use the newly defined G_UNICODE_BREAK_* types. Just run ./pango-*view ./HELLO.utf8 in pango/examples and you get the abort. Bug 313857 contains a patch for Pango to not abort. Bug 313907 contains a patch for Pango to use the new line-breaking types.
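To make the Hangul trigger concrete: Unicode 4.1 gave Hangul its own line break classes (JL, JV, JT for jamo, H2 for LV syllables, H3 for LVT syllables), so any precomposed syllable now gets one of the new types. The sketch below shows how a precomposed syllable falls into LV or LVT using the standard Hangul syllable arithmetic. The function and enum names are hypothetical, and this is not how GLib does it (GLib looks the type up in generated tables); it only illustrates which characters are affected.

```c
#include <stdint.h>

/* Hypothetical result type for this example only. */
typedef enum {
    HANGUL_NOT_SYLLABLE,  /* outside the precomposed syllable block */
    HANGUL_LV,            /* leading consonant + vowel (class H2)   */
    HANGUL_LVT            /* ... plus trailing consonant (class H3) */
} HangulSyllableKind;

/* Constants from the Unicode Hangul syllable composition scheme. */
#define HANGUL_S_BASE  0xAC00u  /* first precomposed syllable        */
#define HANGUL_S_COUNT 11172u   /* number of precomposed syllables   */
#define HANGUL_T_COUNT 28u      /* trailing slots per LV combination */

static HangulSyllableKind
classify_hangul_syllable (uint32_t cp)
{
    if (cp < HANGUL_S_BASE || cp >= HANGUL_S_BASE + HANGUL_S_COUNT)
        return HANGUL_NOT_SYLLABLE;

    /* Every 28th syllable has no trailing consonant, so it is LV. */
    return ((cp - HANGUL_S_BASE) % HANGUL_T_COUNT == 0) ? HANGUL_LV
                                                        : HANGUL_LVT;
}
```

For example, U+AC00 classifies as LV and U+AC01 as LVT, so text containing either would exercise the new break types.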
Thank you very much. I reproduced the bug, applied your fixes to my working copies and it works great.
I chose to set GNOME Version to 2.11/2.12 even if we are in string freeze since there is no next version in the GNOME Version list box.
Forget my GNOME Version problem. I was triaging two bugs at the same time and chose the wrong GNOME Version for this one. Sorry.
Behdad, what's the status of this? I would actually like the warning to be added to the API docs, not just in a comment in the header.
I guess we should probably not do this on the stable branch, to avoid breaking pango, but we should do it soon in HEAD.
Created attachment 52681: warning patch for docs. Ah, sorry, I thought gtk-doc picks up the docs from the comments. This one patches the docs now.
The three patches together do the job. We have already applied the fix to Pango so that it does not break (not released yet, though), but yes, HEAD only should be fine. Can be applied IMO.
I would like to see a paragraph added to the long description in unicode.sgml which spells out the supported Unicode versions. Something like "GLib 2.8 supports Unicode 4.0, GLib 2.10 supports Unicode 4.1." And maybe explain a little bit where these version differences may show up in the API. Can you commit it to HEAD with that extra documentation, Behdad?
Sure. Later today.
Committed after reworking the documentation, and updating the enum values in the docs too.

2005-10-01  Behdad Esfahbod  <behdad@gnome.org>

        * docs/reference/glib/tmpl/unicode.sgml:
        * glib/gen-unicode-tables.pl:
        * glib/gunibreak.h:
        * glib/gunichartables.h:
        * glib/gunicode.h:
        * tests/casefold.txt:
        * tests/casemap.txt: Updated to Unicode 4.1.  There are five new
        GUnicodeBreakType types.  That may break some applications, like
        Pango <= 1.10.