GNOME Bugzilla – Bug 355563
glib needs to support internationalized numbers
Last modified: 2018-05-24 10:52:34 UTC
There is need to support international numbers (like those used in Arabic, Persian, Indic languages, etc.) in glib. Some of glib numerical functions such as 'g_strtod' are Unicode-based functions and so, one expects them to have internationalized behavior. There is a need for having functions 'g_strtoll' and 'g_strtoull', and also reverse functions 'g_dtostr', 'g_lltostr', and 'g_ulltostr', these would be very helpful for writing properly internationalized applications using glib and GNOME libraries. There is one option on how to implement the string-to-number functions. There are two ways to implement them: 1. Accept only numbers from the active locale and English/ASCII 2. Accept numbers from all locales The first will be easier to implement for me, as I am planning to use gettext mechanisms to implement the support. I am volunteering to prepare the required patches and test cases.
Created attachment 73474 [details] [review] This Patch adds implementation for international numbers in glib This patch is intended to implement the functions mentioned at the bug report. Also fixing g_strtod to support local numbers. The Patch uses glib translation system, so it will work for locales with number system translations.
Created attachment 73475 [details] [review] Support for persian numbers translation + test making This patch adds persian numberic system translation for glib. Also changes the Makefile.am inside tests dir to include test for persian numbers. This patch is followed by an other attachment ('persiannumbers-test.c' file) which is required for this patch to work.
Created attachment 73476 [details] Persian Numbers Test This file is the test for persian numbers support in glib. Requires patches 73474, 73475 to work. Also a complementory for 73475(which modified the Makefile.am).
From a quick glance at the main patch: - You should make your helper functions static. - The code is barely readable, because of lots of single-letter variable names. - The g_local_*() functions should mostly be renamed to g_utf8_*(), or at least g_locale_*().
Created attachment 78202 [details] [review] The replacement for the previous patch Comments proposed by B. Esphahbod are taken to consideration, Also some small optimization task are performed. Apply the patch in level '0'.
(In reply to comment #4) > From a quick glance at the main patch: > > - You should make your helper functions static. > Fixed in the next patch. > - The code is barely readable, because of lots of single-letter variable > names. Some of the are fixed but using full name variables instead of 'p' and 'c' makes these functions more difficult. > - The g_local_*() functions should mostly be renamed to g_utf8_*(), or at > least g_locale_*(). > Fixed in the next patch. Anyway, thanks for comments and advices.
Some random comments on the latest patch which may help in getting this applied earlier: For the one-characters locale data strings: * You should use Q_() instead of _() for the short strings. * The translator comments is not very understandable for a random translator/localizer. For example, I suggest something along these lines to be used (adapt for every: /* Translators: This message is for handling localized digits. If the preferred digit forms to be used for displaying decimal numbers in your locale are not the European digits 0, 1, 2, etc, translate this to your local digit zero. glib will assume that the other digits are the next Unicode characters after the local zero. Do not translate the "locale|" part, remove it. In a Spanish locale, for example, this should be translated to "0", while in a Persian locale, it should be translated to "۰". */ return Q_("locale|0"); * It is always possible for translators to make a mistake translating some of the strings. Your code is currently exiting with g_assert when such mistakes are made, while it may be better to give warnings and use default values like "0" instead. Also, you should perhaps check for errors returned from g_utf8_get_char_validated, as you can not assume a meaningful result from the g_unichar_digit_value function when it's not passed a Unicode character. So for example your get_locale_zero_char may look like this: static gunichar get_locale_zero_char () { gunichar zero = g_utf8_get_char_validated (get_locale_zero_str (), -1); if (!(zero & 0x80000000) && g_unichar_digit_value (zero) == 0) { return zero; } else { g_warning(...) return (gunichar) '0'; } } Other random stuff: * Your pow10 function makes me worry. You are doing lots of multiplication there which loses lots of precision, and you are also using a recursive function unnecessarily. Why can't you use the C library's pow? * Your patch does not mention anything about the code authors and the copyright holders. This is specially important if you are basing your patch on some existing functions in glib or glibc. Mention every file your are basing your functions on from whatever library, and make sure you mention their copyright holders and authors.
(In reply to comment #7) > Some random comments on the latest patch which may help in getting > this applied earlier: > > For the one-characters locale data strings: > * You should use Q_() instead of _() for the short strings. OK. > * The translator comments is not very understandable for a random > translator/localizer. For example, I suggest something along these > lines to be used (adapt for every: > > /* Translators: This message is for handling localized digits. If the > preferred digit forms to be used for displaying decimal numbers in your > locale are not the European digits 0, 1, 2, etc, translate this to your > local digit zero. glib will assume that the other digits are the next > Unicode characters after the local zero. > > Do not translate the "locale|" part, remove it. In a Spanish locale, for > example, this should be translated to "0", while in a Persian locale, it > should be translated to "۰". > */ > return Q_("locale|0"); OK. It would be better. > * It is always possible for translators to make a mistake translating > some of the strings. Your code is currently exiting with g_assert when > such mistakes are made, while it may be better to give warnings and > use default values like "0" instead. Also, you should perhaps check > for errors returned from g_utf8_get_char_validated, as you can not > assume a meaningful result from the g_unichar_digit_value function > when it's not passed a Unicode character. So for example your > get_locale_zero_char may look like this: > > static gunichar > get_locale_zero_char () > { > gunichar zero = g_utf8_get_char_validated (get_locale_zero_str (), -1); > if (!(zero & 0x80000000) && g_unichar_digit_value (zero) == 0) > { > return zero; > } > else > { > g_warning(...) > return (gunichar) '0'; > } > } I agree, this may lead critical problem! > Other random stuff: > * Your pow10 function makes me worry. You are doing lots of > multiplication there which loses lots of precision, and you are also > using a recursive function unnecessarily. Why can't you use the C > library's pow? glic pow10 functions needs -lm on link (for math library). This means that Makefile.am should be patched as well and a library would be added to glib just for my patch!! I am not realy sure. Maybe we should ask for a experts would to help us on this. Anyway there may be up to 308 multiplications for double precision and I think binary multilication maybe usfull (using at most 9 instead of 308 is usefull). > * Your patch does not mention anything about the code authors and the > copyright holders. This is specially important if you are basing your > patch on some existing functions in glib or glibc. Mention every file > your are basing your functions on from whatever library, and make sure > you mention their copyright holders and authors. Well nothing is actually imported, I have just used a small idea on checking overflow from glib source. I have just used the idea, I don't think this would have a copyright holding on it.
Created attachment 81060 [details] [review] Support for internatoinal numbers in glib The patch replaces the previous one considering new comments from Roozbeh. Also adding Q_ translation for glib.
Created attachment 81062 [details] [review] Persian numbers Translation
Created attachment 81063 [details] [review] Changing the makefile for adding test for persian numbers.
Roozbeh Pournader changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|glib needs to support |glib needs to support |international numbers |internationalized numbers Or maybe "glib needs to support /localized/ numbers.
Created attachment 83543 [details] [review] The new implementation of localized numbers! + Added POSIX compatibility + Added ISO C90 support + Added support for locales which glibc supports a different decimal point + A few bug fix / code cleaning / redendaunt code removing + no need for pow10 + corrected translation comments (tanx for roozbeh)
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/63.