After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 58195 - alias for iconv module names in g_iconv_open
alias for iconv module names in g_iconv_open
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: general
1.3.x
Other Solaris
: Normal enhancement
: ---
Assigned To: gtkdev
gtkdev
Depends on: 58472
Blocks: 55152
 
 
Reported: 2001-07-27 21:24 UTC by Hidetoshi Tajima
Modified: 2011-02-18 15:47 UTC
See Also:
GNOME target: ---
GNOME version: Unversioned Enhancement


Attachments
try aliases when opening with given code set fails in g_iconv_open (7.62 KB, patch)
2001-07-27 21:52 UTC, Hidetoshi Tajima
none Details | Review
the same patch as the previous but in diff -u output (8.01 KB, patch)
2001-07-27 21:54 UTC, Hidetoshi Tajima
none Details | Review
a sample alias file for testing (179 bytes, text/plain)
2001-07-27 22:06 UTC, Hidetoshi Tajima
  Details
a new patch but with g_io_channel_read_line_string issue (7.02 KB, patch)
2001-08-02 21:10 UTC, Hidetoshi Tajima
none Details | Review
alias file without quating outside of list of aliases (647 bytes, text/plain)
2001-08-02 21:14 UTC, Hidetoshi Tajima
  Details
2nd patch using g_io_channel (6.96 KB, patch)
2001-08-03 02:01 UTC, Hidetoshi Tajima
none Details | Review
the 3rd patch - iconv_modules is not static (7.12 KB, patch)
2001-08-03 19:21 UTC, Hidetoshi Tajima
none Details | Review

Description Hidetoshi Tajima 2001-07-27 21:24:05 UTC
When glib is configured without with-libiconv option, 
g_iconv_open() quite often fails on Solaris,  since iconv
module names which are given to g_iconv_open() from
pango x11 module don't match with what exists on Solaris.

To fix this, we want to create an aliasing mechanim
in g_iconv_open() so that  if opening with the initial name
fails, we try known equivalent names.
Comment 1 Hidetoshi Tajima 2001-07-27 21:47:03 UTC
Created a patch to try other known names when
iconv_open() failes with initial codeset names. 
(will create a new attachment for the diff after this
comment.)

I'm afraid if creaing a new file is allowed, but want to
introduce $prefix/etc/glib-2.0/giconv.aliases file
as an external database to define codeset names and
their aliases. When iconv_open() first fails in g_iconv_open(), 
the file is opened and read once to get codeset names and 
their aliases pairs.

Note that most of the code to open and read  the file is created
from pango's pango-utils.c and pangox-fontmap.c as reference, and
file format is similar to $prefix/etc/pango/pangox.aliases.




Comment 2 Hidetoshi Tajima 2001-07-27 21:52:19 UTC
Created attachment 796 [details] [review]
try aliases when opening with given code set fails in g_iconv_open
Comment 3 Hidetoshi Tajima 2001-07-27 21:54:20 UTC
Created attachment 797 [details] [review]
the same patch as the previous but in diff -u output
Comment 4 Hidetoshi Tajima 2001-07-27 22:06:37 UTC
Created attachment 798 [details]
a sample alias file for testing
Comment 5 Owen Taylor 2001-08-02 04:11:11 UTC
I think the approach of reading an external file is OK.

Some quick comments:

 - g_iconv_read_line() would need to be static, but
   I think it would be better to use g_iochannel_read_line()
   here.
 - Code in GLib needs to use static variables in a 
   thread safe fashion, so the initialization and  use
   of 'iconv_modules' needs to be guarded by a mutex.
   Search for G_LOCK in glib for examples.
 - You should g_warning in the case of hitting an error when reading
   the alias file.
 - You leak the GIconvModule structure if you encounter an error.
 - You always leak the two buffers used in g_iconv_read_alias_file
 - I don't think it makes sense to have quoting _outside_ the
   list of aliases. If we want to allow quoting here (and
   I'm not completely sure it's necessary) then I think the
   quoting should be 

    "UTF-8"   "UTF8", "utf,8 ", utF8
   
   Otherwise, we lose all the benefits of allowing quoting except
   for allowing \n and \t, since spaces and commas will be
   lost anyways.
 - You should be using g_ascii_isspace() not isspace() here,
   since we don't want this file to be locale dependent.
Comment 6 Hidetoshi Tajima 2001-08-02 19:58:38 UTC
Created a new patch according to the comments:
(will create a new patch attachment)

 - use g_iochannel
 - G_LOCK 'iconv_modules'
 - g_warning in the case of hitting error
   in opening or reading the alias file
 - free GIconvModule structure in the case
   of hitting an error.
 - fix memory leak of two GString in g_iconv_read_alias_file
 - use g_unichar_isspace() instead of isspace()
 - no quoting outside of list of the aliases

But the new patch gets in a infinite loop which shows:

 testtext (pid:13955): GLib-CRITICAL **: file gstring.c: line 623 
(g_string_erase): assertion `pos + len <= string->len' failed

This is from g_io_channel_read_line_string, and the same
error happens in glib/tests/iochannel-test.
I'll have a further look..
Comment 7 Hidetoshi Tajima 2001-08-02 21:10:46 UTC
Created attachment 837 [details] [review]
a new patch but with g_io_channel_read_line_string issue
Comment 8 Hidetoshi Tajima 2001-08-02 21:14:36 UTC
Created attachment 838 [details]
alias file without quating outside of list of aliases
Comment 9 Hidetoshi Tajima 2001-08-03 02:01:14 UTC
Created attachment 840 [details] [review]
2nd patch using g_io_channel
Comment 10 Hidetoshi Tajima 2001-08-03 02:04:59 UTC
Created a patch again. This  works in the testtext
together with giochannel.c patch for 58472, and 
the alias file in the 08/02/01 17:14 patch.
Comment 11 Hidetoshi Tajima 2001-08-03 19:18:02 UTC
Created new patch. This time, 'iconv_modules' is not static
and the alias file is read every time g_iconv_open
fails with opening with the original charset.

Comment 12 Hidetoshi Tajima 2001-08-03 19:21:22 UTC
Created attachment 843 [details] [review]
the 3rd patch - iconv_modules is not static
Comment 13 Hidetoshi Tajima 2001-08-09 01:18:58 UTC
Note: The current patch does not use g_ascii_isspace but
g_unichar_isspace as I could not find it in the current glib. I've
just
noticed that it will be included in Alex Larsson's patch which is
under
code review in the mailing list.(subject: URIs vs. half-baked URIs
[glib PATCH]) Once it is in, I'll modify the patch using it.

On that condition, how do you think about the other parts of the
patch? I'd prefer to go with the last patch to make iconv_module
non-static.
Comment 14 Hidetoshi Tajima 2001-08-17 18:36:15 UTC
May I commit the 08/03/01 15:21 patch?
If I'm allowed, I'd include g_ascii_isspace in glib/gstrfuncs.c 
and commit together.

The fix of this bug is critical as this blucks cut&pate problem
and also blocks us from building glib with Solaris native iconv.
Comment 15 Hidetoshi Tajima 2001-08-18 04:25:15 UTC
I just noticed GNU gettext, fileutils, and libiconv
take different way to solve the similar problem.
Shoule glib also follow the same way by having 
intl/config.charset in the source tree, and install 
lib/charset.alias?
Comment 16 Owen Taylor 2001-09-26 22:47:49 UTC
I've been planning for a while to use the libcharset code
that the charset.alias comes from for a real implemention
of g_get_charset() so this might be the right approach;
just using the libcharset code would fix the cut-and-paste
problem.

However, I'm not sure that charset.alias will do what we
need for the reverse mapping; it's purpose is to map:
 
 nl_langinfo name => canonical name

And thus it might not have all entries you need to do:

 canonical name => iconv name

For a particular system. It probably will be _close_, so it
might be best to just go ahead, integrate libcharset,
then if necessry, augment charset.config to match our needs.

(One question is: what charsets do we guarantee to support? --
if we only guarantee UTF-8 and the result of g_get_charset()
then this might be sufficient.)



Comment 17 Owen Taylor 2001-09-27 00:30:59 UTC
I have most of the libcharset integration now, need to finish
a bit more cleanup before committing. Then I'll look at
using the aliases from that in the reverse direction for
iconv,
Comment 18 Owen Taylor 2001-09-27 02:50:21 UTC
I've checked in the libcharset based patch to CVS now. I'd appreciate
it if you could look at and see how close it is to fixing the problems
(and also just give it some basic testing ... on Linux I have no
real way of testing this code so it could be completely broken and
segfaulting right now.)
Comment 19 Hidetoshi Tajima 2001-10-12 17:41:48 UTC
My apologies for being late. 

I tested this on today's cvs HEAD build on Solaris8 sparc. Initially
it did not work, but with a one-line patch below,
things start working well. I've so far verified 
locales including C, en_US, zh, zh_TW, ko and ja locales 
for both EUC and UTF-8 encodings. In ja_JP.PCK(SJIS) and
zh_TW.BIG5, it works, too. No warning messages of iconv failure is
seen, and cut&paste operation worked well, too.

Here is the patch. Please review and check in.
Index: gconvert.c
===================================================================
RCS file: /cvs/gnome/glib/glib/gconvert.c,v
retrieving revision 1.23
diff -u -r1.23 gconvert.c
--- gconvert.c  2001/10/01 20:40:05     1.23
+++ gconvert.c  2001/10/12 17:41:02
@@ -111,7 +111,7 @@
   if (!try_conversion (to_codeset, from_codeset, &cd))
     {
       const char **to_aliases = _g_charset_get_aliases (to_codeset);
-      const char **from_aliases = _g_charset_get_aliases
(to_codeset);
+      const char **from_aliases = _g_charset_get_aliases
(from_codeset);
 
       if (from_aliases)
        {

Comment 20 Hidetoshi Tajima 2001-10-17 17:28:22 UTC
Verified the fix on my local build.
Marking the bug "FIXED".