Bug 59320 - rewrite of mozilla_locale_to_unicode, mozilla_unicode_to_locale
Status: RESOLVED FIXED
Product: galeon
Classification: Deprecated
Component: i18n
Version: 0.x
Hardware: Other All
Importance: Normal normal
Target Milestone: 1.0
Assigned To: Philip Langdale
QA Contact: Yanko Kaneti
Duplicates: 60813
Reported: 2001-08-21 14:30 UTC by Takayuki KUSANO
Modified: 2004-12-22 21:47 UTC

Attachments
patch to mozilla.cpp (6.10 KB, patch)
2001-08-21 14:31 UTC, Takayuki KUSANO
unified diff version (5.28 KB, patch)
2001-08-21 15:05 UTC, Takayuki KUSANO
no 'goto' version. (4.84 KB, patch)
2001-08-27 02:20 UTC, Takayuki KUSANO
replace utf8_to_locale()/locale_to_utf8() with new functions (7.57 KB, patch)
2001-09-05 15:26 UTC, Takayuki KUSANO
New version of locale <-> utf8 patch (10.18 KB, patch)
2001-09-11 18:37 UTC, Takayuki KUSANO
Fix bookmark search related bugs. (2.53 KB, patch)
2001-09-21 03:42 UTC, Takayuki KUSANO

Description Takayuki KUSANO 2001-08-21 14:30:14 UTC
In mozilla_locale_to_unicode() and mozilla_unicode_to_locale() in
mozilla.cpp, wcstombs() and mbstowcs() are used for encoding conversion.
These functions assume that wchar_t == Unicode.

On platforms other than Linux, wchar_t is not always Unicode. So
we cannot use these functions to convert Unicode strings to/from locale
strings.
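
One way to see whether a given platform makes the wchar_t == Unicode promise is the standard __STDC_ISO_10646__ macro, which an implementation defines when wchar_t values are ISO 10646 code points. The sketch below is only an illustration of that check; the helper names wchar_is_unicode() and describe_wchar() are made up for this example and are not part of galeon.

```cpp
#include <string>

// Returns true when the implementation promises that wchar_t values are
// ISO 10646 (Unicode) code points. Without the macro, nothing is promised
// about the wchar_t encoding, so mbstowcs()/wcstombs() cannot be relied on
// to produce or consume Unicode.
bool wchar_is_unicode()
{
#ifdef __STDC_ISO_10646__
    return true;
#else
    return false;
#endif
}

// Human-readable summary of the check above (hypothetical helper).
std::string describe_wchar()
{
    return wchar_is_unicode()
        ? std::string("wchar_t is documented to hold Unicode code points")
        : std::string("wchar_t encoding is unspecified on this platform");
}
```

On a platform where the macro is missing, code that treats mbstowcs() output as Unicode is making exactly the assumption described above.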

So we must use encoding conversion functions such as iconv(3). But
iconv(3) may introduce another problem: differences between encoding
conversion tables. Mozilla uses its own encoding conversion functions,
and iconv(3) in glibc (or libiconv, etc.) uses other conversion tables.
Most of these tables are the same, but some portions differ.
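
For reference, a conversion through iconv(3) looks roughly like the sketch below. It is a generic wrapper, not code from galeon: the function name convert_with_iconv() and the 256-byte chunk size are assumptions made for this example, and the final shift-state flush that stateful encodings need is left out because EUC-JP and UTF-8 are stateless.

```cpp
#include <iconv.h>
#include <cerrno>
#include <cstddef>
#include <string>

// Convert `input` from encoding `from` to encoding `to` using iconv(3).
// Returns true on success and stores the converted bytes in `output`.
bool convert_with_iconv(const char *from, const char *to,
                        const std::string &input, std::string &output)
{
    iconv_t cd = iconv_open(to, from);   // note: target encoding comes first
    if (cd == (iconv_t)-1)
        return false;                    // conversion pair not supported

    std::string in_copy(input);          // iconv() wants a mutable buffer
    char *in = &in_copy[0];
    size_t in_left = in_copy.size();
    bool ok = true;
    output.clear();

    while (in_left > 0) {
        char buf[256];
        char *out = buf;
        size_t out_left = sizeof buf;
        size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
        output.append(buf, sizeof buf - out_left);
        if (rc == (size_t)-1 && errno != E2BIG) {
            ok = false;                  // EILSEQ or EINVAL: give up
            break;
        }
        // On E2BIG the output chunk was full; loop again for the rest.
    }

    iconv_close(cd);
    return ok;
}
```

Calling convert_with_iconv("UTF-8", "EUC-JP", "\xEF\xBD\x9E", out) feeds U+FF5E through the local iconv tables; what comes back depends on the iconv implementation and version installed, which is the problem described here.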

For example, Mozilla converts 0xA1C1 in EUC-JP (== 0x2141 in JIS X 0208)
to U+FF5E in Unicode. But iconv(3) in glibc 2.2.4 converts U+FF5E in Unicode
to SS3 0xA2B7 in EUC-JP (0x2237 in JIS X 0212).
So the titles of some pages cannot be rendered correctly in galeon, and
bookmarks of such pages get corrupted.
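
The relation between the JIS code points and the EUC-JP bytes quoted above can be sketched as follows: a JIS X 0208 character is stored with the high bit set on both bytes, and a JIS X 0212 character is stored the same way after a single-shift prefix byte (0x8F). The function names are made up for this illustration.

```cpp
#include <cstdint>
#include <vector>

// Single-shift prefix byte that introduces a JIS X 0212 character in EUC-JP.
const std::uint8_t kJisX0212Prefix = 0x8F;

// Encode a JIS X 0208 code point (e.g. 0x2141) as two EUC-JP bytes.
std::vector<std::uint8_t> jisx0208_to_eucjp(std::uint16_t jis)
{
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>((jis >> 8) | 0x80));
    out.push_back(static_cast<std::uint8_t>((jis & 0xFF) | 0x80));
    return out;
}

// Encode a JIS X 0212 code point (e.g. 0x2237) as three EUC-JP bytes:
// the prefix byte followed by both bytes with the high bit set.
std::vector<std::uint8_t> jisx0212_to_eucjp(std::uint16_t jis)
{
    std::vector<std::uint8_t> out;
    out.push_back(kJisX0212Prefix);
    out.push_back(static_cast<std::uint8_t>((jis >> 8) | 0x80));
    out.push_back(static_cast<std::uint8_t>((jis & 0xFF) | 0x80));
    return out;
}
```

With these, 0x2141 in JIS X 0208 becomes the bytes A1 C1, and 0x2237 in JIS X 0212 becomes 8F A2 B7, so the character that was read as two bytes comes back as a different three-byte sequence.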

So nsIUnicodeEncoder/Decoder in Mozilla are appropriate for this purpose.
I've rewritten mozilla_locale_to_unicode() / mozilla_unicode_to_locale() in
mozilla.cpp.
Comment 1 Takayuki KUSANO 2001-08-21 14:31:33 UTC
Created attachment 925
patch to mozilla.cpp
Comment 2 Philip Langdale 2001-08-21 14:47:44 UTC
heh, funnily enough, we originally used to use the mozilla
encoders and decoders. :-)

Can you please reformat in unified diff and I'll need more convincing
to let a goto in like that. :-)
Comment 3 Takayuki KUSANO 2001-08-21 15:05:30 UTC
Created attachment 926
unified diff version
Comment 4 Philip Langdale 2001-08-26 22:03:25 UTC
Ok, still, I'm uncomfortable with the goto thing. Can you restructure
it without using gotos?

That said, I do prefer the mozilla based implementation and I want
this in eventually.
Comment 5 Takayuki KUSANO 2001-08-27 02:20:59 UTC
Created attachment 978
no 'goto' version.
Comment 6 Takayuki KUSANO 2001-08-27 03:02:35 UTC
I've dug through the Mozilla source code and found 6 Japanese
characters that may cause bookmark/history/title bar
etc. breakage.

The characters are 0x2141, 0x2142, 0x215D, 0x2171, 0x224C (in JIS X 0208).

Mozilla converts them into Unicode U+FF5E, U+2225, U+FF0D, U+FFE0,
U+FFE2, respectively.

And iconv(3) of glibc 2.2.4 cannot correctly convert these Unicode
characters into JIS X 0208 characters.

glibc's iconv(3) converts the original JIS X 0208 characters into Unicode
U+301C, U+2016, U+2212, U+00A2, U+00AC, respectively. This glibc
behavior is based on the mapping table distributed from unicode.org.
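
The mismatch can be written down as a small table. The sketch below contains only the five code points listed in this comment, with the Mozilla and glibc 2.2.4 values taken from the text; the struct and function names are invented for the illustration, and a return value of 0 simply means "no JIS X 0208 entry in this five-row table", not what a real iconv would return.

```cpp
#include <cstddef>
#include <cstdint>

struct Mapping {
    std::uint16_t jis;      // JIS X 0208 code point
    std::uint16_t mozilla;  // Unicode code point Mozilla decodes it to
    std::uint16_t glibc;    // Unicode code point glibc 2.2.4 iconv uses
};

// The five characters listed in this comment, in the same order.
const Mapping kAmbiguous[] = {
    {0x2141, 0xFF5E, 0x301C},
    {0x2142, 0x2225, 0x2016},
    {0x215D, 0xFF0D, 0x2212},
    {0x2171, 0xFFE0, 0x00A2},
    {0x224C, 0xFFE2, 0x00AC},
};
const std::size_t kAmbiguousCount = sizeof kAmbiguous / sizeof kAmbiguous[0];

// JIS X 0208 -> Unicode using the Mozilla column; 0 if not in the table.
std::uint16_t mozilla_decode(std::uint16_t jis)
{
    for (std::size_t i = 0; i < kAmbiguousCount; ++i)
        if (kAmbiguous[i].jis == jis)
            return kAmbiguous[i].mozilla;
    return 0;
}

// Unicode -> JIS X 0208 using the glibc column; 0 if not in the table.
std::uint16_t glibc_encode(std::uint16_t ucs)
{
    for (std::size_t i = 0; i < kAmbiguousCount; ++i)
        if (kAmbiguous[i].glibc == ucs)
            return kAmbiguous[i].jis;
    return 0;
}

// True if decoding with Mozilla and re-encoding with glibc gives back
// the original JIS X 0208 code point.
bool round_trip_survives(std::uint16_t jis)
{
    std::uint16_t ucs = mozilla_decode(jis);
    return ucs != 0 && glibc_encode(ucs) == jis;
}
```

For every row, round_trip_survives() is false: the Unicode value Mozilla produces never appears in the glibc column, which is why a title decoded by Mozilla and written back through iconv(3) does not return to the original bytes.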

This mapping table is now obsolete. See
http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/

You can also find more information about
ambiguities in conversion at
http://www.w3.org/TR/2000/NOTE-japanese-xml-20000414/

Comment 7 Takayuki KUSANO 2001-08-28 02:51:10 UTC
I've checked out the latest CVS galeon and am using it. The committed
patch seems to be working well.
So I am closing this bug. Thanks.
Comment 8 Takayuki KUSANO 2001-09-05 15:25:14 UTC
CVS galeon now holds bookmark strings in UTF-8.
With this change, the bookmark/history corruption appears again.

So I wrote 2 functions, mozilla_utf8_to_locale() and
mozilla_locale_to_utf8(), and replaced some utf8_to_locale() and
locale_to_utf8() calls with them.

(If all locale_to_utf8()/utf8_to_locale() calls are replaced with the
new functions, galeon does not run correctly. I have not yet examined why
this happens. It may be an XPCOM-related problem.)
Comment 9 Takayuki KUSANO 2001-09-05 15:26:21 UTC
Created attachment 5060
replace utf8_to_locale()/locale_to_utf8() with new functions
Comment 10 Philip Langdale 2001-09-06 19:12:02 UTC
I'm almost certain that the problem here is that mozilla is not
initialised when the bookmarks are first loaded; we'd have to change
the init order for this to work.
Comment 11 Yanko Kaneti 2001-09-09 18:48:15 UTC
looks like something we should get right before 1.0
Comment 12 Marco Pesenti Gritti 2001-09-10 22:28:15 UTC
I moved the xpcom startup, can you check whether your patch works now?
Thanks
Comment 13 Takayuki KUSANO 2001-09-11 18:36:24 UTC
I've just checked out the new CVS galeon and applied my patch, and it
seems to work fine. I also replaced more locale_to_utf8() and
utf8_to_locale() calls with the new functions and compiled.
This also works fine. I am attaching a new version of the patch.
Comment 14 Takayuki KUSANO 2001-09-11 18:37:34 UTC
Created attachment 5586
New version of locale <-> utf8 patch
Comment 15 Marco Pesenti Gritti 2001-09-12 19:55:06 UTC
committed, thanks a lot!
Comment 16 Ricardo Fernández Pascual 2001-09-20 18:29:03 UTC
I'm reopening this bug until we fix the problems that some people have
with bookmarks.
Comment 17 Ricardo Fernández Pascual 2001-09-20 18:30:45 UTC
*** Bug 60813 has been marked as a duplicate of this bug. ***
Comment 18 Takayuki KUSANO 2001-09-21 03:42:15 UTC
Created attachment 5647
Fix bookmark search related bugs.
Comment 19 Takayuki KUSANO 2001-10-03 15:05:42 UTC
I think the bookmark search related bug was fixed. So I am closing this.