Bug 77564 - 'binary warning' on loading of euc-kr
Status: RESOLVED FIXED
Product: gedit
Classification: Applications
Component: general
Version: unspecified
Hardware: Other other
Importance: Normal minor
Target Milestone: ---
Assigned To: Gedit maintainers
QA Contact: gedit QA volunteers
Depends on:
Blocks:
 
 
Reported: 2002-04-04 02:16 UTC by kz
Modified: 2010-05-05 10:59 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
ko.po in euc-kr. (115.96 KB, text/plain)
2002-04-04 10:02 UTC, kz
chat conversation saved from xchat. (4.28 KB, text/plain)
2002-06-26 13:41 UTC, Manuel Clos
broken EUC-KR text loaded in ISO-8859 encoding. (127.16 KB, image/png)
2002-07-15 12:16 UTC, kz

Description kz 2002-04-04 02:17:30 UTC
Package: gedit
Severity: enhancement
Version: 1.116.0
Synopsis: 'binary warning' on loading of euc-kr
Bugzilla-Product: gedit
Bugzilla-Component: general

Description:
"""
Could not open the file "/home/Keizi/src/xchat-20020314/po/ko.po"
because it contains invalid UTF-8 data.

Probably, you are trying to open a binary file.
"""

xchat 1.9 is being ported to gtk2, and .po files for gtk2 have to
contain UTF-8 rather than the native locale encoding.
So I tried to convert the file with gedit2: load it as EUC-KR and save
it as UTF-8. I believed gedit2 could do that.

But gedit2 failed to load the ko.po file.
I think refusing the user's command is unexpected behaviour.
What about adding a 'load anyway' feature?
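
The conversion described above (read EUC-KR, write UTF-8) can be sketched outside gedit as follows. This is only an illustration in Python, not what gedit does; the names `reencode_to_utf8` and `convert_file` are hypothetical, and the sketch assumes the input is valid EUC-KR throughout.

```python
from pathlib import Path


def reencode_to_utf8(raw, source_encoding="euc-kr"):
    """Decode bytes from source_encoding and return them re-encoded as UTF-8.

    Raises UnicodeDecodeError if the input is not valid in source_encoding.
    """
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src, dst, source_encoding="euc-kr"):
    """Hypothetical helper: convert the file at src and write the result to dst."""
    Path(dst).write_bytes(reencode_to_utf8(Path(src).read_bytes(), source_encoding))
```

A .po file also declares its charset in the header (the `Content-Type: ...; charset=` line), so that entry would need to be changed to UTF-8 after such a conversion.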




------- Bug moved to this database by unknown@bugzilla.gnome.org 2002-04-03 21:17 -------

Reassigning to the default owner of the component, maggi@athena.polito.it.

Comment 1 Paolo Maggi 2002-04-04 06:30:55 UTC
The problem here is that GtkTextView can only display valid UTF-8 text.
If the text file is not in UTF-8 format, then gedit tries to convert
it making the assumption that the file was written using the current
user locale.
I really cannot imagine another reliable way to solve this problem,
even if I know it is not perfect.
Comment 2 kz 2002-04-04 08:01:15 UTC
Yeah, I understand the complexity of converting between native
encodings and UTF-8. This is a really big problem for i18n, I think.

BTW, the file that I failed to load really is in EUC-KR only,
and my locale is ko_KR.eucKR. It's odd that gedit2 failed.
Comment 3 Paolo Maggi 2002-04-04 09:04:29 UTC
Really strange.
Please, attach the file you cannot open and your locale configuration.
Comment 4 kz 2002-04-04 10:02:08 UTC
Created attachment 7547
ko.po in euc-kr.
Comment 5 kz 2002-04-17 18:33:18 UTC
Well, the ko.po has a broken character.
EUC-KR uses 2-byte wide characters, so a stray 1-byte broken character
can sometimes end up in a file by accident.
So gedit's warning is right.

But I think gedit had better support 'load anyway'.
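
A 'load anyway' option could be approximated by decoding with replacement characters instead of refusing the file. The sketch below is an illustration in Python, not gedit code; `load_anyway` is a hypothetical name, and the sample bytes merely stand in for a file that contains one stray lead byte.

```python
def load_anyway(raw, encoding):
    """Decode strictly if possible; otherwise substitute U+FFFD for bad bytes.

    Returns (text, clean), where clean is False if any byte had to be replaced.
    """
    try:
        return raw.decode(encoding), True
    except UnicodeDecodeError:
        return raw.decode(encoding, errors="replace"), False


# Valid EUC-KR text followed by a stray lead byte that has no trail byte.
broken = "한국어".encode("euc-kr") + b"\xb1 abc"
text, clean = load_anyway(broken, "euc-kr")
print(clean, repr(text))
```

Returning the `clean` flag lets a caller warn the user that the displayed text is lossy, so that saving it back does not silently destroy the damaged bytes.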
Comment 6 Manuel Clos 2002-06-26 11:08:57 UTC
Hi,
Since yesterday's build (25/6/2002) from anoncvs, I'm using gnome2 as
dogfood. The problem is that I have seen this bug with two or three files.

gedit shows the same warning that "less" shows on the command line.
And when you read the file with less, it shows <E1> <ED> where it should
show accented characters.

Do you want some of these files?
Comment 7 Paolo Maggi 2002-06-26 13:27:47 UTC
Manuel: yes, please attach the files to this report.
Which locale are you using?
Comment 8 Manuel Clos 2002-06-26 13:41:04 UTC
llanero@llanero:~/gnome2$ echo $LOCALE

llanero@llanero:~/gnome2$ echo $LANG  
es_ES.ISO-8859-1
llanero@llanero:~/gnome2$ file louie.txt 
louie.txt: ISO-8859 English text

I tried:
export LOCALE=es

but it still happens.

Please, note that after I hit the OK button, gedit does *not* disable
all the icons in the toolbar, menus, ... (assertion `active_child !=
NULL' failed).
Comment 9 Manuel Clos 2002-06-26 13:41:54 UTC
Created attachment 9462
chat conversation saved from xchat.
Comment 10 Manuel Clos 2002-06-26 13:50:00 UTC
More info:

llanero@llanero:~/gnome2$ cat louie.txt 
--> Estás hablando ahora en #bugs

Can you read "Est'as" ?? Here it shows correctly, but in
gnome-terminal I don't see the "a"  + the tilde.
Comment 11 Manuel Clos 2002-06-26 13:54:06 UTC
See the screenshot at:

http://llanero.eresmas.net/bugs/Screenshot-Gnome-terminal.png

Kang: what distro are you using?

I think it is not gedit's fault, but some library or misconfiguration.
Comment 12 Manuel Clos 2002-06-26 14:24:31 UTC
gedit 0.9.7 loads the file but does not show the bad character:

http://llanero.eresmas.net/bugs/Screenshot-gedit.png

Note that I can write "á" in gedit2, save it and load the file again
with no problems.

The same file (I don't think mcopy changes the file :) in a redhat box
with ximian gnome2 shows the file correctly both with the text
component in nautilus and with gedit 1.121.1.

Back at my computer, the text view component in nautilus shows it
correctly!
(gedit still shows the message about UTF-8).

I will upgrade the Red Hat box to gedit2 shortly.

Please, test if the text component in nautilus works for you.
Comment 13 Manuel Clos 2002-07-07 23:51:53 UTC
Hi, I updated gnome2 in the RH7.2 box. It displays louie.txt correctly
both with the text component and gedit.

Also, I updated gedit on my box; gedit still tells me that the file is
not UTF-8, but the text component shows it.

What is the text component doing that gedit does not?
Comment 14 Paolo Maggi 2002-07-08 09:06:18 UTC
Please, update your gedit to current CVS HEAD and let me know.
Comment 15 Manuel Clos 2002-07-09 10:40:59 UTC
I waited a day to be sure that anoncvs had been updated. It still
happens here.
Comment 16 Manuel Clos 2002-07-15 11:39:43 UTC
Hi! I updated gedit from cvs today and the bug is gone.

I also tried with the other bug attachment and it also works.

Can the reporter please test this?
Comment 17 kz 2002-07-15 11:51:06 UTC
Attachment 9462 loads fine with the 0714 snapshot from Ximian Red Carpet
on RH 7.2.
Comment 18 Manuel Clos 2002-07-15 11:56:46 UTC
And what about attachment 7547?
Comment 19 kz 2002-07-15 12:15:22 UTC
7547 is broken: each EUC-KR multibyte character appears as separate
single bytes.
I think gedit failed to validate this file as EUC-KR and fell back to
ISO-8859-1.
I have no idea how to deal with a broken wide character like the one
7547 contains.
Comment 20 kz 2002-07-15 12:16:53 UTC
Created attachment 9870
broken EUC-KR text loaded in ISO-8859 encoding.
Comment 21 Paolo Maggi 2002-07-15 13:50:53 UTC
The algorithm used by gedit is:

1. Try to load the file as UTF-8
2. If it fails, load it using current locale encoding
3. If it fails, load it using ISO 8859-15
4. If it fails, display an error message.

Do you have a better idea?
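
The four steps above can be sketched as a simple fallback chain. This is a rough Python illustration under stated assumptions, not gedit's actual implementation; `decode_with_fallback` is a hypothetical name, and the `locale_encoding` parameter exists only so the locale can be simulated.

```python
import locale


def decode_with_fallback(raw, locale_encoding=None):
    """Try UTF-8, then the locale encoding, then ISO 8859-15.

    Returns (text, encoding_used); raises ValueError if every attempt fails.
    """
    candidates = [
        "utf-8",
        locale_encoding or locale.getpreferredencoding(False),
        "iso-8859-15",
    ]
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            continue  # wrong guess or unknown codec name: try the next one
    raise ValueError("could not decode the data with any candidate encoding")


# A broken EUC-KR file in a Korean locale: steps 1 and 2 fail, step 3 succeeds.
broken = "한국어".encode("euc-kr") + b"\xb1 abc"
text, used = decode_with_fallback(broken, locale_encoding="euc-kr")
print(used)
```

In Python the ISO 8859-15 codec accepts every byte value, so in this sketch step 3 always succeeds and step 4 is effectively unreachable. That matches the behaviour reported in comment 19, where the broken EUC-KR file was displayed as individual single-byte characters instead of being rejected.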
Comment 22 Andrew Sobala 2002-11-23 11:04:46 UTC
I can't load this file (ko.po) in 2.1 when I add "Korean" to the input
filters using preferences.
Comment 23 Paolo Maggi 2002-11-25 13:46:02 UTC
Andrew: could you please attach the file you are referring to?

If you are referring to the ko.po file already attached: it was broken,
and there is no way for gedit to display broken files (or binary files).
Comment 24 Andrew Sobala 2002-11-25 16:06:38 UTC
I was referring to the ko.po already attached.

Is this bug fixed now, with the new input encoding preferences etc?
Comment 25 Paolo Maggi 2002-11-25 16:11:20 UTC
I think it is fixed.

Closing
Comment 26 hey 2010-05-05 10:59:20 UTC
This still happens with .doc files and several other extensions. The dropdown menu for encoding does not help at all. Reported at https://bugs.edge.launchpad.net/gedit/+bug/575500