Bug 93751 - Problems with file names encoding
Status: RESOLVED NOTABUG
Product: glib
Classification: Platform
Component: general
Version: 2.0.x
Hardware: Other other
Importance: Normal normal
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Depends on:
Blocks:
Reported: 2002-09-20 11:29 UTC by Gonzalo Paniagua Javier
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: 2.0


Attachments
contains test case, makefile and a file whose name causes trouble (1.23 KB, application/octet-stream)
2002-09-20 11:30 UTC, Gonzalo Paniagua Javier

Description Gonzalo Paniagua Javier 2002-09-20 11:29:15 UTC
When I get a file name using g_dir_read_name (), if the name contains
characters above 127, g_print complains, printing [Invalid UTF-8].

Then I tried g_filename_to_utf8, but it returns NULL and the error is
the same: invalid UTF-8.

The documentation talks about "the encoding for filenames".

I also created a GString by adding the characters in the file name using
g_string_append_unichar. When I g_print it, there are no complaints, but
the character above 127 is not displayed correctly (I tried different
LC_ALL values such as C and es_ES, and also set the terminal to UTF-8
mode).
Comment 1 Gonzalo Paniagua Javier 2002-09-20 11:30:38 UTC
Created attachment 11186
contains test case, makefile and a file whose name causes trouble
Comment 2 Gonzalo Paniagua Javier 2002-09-20 11:31:27 UTC
Ah, I forgot to say that I'm using glib 2.0.6 (Debian sid).
Comment 3 Matthias Clasen 2002-09-20 11:38:27 UTC
Have you tried setting the G_BROKEN_FILENAMES environment variable?
See http://developer.gnome.org/doc/API/2.0/glib/glib-running.html
Comment 4 Havoc Pennington 2002-09-20 12:49:27 UTC
g_print only accepts UTF-8, you must ensure anything passed to it is
UTF-8.
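
To illustrate why the file name in this report is rejected: in ISO-8859-1 the letter í is the single byte 0xED, which in UTF-8 announces a three-byte sequence, so the ASCII byte that follows it makes the string invalid. The following is a minimal sketch of such a check in plain C. The helper name `is_valid_utf8` is hypothetical and this is not GLib's implementation; real code would normally call g_utf8_validate() instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of GLib): returns true if the
 * NUL-terminated string s is well-formed UTF-8. It rejects stray
 * continuation bytes, truncated sequences, overlong encodings,
 * surrogates and code points above U+10FFFF. */
bool is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned int cp;   /* code point being decoded */
        unsigned int min;  /* smallest code point allowed for this length */
        int extra;         /* continuation bytes still expected */

        if (*p < 0x80) {
            cp = *p; extra = 0; min = 0;
        } else if ((*p & 0xE0) == 0xC0) {
            cp = *p & 0x1F; extra = 1; min = 0x80;
        } else if ((*p & 0xF0) == 0xE0) {
            cp = *p & 0x0F; extra = 2; min = 0x800;
        } else if ((*p & 0xF8) == 0xF0) {
            cp = *p & 0x07; extra = 3; min = 0x10000;
        } else {
            return false;  /* stray continuation or invalid lead byte */
        }
        p++;

        for (; extra > 0; extra--, p++) {
            /* A NUL byte fails this test too, so truncation is caught. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}
```

Calling it with the Latin-1 bytes "El Calcet\xEDn.mp3" returns false, while the UTF-8 bytes "El Calcet\xC3\xADn.mp3" return true.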

If you have non-UTF-8 filenames, then you have to set
G_BROKEN_FILENAMES, and run in a locale that reflects your filenames
(such as en_US.ISO-8859-1). Then g_filename_to_utf8() will convert the
filenames.

g_string_append_unichar() works because Latin-1 has the same numeric
values as Unicode, so you are converting from Latin-1 to Unicode when
you append each Latin-1 char as if it were a Unicode char. But this
won't work in any locale other than Latin-1.
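
The Latin-1 shortcut described above can be sketched in plain C: every ISO-8859-1 byte value is also its Unicode code point, so each byte is simply encoded as that code point in UTF-8. The helper name `latin1_to_utf8` is hypothetical and not a GLib function. As the comment above says, this only gives correct results when the input really is Latin-1.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of GLib): converts a NUL-terminated
 * ISO-8859-1 string to a newly allocated UTF-8 string. The caller
 * frees the result. Returns NULL if allocation fails. */
char *latin1_to_utf8(const char *latin1)
{
    /* Each Latin-1 byte needs at most two UTF-8 bytes. */
    char *out = malloc(2 * strlen(latin1) + 1);
    char *q = out;

    if (out == NULL)
        return NULL;

    for (const unsigned char *p = (const unsigned char *)latin1; *p; p++) {
        if (*p < 0x80) {
            /* ASCII is identical in both encodings. */
            *q++ = (char)*p;
        } else {
            /* Code points U+0080..U+00FF take two bytes: 110xxxxx 10xxxxxx. */
            *q++ = (char)(0xC0 | (*p >> 6));
            *q++ = (char)(0x80 | (*p & 0x3F));
        }
    }
    *q = '\0';
    return out;
}
```

For the byte 0xED (í in Latin-1) this produces the two bytes 0xC3 0xAD, which is the UTF-8 encoding of U+00ED.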
Comment 5 Gonzalo Paniagua Javier 2002-09-20 13:09:48 UTC
(My aim is to use g_utf8_to_utf16. That's why I need the file name to
be UTF-8. Using g_print is just a way to show what's happening.)

I use Linux and have not done anything to change the encoding of the
file names.

gpanjav@lalo2:~/gnome2/testfile$ echo $LC_ALL
es_ES
gpanjav@lalo2:~/gnome2/testfile$ echo $G_BROKEN_FILENAMES

gpanjav@lalo2:~/gnome2/testfile$ ./testfile
[Invalid UTF-8] GOMAESPUMA - El Calcetín.mp3
(null)

Now using es_ES.ISO-8859-1 and setting G_BROKEN_FILENAMES:

gpanjav@lalo2:~/gnome2/testfile$ export LC_ALL=es_ES.ISO-8859-1
gpanjav@lalo2:~/gnome2/testfile$ export G_BROKEN_FILENAMES=1
gpanjav@lalo2:~/gnome2/testfile$ ./testfile
[Invalid UTF-8] GOMAESPUMA - El Calcetín.mp3
(null)

And the same happens when using en_US.ISO-8859-1.

Am I doing anything wrong?

Comment 6 Havoc Pennington 2002-09-20 13:39:15 UTC
Does your test app call setlocale()? 

(I'll look at it later)
Comment 7 Gonzalo Paniagua Javier 2002-09-20 13:43:04 UTC
Nope! But I have just tried calling setlocale (""); and it doesn't
work either.

Thanks for the quick replies!
Comment 8 Matthias Clasen 2002-09-20 13:57:37 UTC
Shouldn't that be setlocale (LC_ALL, "")?
You should verify that g_get_charset () indeed returns
ISO-8859-1 after the setlocale call.
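
A minimal sketch of that check in plain C, assuming a POSIX system: setlocale() takes the category as its first argument and an empty string as the second to pick up LC_ALL and LANG from the environment. The sketch uses nl_langinfo(CODESET) as a stand-in for g_get_charset(), and the helper name `locale_codeset` is hypothetical.

```c
#include <langinfo.h>
#include <locale.h>

/* Hypothetical helper: adopts the locale from the environment and
 * returns the name of its character set, e.g. "ISO-8859-1" or "UTF-8".
 * If the requested locale is not installed, setlocale() returns NULL,
 * the program stays in the "C" locale, and the codeset of the "C"
 * locale is reported instead. */
const char *locale_codeset(void)
{
    /* Without this call the program keeps running in the "C" locale,
     * regardless of LC_ALL. */
    setlocale(LC_ALL, "");
    return nl_langinfo(CODESET);
}
```

In a GLib program the equivalent check would be to call g_get_charset() after setlocale (LC_ALL, "") and print the charset name it stores through its argument.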
Comment 9 Gonzalo Paniagua Javier 2002-09-20 14:27:00 UTC
Yeah! Now it works when I set G_BROKEN_FILENAMES (and use the proper
call to setlocale, as Matthias pointed out).

So I guess I will first try g_filename_to_utf8 and, if it fails, I will
setenv G_BROKEN_FILENAMES (if not already set) and try again. Or is
there any other way to do it?

Thanks!

(I'm leaving the bug open so you can both see it, in case you want to
answer my question before closing it :-)
Comment 10 Gonzalo Paniagua Javier 2002-09-20 14:42:46 UTC
Oh, crap. setenv does not work because 'broken' is only set on the first
call to have_broken_filenames! Any hints?
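
The behaviour described here follows from a read-once caching pattern, which can be sketched in plain C as below. The function name `broken_filenames_cached` is hypothetical and the code is an illustration of the pattern, not a copy of GLib's source.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical illustration: the environment variable is consulted on
 * the first call only, and the answer is remembered in static storage.
 * Changing the environment afterwards has no effect on later calls. */
bool broken_filenames_cached(void)
{
    static bool initialized = false;
    static bool broken = false;

    if (!initialized) {
        broken = getenv("G_BROKEN_FILENAMES") != NULL;
        initialized = true;
    }
    return broken;
}
```

If the variable is unset when the function is first called, a later setenv() in the same process changes what getenv() returns but not what this function returns.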
Comment 11 Matthias Clasen 2002-09-20 14:50:16 UTC
If your filenames are broken, that is a property of your local
environment. Thus you should put G_BROKEN_FILENAMES=1 in your
environment. Apps should not set it, only read it.