Bug 94672 - Filedialog does not convert filenames correctly.
Status: RESOLVED NOTABUG
Product: gtk+
Classification: Platform
Component: Widget: GtkFileChooser
Version: 2.0.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: gtk-bugs
QA Contact: gtk-bugs
Depends on:
Blocks:
Reported: 2002-10-02 11:41 UTC by Jan D.
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: 2.0



Description Jan D. 2002-10-02 11:41:02 UTC
When listing a directory where filenames have non-ASCII characters
(ISO-8859-1 in this case), the file selection box produces this on stderr:

Gtk-Message: [Invalid UTF-8] Filnamnet "böcker" kunde inte konverteras till
UTF-8 (prova att ställa in miljövariabeln G_BROKEN_FILENAMES): Ogiltig
bytesekvens i konverteringsindata

Untranslated:
Gtk-Message: [Invalid UTF-8] The filename "böcker" couldn't be converted to
UTF-8 (try setting the environment variable G_BROKEN_FILENAMES): Invalid
byte sequence in conversion input

My locale settings clearly state that I am using ISO-8859-1, so GTK+ should
be able to convert from it.  Worse, the file name in question is not
present in the file selection box and can't be chosen.

As a side note, messages like the above would be better off converted to
the current locale before being written.  Some GNU tools do this.  The name
G_BROKEN_FILENAMES is not very descriptive; it suggests that my filenames
are broken.  There is no standard mandating any character set for filenames.
Comment 1 Owen Taylor 2002-10-02 11:49:04 UTC
Please set G_BROKEN_FILENAMES 

We consider locale-dependent filenames to be "broken" because:

  - Multiple users may have different locale needs, but
    can't share a filesystem if the filenames are locale-dependent.
  - If a single user needs to switch to a different locale
    temporarily, all their filenames get reinterpreted.
  - Tar files with such filenames in them aren't portable.

Etc.
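
The reinterpretation problem in the list above can be illustrated with a minimal sketch. It is not GTK+ or GLib code; the choice of KOI8-R as the "other" locale and the helper name `is_valid_utf8` are assumptions made for illustration only.

```python
# Illustration only: the same on-disk bytes read under different locales.
# "koi8_r" is an arbitrary second locale encoding picked for this sketch.

raw = b"b\xf6cker"  # "böcker" as written by a user in an ISO-8859-1 locale

as_latin1 = raw.decode("iso-8859-1")  # what the original user sees
as_koi8r = raw.decode("koi8_r")       # what a KOI8-R user sees for the same bytes


def is_valid_utf8(name: bytes) -> bool:
    """Hypothetical helper: True if the raw filename bytes decode as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(repr(as_latin1), repr(as_koi8r), is_valid_utf8(raw))
```

The bytes never change, yet the displayed name depends on who is looking, and the name is not valid UTF-8 at all, which is what triggers the message in the report.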

I'm sorry if you don't like the name :-), but it will do what
you want.

GLib does convert messages to the current locale's encoding
when printing them. The only reason I can think of for them
not being converted is if you have GLib-2.0.0 - it has
done the conversion in all subsequent releases.
Comment 2 Jan D. 2002-10-02 15:16:19 UTC
Well, from a philosophical view I guess that makes sense.  But from a
practical view, there are problems.

First, setting an environment variable is too late: many applications
are started from panels/window managers, and to make an environment
variable take effect, you may have to restart the whole window system.

Also, stderr for applications started this way is not shown in a GUI
anyway, so the unsuspecting user does not even know that there was an
error.

It is really hard to explain to a user why GTK is correct when every
other application out there that isn't linked to GTK can show and read
these files just fine, whereas in the GTK file dialog, these files
do not even show up, incorrectly named or not (compare for example
Mac OS X, which displays the filenames, but with funny characters).

It is very bad that files are silently not shown.  The very least a
file dialog should do is try its best to show all files.  For example,
it could test for UTF-8 filenames and if that fails, convert from the
current locale.  If that fails, it should display it anyway (perhaps
replacing the offending characters with ? or something).  That
behaviour can be explained to a user, by mumbling about locales and
charsets.
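
The fallback order proposed above (UTF-8 first, then the current locale, then display with replacement characters) could look roughly like the following sketch. It is written in Python rather than against the GTK+ API, and the function name `display_name` and its `locale_encoding` parameter are hypothetical.

```python
import locale
from typing import Optional


def display_name(raw: bytes, locale_encoding: Optional[str] = None) -> str:
    """Return something displayable for a filename; never drop the file."""
    # 1. Try UTF-8 first.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # 2. Fall back to the current locale's encoding (or the one passed in).
    encoding = locale_encoding or locale.getpreferredencoding(False)
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        pass

    # 3. Last resort: keep the ASCII part and show "?" for offending bytes.
    return raw.decode("ascii", errors="replace").replace("\ufffd", "?")


print(display_name(b"b\xf6cker", "iso-8859-1"))
print(display_name(b"b\xf6cker", "ascii"))
```

With this order a file is always listed, even if its name is only partly readable.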

By converting from the current locale if the filename isn't UTF-8, one
could get two identical names in the file list.  But this situation
already exists with the G_BROKEN_FILENAMES variable set.  It is a pity
that the file dialog doesn't keep the original string around; if it did,
a user could try to open both of the duplicates to see which is
which.
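
The duplicate-name situation, and the idea of keeping the original string around, can be sketched as follows. This is an illustration under assumptions, not the file dialog's actual data structure; `build_listing` is a hypothetical name and ISO-8859-1 stands in for the current locale.

```python
def build_listing(raw_names, encoding="iso-8859-1"):
    """Pair each display string with the untouched on-disk bytes."""
    listing = []
    for raw in raw_names:
        try:
            shown = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Assumed locale encoding; undecodable bytes are replaced.
            shown = raw.decode(encoding, errors="replace")
        listing.append((shown, raw))
    return listing


# One name stored as UTF-8 and one as ISO-8859-1 in the same directory.
entries = build_listing([b"b\xc3\xb6cker", b"b\xf6cker"])
for shown, raw in entries:
    print(shown, raw)
```

Both entries display identically, but because the raw bytes are kept next to the display string, selecting either row still opens the right file.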

Would a change like this be accepted if implemented? 

About GLib: I'm running 2.0.4, and I've filed a separate bug for that.