GNOME Bugzilla – Bug 94672
Filedialog does not convert filenames correctly.
Last modified: 2004-12-22 21:47:04 UTC
When listing a directory whose filenames contain non-ASCII characters (ISO-8859-1 in this case), the file selection box prints this on stderr:

Gtk-Message: [Invalid UTF-8] Filnamnet "böcker" kunde inte konverteras till UTF-8 (prova att ställa in miljövariabeln G_BROKEN_FILENAMES): Ogiltig bytesekvens i konverteringsindata

Untranslated:

Gtk-Message: [Invalid UTF-8] The filename "böcker" couldn't be converted to UTF-8 (try setting the environment variable G_BROKEN_FILENAMES): Invalid byte sequence in conversion input

My locale settings clearly state that I am using ISO-8859-1, so Gtk+ should be able to convert from it. Worse, the file name in question is not present in the file selection box and can't be chosen.

As a side note, messages like the above would be better off converted to the current locale's encoding before being written. Some GNU tools do this.

The name G_BROKEN_FILENAMES is not very descriptive; it suggests that my filenames are broken. There is no standard mandating any character set for filenames.
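The failure reported above can be reproduced outside GTK. The following is only an illustrative Python sketch, not GTK or GLib code; the helper name `is_valid_utf8` is made up for this example. It shows that the ISO-8859-1 bytes for "böcker" are not a valid UTF-8 sequence, which is why a strict UTF-8 conversion rejects the name.

```python
# Illustrative sketch only; is_valid_utf8 is a hypothetical helper, not a GLib API.
name = "böcker"

latin1_bytes = name.encode("iso-8859-1")  # b'b\xf6cker': "ö" is the single byte 0xF6
utf8_bytes = name.encode("utf-8")         # b'b\xc3\xb6cker': "ö" is two bytes


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(latin1_bytes, is_valid_utf8(latin1_bytes))
    print(utf8_bytes, is_valid_utf8(utf8_bytes))
```

The byte 0xF6 is not followed by UTF-8 continuation bytes, so a decoder that assumes UTF-8 has no valid way to interpret the ISO-8859-1 name.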
Please set G_BROKEN_FILENAMES.

We consider locale-dependent filenames to be "broken" because:

- Multiple users may have different locale needs, but can't share a filesystem if the filenames are locale-dependent.
- If a single user needs to switch to a different locale temporarily, all their filenames get reinterpreted.
- Tar files with such filenames in them aren't portable.

Etc. I'm sorry if you don't like the name :-), but it will do what you want.

GLib does convert messages to the current locale's encoding when printing them. The only reason I can think of for them not being converted is if you have GLib-2.0.0; it has done the conversion in all subsequent releases.
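To illustrate the "reinterpreted" point, here is a small Python sketch (not GLib code). The choice of ISO-8859-5 as the second locale encoding is an assumption made purely for the example. The same on-disk bytes produce different names depending on which locale encoding is used to read them.

```python
# Illustrative sketch; ISO-8859-5 is an arbitrary second encoding chosen for the example.
raw = "böcker".encode("iso-8859-1")  # bytes as stored by a user in an ISO-8859-1 locale

as_latin1 = raw.decode("iso-8859-1")    # what the original user sees
as_cyrillic = raw.decode("iso-8859-5")  # what a user in an ISO-8859-5 locale would see

if __name__ == "__main__":
    print(as_latin1)
    print(as_cyrillic)
    print(as_latin1 == as_cyrillic)
```

A filename stored as UTF-8 does not have this problem as long as every reader agrees to decode it as UTF-8, regardless of locale.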
Well, from a philosophical view I guess that makes sense. But from a practical view, there are problems.

First, setting an environment variable is too late: many applications are started from panels or window managers, and to make an environment variable take effect you may have to restart the whole window system. Also, stderr for applications started this way is not shown in a GUI anyway, so the unsuspecting user does not even know that there was an error.

It is really hard to explain to a user why GTK is correct when every other application out there that isn't linked to GTK can show and read these files just fine, whereas in the Gtk file dialog these files do not even show up, incorrectly named or not (compare for example Mac OS X, which displays the filenames, but with funny characters). It is very bad that files are silently not shown.

The very least a file dialog should do is try its best to show all files. For example, it could test for UTF-8 filenames and, if that fails, convert from the current locale. If that fails too, it should display the name anyway (perhaps replacing the offending characters with ? or something). That behaviour can be explained to a user, by mumbling about locales and charsets.

By converting from the current locale if the filename isn't UTF-8, one could get two identical names in the file list. But this situation already exists with the G_BROKEN_FILENAMES variable set. It is a pity that the file dialog doesn't have the original string around; in that case a user could try to open both of the duplicates to see which is which.

Would a change like this be accepted if implemented?

About GLib, I'm running 2.0.4; I've filed a separate bug for that.
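The fallback proposed above can be sketched as follows. This is a minimal Python illustration of the idea, not a patch against GTK; the names `display_name` and `list_entries` are hypothetical, and the locale encoding is passed in explicitly rather than read from the environment.

```python
# Minimal sketch of the proposed fallback; display_name and list_entries are
# hypothetical names, not GTK or GLib functions.

def display_name(raw: bytes, locale_encoding: str) -> str:
    """Best-effort conversion of an on-disk filename to a displayable string."""
    # Step 1: accept the name as-is if it is valid UTF-8.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # Step 2: otherwise try the current locale's encoding.
    try:
        return raw.decode(locale_encoding)
    except (UnicodeDecodeError, LookupError):
        pass
    # Step 3: show the file anyway, with "?" for each byte that cannot be decoded.
    return raw.decode("ascii", errors="replace").replace("\ufffd", "?")


def list_entries(raw_names, locale_encoding):
    """Pair each display name with its original bytes so duplicates stay distinguishable."""
    return [(display_name(raw, locale_encoding), raw) for raw in raw_names]


# Two different files on disk that end up with the same display name.
entries = list_entries(
    ["böcker".encode("utf-8"), "böcker".encode("iso-8859-1")],
    "iso-8859-1",
)

if __name__ == "__main__":
    for shown, raw in entries:
        print(shown, raw)
    print(display_name(b"b\xf6cker", "ascii"))
```

Keeping the original bytes next to each display name is what would let a user open both duplicates to see which is which. Note that with a single-byte locale encoding such as ISO-8859-1 the second step never fails, so the "?" replacement only comes into play for stricter encodings.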