GNOME Bugzilla – Bug 69059
local filenames treated as UTF-8, but I want them treated as the local character set
Last modified: 2004-12-22 21:47:04 UTC
The current local charset -> UTF-8 conversion routine converts all non-ASCII characters to '?'. For example, a folder with my name ('Håvard', where the second character is the HTML entity &aring;) shows as 'H?vard'.
Created attachment 6440: Patch that handles UTF conversion
The code currently assumes that filenames are in UTF-8. The question marks are only used when a character is not valid UTF-8. It's not true that all non-ASCII characters turn into question marks. Your patch changes it to treat all filenames as if they are in the local character set. But this is clearly wrong for paths that aren't on the local machine. And for the local machine itself, the GLib default is to assume that filenames are UTF-8 unless G_BROKEN_FILENAMES is set. I'll look into changing this to respect G_BROKEN_FILENAMES, but I'm not going to take this patch. (Another problem with the patch is that make_valid_ascii is the wrong name for the old function, which handles non-ASCII UTF-8 just fine. And the patch leaves us with an unused make_valid_ascii function.)
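To make the behaviour described above concrete, here is a minimal Python sketch of "assume UTF-8, substitute '?' only for bytes that are not valid UTF-8". It is not the Nautilus code (which is C); the handler name "question-mark" and the function name display_name_assuming_utf8 are made up for illustration.

```python
import codecs


def _question_mark(err):
    # Hypothetical decode error handler: emit one '?' per undecodable
    # byte, then resume decoding right after the bad span.
    if not isinstance(err, UnicodeDecodeError):
        raise err
    return "?" * (err.end - err.start), err.end


codecs.register_error("question-mark", _question_mark)


def display_name_assuming_utf8(raw):
    """Decode raw filename bytes as UTF-8; invalid bytes become '?'."""
    return raw.decode("utf-8", errors="question-mark")


# "H\u00e5vard" is 'Håvard'. Stored in Latin-1 it is b"H\xe5vard",
# which is not valid UTF-8; stored in UTF-8 it is b"H\xc3\xa5vard".
latin1_name = "H\u00e5vard".encode("latin-1")
utf8_name = "H\u00e5vard".encode("utf-8")

shown_latin1 = display_name_assuming_utf8(latin1_name)  # 'H?vard'
shown_utf8 = display_name_assuming_utf8(utf8_name)      # 'Håvard' unchanged
```

Under these assumptions only the Latin-1 spelling of the name degrades to 'H?vard'; a name that is already UTF-8 passes through untouched, non-ASCII characters included.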
I'm pretty firm in my opinion that putting locale-dependent filenames on the hard drive is wrong. But since I know that some people will disagree, or will need to use locale-dependent filenames, that is the reason for G_BROKEN_FILENAMES.
- It's an environment variable because that's the only way we have of doing configuration at the GLib level that will apply to all programs using GLib.
- It's called "G_BROKEN_FILENAMES" because I want to editorialize that it is a bad thing to have such filenames.
If you don't run in a UTF-8 locale, you should restrict your filenames to ASCII. If all GNOME programs use g_filename_to_utf8() then user configuration of this will work properly.
(Note that I suspect that nautilus should internally manipulate filenames in the on-disk encoding, and only convert for display/interchange. Otherwise it will be impossible to manipulate files with invalid filenames, like, say, rename them to something valid.)
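A rough Python model of the two ideas in this comment, offered as a sketch only: the environment variable selects the charset used for conversion, and the on-disk bytes are kept as the real identity of the file, with conversion happening only for display. This is not GLib's implementation; filename_charset and FileEntry are hypothetical names, and treating the variable as "set" whenever it is present in the environment is an assumption.

```python
import os


def filename_charset(locale_charset, environ=None):
    """Return the charset assumed for on-disk names.

    Assumption: locale charset if G_BROKEN_FILENAMES is present in the
    environment, UTF-8 otherwise.
    """
    env = os.environ if environ is None else environ
    if "G_BROKEN_FILENAMES" in env:
        return locale_charset
    return "utf-8"


class FileEntry:
    """Hypothetical entry that keeps the on-disk bytes untouched."""

    def __init__(self, raw, locale_charset, environ=None):
        self.raw = raw  # use this for open/rename/delete
        self._charset = filename_charset(locale_charset, environ)

    def display_name(self):
        # Lossy conversion is acceptable here because it is never
        # written back to disk; undecodable bytes become U+FFFD.
        return self.raw.decode(self._charset, errors="replace")


entry = FileEntry(b"H\xe5vard", "iso-8859-1", {"G_BROKEN_FILENAMES": "1"})
label = entry.display_name()  # decoded with the locale charset
```

Because entry.raw is never replaced by the converted string, a file whose name cannot be decoded can still be addressed and renamed to something valid.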
Owen suggested in an email that we fall back to the locale character set whenever the filename is not valid UTF-8. I plan to take his advice on that. This leaves a separate question of how to know when to use the locale character set when renaming files.
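The fallback Owen suggested could look roughly like the Python sketch below: try UTF-8 first, and only if the bytes are not valid UTF-8, try the locale character set. The function name filename_to_display, the ISO-8859-1 default, and the final U+FFFD replacement step are assumptions for illustration, not what was actually committed.

```python
def filename_to_display(raw, locale_charset="iso-8859-1"):
    """Decode filename bytes for display.

    1. Valid UTF-8 is taken as UTF-8.
    2. Otherwise the locale charset is tried.
    3. If that also fails, invalid bytes are replaced with U+FFFD.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    try:
        return raw.decode(locale_charset)
    except (UnicodeDecodeError, LookupError):
        # LookupError covers an unknown charset name.
        return raw.decode("utf-8", errors="replace")


# b"H\xe5vard" is 'Håvard' in Latin-1 and is not valid UTF-8,
# so the locale charset is used for it.
from_latin1 = filename_to_display(b"H\xe5vard")
# b"H\xc3\xa5vard" is valid UTF-8, so the locale charset is never consulted.
from_utf8 = filename_to_display(b"H\xc3\xa5vard")
```

Both calls yield the same display string. The sketch only covers reading names; it does not say which encoding a renamed file should be written in, which is the separate question raised above.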
Adding GNOME2 keyword to bugs with 1.1.x milestone.
Removing GNOME2 keyword from a bunch I accidentally marked incorrectly.
I'd suggest that this isn't major, as the problem is with users doing broken things, and the workaround is straightforward: 'don't be broken on your filesystem.'
It's never broken to save files with correctly spelled names, no matter the circumstances.