GNOME Bugzilla – Bug 142138
Gtk::FileChooser::get_filename() should return std::string
Last modified: 2009-05-01 10:47:01 UTC
As stated in the GTK+ documentation, Gtk::FileChooser::get_filename() returns a string in the local filename encoding and thus should not return Glib::ustring.
Can people still use the conversion function on the result? If so, then we need to document it, telling them to put it in a std::string if necessary.
Yes.
Gtk::FileChooser::get_filename() calls Glib::convert_return_gchar_ptr_to_ustring on the return value of gtk_file_chooser_get_filename(), so it correctly returns Glib::ustring. Changing it now would break API.
Glib::convert_return_gchar_ptr_to_ustring() doesn't do any character set conversion; it's just a tiny inline function that handles the 0 pointer case. Note that I'm not proposing to change it now -- although the API break is more of a theoretical nature, it definitely would break the ABI and that's what matters.
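For readers unfamiliar with that helper, here is a minimal sketch of what "handles the 0 pointer case" means. It is not the glibmm source: the name c_str_or_empty() is hypothetical, it uses std::string so that it compiles without glibmm, and it leaves out ownership of the buffer returned by GTK+. The point is only that no character set conversion takes place.

```cpp
#include <string>

// Hypothetical stand-in for a "null-safe" wrapper around a C string return
// value. The bytes are copied unchanged; nothing is converted or validated.
inline std::string c_str_or_empty(const char* str)
{
  // A null pointer (e.g. no file selected) becomes an empty string.
  return str ? std::string(str) : std::string();
}
```

Whatever encoding the C function used is therefore the encoding of the resulting string object, regardless of the string class it is stored in.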
Gtk::FileChooser::get_filenames() returns a container of Glib::ustring objects (Glib::SListHandle<Glib::ustring>). Should it in fact return a container of std::string objects (as does Gtk::FileSelection::get_selection())? It would be awkward if Gtk::FileChooser::get_filename() returned std::string and Gtk::FileChooser::get_filenames() returned a container of Glib::ustring.
Exactly, it should return a container of std::string just like the old FileSelection did. Unfortunately the GTK+ documentation is a bit vague on the other FileChooser methods that return URIs or folder names etc. I still have to investigate those.
For what it is worth, I see that gtk_file_chooser_get_filenames() derives its return value from file_paths_to_strings(), and file_paths_to_strings() is passed a pointer to function gtk_file_system_path_to_filename() as its third parameter. As its name suggests, gtk_file_system_path_to_filename() converts a GtkFilePath object to a gchar* string, but I do not know enough about the GTK+ internals to know what codeset the resulting string is in. gtk_file_chooser_get_filename() also calls gtk_file_system_path_to_filename() to obtain its return value. As you say, it appears therefore that either both Gtk::FileChooser::get_filename() and Gtk::FileChooser::get_filenames() incorrectly specify Glib::ustring as their return type/contained return type (which appears to be the case), or both specify it correctly, depending on what gtk_file_system_path_to_filename() delivers. GtkFileSelection doesn't appear to use gtk_file_system_path_to_filename() at all. This is a nuisance: I have just converted some code to expect a Glib::ustring container instead of the old std::string container for Gtk::FileSelection.
I have just noticed on reviewing some code of mine that there is the same problem with Gtk::IconInfo::get_filename(). Does this need to be filed as a separate bug? (Probably, however, this is a problem that is going to be repeated elsewhere in gtkmm-2.4 for any GTK+ file functions which return a gchar* string.) In gtkmm-2.4.2, I wonder if it would be a good idea to actually convert to UTF-8 in all these methods so that they do return a valid UTF-8 encoding. It would avoid breaking ABI (which would occur by changing the return type to std::string or std::string containers). At present these filename functions are unusable with any file system whose filenames contain non-ASCII characters in a codeset other than UTF-8, so it wouldn't break API: applying Glib conversion functions to the Glib::ustrings returned by them after the event won't be much good, as the Glib::ustrings will already be in an error state if the input wasn't valid UTF-8.
No, doing that would break ABI. The ABI includes the behavior of the library functions too, not just the ability to link with any library of the same ABI version. Also, applying Glib::filename_to_utf8() and Glib::filename_from_utf8() *does* work regardless of the fact that the return type should be std::string for clarity. A Glib::ustring can contain invalid UTF-8 (or else validate() would be useless) -- you just aren't allowed to use any methods apart from validate(), is_ascii(), bytes(), empty(), raw(), operator std::string(), c_str(), data() and the copy constructor. Note that there is a reason why the Gtk::FileChooser interface operates directly in the filesystem encoding: many users still use some legacy character set for filenames but aren't aware of the G_FILENAME_ENCODING environment variable. Worse yet, they might be using UTF-8 encoded filenames on their system partition without having the utf8 mount option specified in /etc/fstab for floppy and cdrom drives (even I didn't use that flag until very recently -- simply because I wasn't aware of it). Thus, the app must be able to cope with invalid byte sequences in filenames. Those bytes of course have to be escaped in order to display the filename, but that shouldn't stop the app from working. Last but not least, I don't think gtkmm should change what is clearly a design decision in GTK+ (apart from language-specific considerations of course). What we need is a prominent NOTE in the documentation of each and every Gtk::FileChooser function that takes or returns filenames (it appears that all of them are affected). This note should explain the fact that the gtkmm API is broken and mistakenly uses Glib::ustring where it shouldn't. It should also mention that you have to call Glib::filename_to_utf8() before displaying the filename, and that Glib::filename_from_utf8() has to be used before a filename obtained from Gtk::Entry or the like is passed to Gtk::FileChooser.
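To illustrate the "escape invalid bytes for display" point, here is a rough sketch in plain C++ with no GLib dependency. It is not what Glib::filename_to_utf8() does (that function converts from the filename encoding, which this sketch does not attempt). The function names and the \xNN escape style are my own assumptions for illustration. It keeps well-formed UTF-8 sequences as they are and replaces every other byte with a visible escape, so the raw filename can stay untouched for file operations while a display copy is shown to the user.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Returns the length (1-4) of a well-formed UTF-8 sequence starting at s[i],
// or 0 if the bytes at that position are not valid UTF-8.
static std::size_t utf8_sequence_length(const std::string& s, std::size_t i)
{
  const unsigned char lead = static_cast<unsigned char>(s[i]);
  std::size_t len = 0;
  unsigned long cp = 0;      // decoded code point
  unsigned long min_cp = 0;  // smallest code point allowed for this length

  if (lead < 0x80) return 1; // plain ASCII
  else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; min_cp = 0x80; }
  else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; min_cp = 0x800; }
  else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; min_cp = 0x10000; }
  else return 0;             // stray continuation byte or invalid lead byte

  if (i + len > s.size()) return 0; // truncated sequence

  for (std::size_t k = 1; k < len; ++k) {
    const unsigned char cont = static_cast<unsigned char>(s[i + k]);
    if ((cont & 0xC0) != 0x80) return 0; // not a continuation byte
    cp = (cp << 6) | (cont & 0x3F);
  }

  // Reject overlong encodings, surrogates and values beyond U+10FFFF.
  if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
  return len;
}

// Hypothetical helper: builds a displayable copy of a raw filename.
// Valid UTF-8 is copied through; any other byte becomes "\xNN".
std::string escape_for_display(const std::string& raw)
{
  std::string out;
  std::size_t i = 0;
  while (i < raw.size()) {
    const std::size_t len = utf8_sequence_length(raw, i);
    if (len == 0) {
      char buf[5]; // backslash, 'x', two hex digits, terminating NUL
      std::snprintf(buf, sizeof buf, "\\x%02X",
                    static_cast<unsigned>(static_cast<unsigned char>(raw[i])));
      out += buf;
      ++i;
    } else {
      out.append(raw, i, len);
      i += len;
    }
  }
  return out;
}
```

With this sketch, a Latin-1 filename such as the bytes "caf\xE9.txt" would be displayed as caf\xE9.txt (with a literal backslash), while the original bytes remain available for opening the file.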
As a start, I added a paragraph about the problem plus a link to this bug to the class documentation of Gtk::FileChooser:
2004-06-04 Daniel Elstner <daniel.elstner@gmx.net>
* gtk/src/filechooser.hg (FileChooser): Copy the class docs from GTK+. Also explain the fact that the API is broken and how to use it correctly despite this problem (bug #142138).
Fixed in the gtkmm-3maybe branch.