GNOME Bugzilla – Bug 554558
gvfs ftp backend has no means to specify encoding for remote filenames
Last modified: 2018-09-21 16:27:22 UTC
Please describe the problem:
The FTP protocol is somewhat of a mess because, in older times, filename encoding was not standardized in any way. This has led to lots of FTP servers that use different filename encodings for characters > 127. Most large public FTP sites don't play with this and just use latin1 characters, but in local networks there are thousands of user FTP sites that don't care about RFCs and such. AFAIR there's a recent FTP RFC which introduces the notion of filename encoding; however, there are still few servers using it, and most users have to deal with existing FTP servers, which don't support it. Even if they did, I'm not sure GVFS supports it either, but anyway that is not the point.

Most FTP clients have a special setting which defines the encoding of filenames on the remote server. This is usually a per-server setting; however, sometimes there's a default value, which is useful as well. It would be very useful to have such an option in GVFS too. Without it the ftp backend is mostly useless for users in countries that use characters with codes > 127 in file names.

Steps to reproduce:
1. Install vsftpd (for example; any other FTP server will do) and start it.
2. Unpack the attached tarball to ~ftp. This should create a couple of directories and empty files under pub/ using the windows-1251 encoding.
3. Open the FTP server via nautilus.

Actual results:
Directory and file names using characters > 127 are unreadable.

Expected results:
They should be readable.

Does this happen every time?

Other information:
As I see it, a global setting could be stored somewhere in gconf, and a per-server encoding could either be part of the URL (e.g. ftp://user:password@encoding*localhost/pub or something like that) or be asked for in a dialog, just like login information is.
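The symptom reported above can be reproduced without an FTP server. The following is only a minimal sketch: the file name is a made-up example, and the three decoding attempts stand in for whatever assumption a client makes about the bytes it receives in a directory listing.

```python
# Hypothetical file name, as a server storing Windows-1251 names would send it.
name = "Документы"
raw = name.encode("cp1251")  # the bytes that travel over the wire

# A client that assumes Latin-1 maps every byte to an unrelated character,
# so decoding succeeds but the result is not the original name.
as_latin1 = raw.decode("latin-1")

# A client that assumes UTF-8 cannot decode these bytes at all.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Only the encoding the server actually used recovers the name.
as_cp1251 = raw.decode("cp1251")

print(repr(as_latin1), utf8_ok, as_cp1251)
```

The bytes alone do not say which encoding produced them, which is why the client needs to be told per server.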
Created attachment 119721
A test set of directories and empty files using the Windows-1251 encoding for file names
This is part of a general problem. We would like to add persistent per-mount properties that the user can specify. This would include things like specifying the filename encoding on ftp sites.
(In reply to comment #2)
> This is part of a general problem. We would like to add persistent per-mount
> properties that the user can specify. This would include things like specifying
> the filename encoding on ftp sites.

One option is to encode stuff like this in the URI. If it's not encoded in the URI, then it needs to live in another file somewhere (and with global bookmarks, you'd need a global file too, so the system admin would have to provide not only the bookmark but also this new file).

If it's encoded in the URI, the URI becomes gvfs specific. But we've already established that gvfs URIs are gvfs specific, even more so with ftp URIs: bug 528670 comment 12.

I don't know.
And being able to set default values for these options would be very handy as well. Often you just click on ftp://blah links from other programs, and those URLs won't contain an encoding...

Another issue here is the support for RFC2640 and especially RFC3659 in the ftp backend of GVFS:
http://www.ietf.org/rfc/rfc2640.txt (section 3.3)
http://www.ietf.org/rfc/rfc3659.txt (sections 2.2, 7.3)

The above RFCs define a special feature "UTF8" which, if present, means that all file names are exchanged in UTF-8. Otherwise, the user-specific encoding should be used.
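The selection rule described above (use UTF-8 when the server advertises the feature, otherwise fall back to the user's setting) could look roughly like this. It is a sketch, not gvfs code: the function name, the sample replies and the fallback value are all made up for illustration, and the reply is assumed to have the usual multi-line FEAT shape in which each feature line starts with a single space.

```python
def pick_filename_encoding(feat_reply: str, user_encoding: str) -> str:
    """Return "utf-8" if the FEAT reply lists UTF8, else the user's encoding."""
    for line in feat_reply.splitlines():
        # Feature lines are indented by one space; the "211-..." and
        # "211 End" framing lines are not, so they are skipped.
        if not line.startswith(" "):
            continue
        tokens = line.split()
        # The first token is the feature name; compare case-insensitively.
        if tokens and tokens[0].upper() == "UTF8":
            return "utf-8"
    return user_encoding


# Illustrative replies only; real servers list different features.
REPLY_WITH_UTF8 = "211-Features:\r\n MDTM\r\n SIZE\r\n UTF8\r\n211 End\r\n"
REPLY_WITHOUT_UTF8 = "211-Features:\r\n MDTM\r\n SIZE\r\n211 End\r\n"

print(pick_filename_encoding(REPLY_WITH_UTF8, "cp1251"))
print(pick_filename_encoding(REPLY_WITHOUT_UTF8, "cp1251"))
```

The second argument is where a per-server or default setting would plug in; where that value is stored is exactly the open question in this bug.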
You don't want to save this in the URI, because such info would not really be part of the mount specification. I.e. if one app doesn't specify the encoding but another does, they shouldn't reference different mounts. And as andrew said, many service types, including some ftp daemons, don't require a setting just to get the encoding right.

Anyway, as I said, the plan all along was to have persistent mount properties to solve this. Just a small matter of programming...
(In reply to comment #4)
...
> The above RFCs define a special feature "UTF8" which, if present, means that
> all file names are exchanged in UTF-8. Otherwise, the user-specific encoding
> should be used.

The UTF8 feature has now been implemented (gvfs >= 1.1.x). Also, see bug #544586.
*** Bug 544586 has been marked as a duplicate of this bug. ***
The code to implement translation from ftp paths to gvfs paths in GVfsFtpFile is likely the right place to implement this. But without a way to get at the encoding - and I agree with Alex that guessing is a bad idea, so we want per-mount options here - I'm not gonna work on making this happen.
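For illustration only, here is roughly what an encoding-aware path translation could look like once a per-mount option exists. This is not the GVfsFtpFile code: the option table, the host name, the default and both function names are hypothetical, and the sketch assumes the gvfs side works with Unicode paths.

```python
# Hypothetical per-mount options; in gvfs these would come from the
# persistent mount properties discussed above, which do not exist yet.
MOUNT_ENCODINGS = {"ftp.example.lan": "cp1251"}
DEFAULT_ENCODING = "utf-8"


def encoding_for(host: str) -> str:
    """Look up the configured encoding for a mount, else use the default."""
    return MOUNT_ENCODINGS.get(host, DEFAULT_ENCODING)


def ftp_to_gvfs_path(ftp_path: bytes, encoding: str) -> str:
    # surrogateescape keeps undecodable bytes as lone surrogates, so a
    # wrongly configured encoding does not make the file unreachable.
    return ftp_path.decode(encoding, errors="surrogateescape")


def gvfs_to_ftp_path(gvfs_path: str, encoding: str) -> bytes:
    # The inverse mapping; escaped bytes are restored unchanged.
    return gvfs_path.encode(encoding, errors="surrogateescape")


wire = "/pub/Документы".encode("cp1251")
enc = encoding_for("ftp.example.lan")
shown = ftp_to_gvfs_path(wire, enc)
print(shown, gvfs_to_ftp_path(shown, enc) == wire)
```

No guessing is involved: the encoding is always an explicit input, and the round trip returns the original bytes even when that input is wrong.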
*** Bug 598535 has been marked as a duplicate of this bug. ***
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gvfs/issues/63.