Bug 166973 - gnome-vfs just assume that filename encoding is utf-8 if there is no information
Status: RESOLVED WONTFIX
Product: gnome-vfs
Classification: Deprecated
Component: Module: ssh/sftp
Version: 2.12.x
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: gnome-vfs maintainers
QA Contact: gnome-vfs maintainers
Depends on:
Blocks:
Reported: 2005-02-10 18:06 UTC by Eungkyu Song
Modified: 2008-09-06 19:07 UTC
See Also:
GNOME target: ---
GNOME version: 2.7/2.8


Attachments
filename encoding example screenshot (40.93 KB, image/png)
2005-02-10 18:08 UTC, Eungkyu Song

Description Eungkyu Song 2005-02-10 18:06:46 UTC
Please describe the problem:
If the protocol has no information about the character set encoding of filenames,
gnome-vfs just assumes that the filename is UTF-8 encoded.

But the world is not UTF-8 compatible.
I want to have an option to set the charset encoding, at least "current locale's or UTF-8"
(like glib's G_FILENAME_ENCODING). In many environments, we can't use the
advantages of gnome-vfs at all just because of this problem. This is a very sad
situation.

Steps to reproduce:
1. use ssh, ftp, ... via gnome-vfs
2. look at a file whose name is not UTF-8 encoded
3. amen


Actual results:
broken filename labeled "incorrect unicode"

Expected results:
converted filename

Does this happen every time?
yes

Other information:
example screenshot will be attached
Comment 1 Eungkyu Song 2005-02-10 18:08:35 UTC
Created attachment 37308
filename encoding example screenshot

This Nautilus screenshot was taken over a gnome-vfs ssh connection.
Comment 2 Christian Kellner 2005-02-11 17:08:15 UTC
I was looking at the code and gnome-vfs already supports the G_FILENAME_ENCODING
environment variable. Did you try setting it before running nautilus?
Comment 3 Eungkyu Song 2005-02-12 14:09:51 UTC
Yes, I always set G_FILENAME_ENCODING=@locale in my GNOME environment,
and I verified it through the /proc/[pid of nautilus]/environ file.
Nautilus creates and reads local filenames in EUC-KR encoding (the locale encoding),
but over the ftp/sftp protocols it creates UTF-8 filenames and reads filenames as UTF-8.

(Or maybe the CVS code already corrects this problem?)
Comment 4 Christian Kellner 2005-02-12 15:02:41 UTC
Ohh so this does not happen while browsing the local filesystem but only over
sftp? (I guess so therefore I am moving this over to the sftp Component)
Comment 5 Eungkyu Song 2005-02-14 04:12:10 UTC
Oh, I'm sorry I didn't specify the problem clearly.
This problem is about remote filenames over network protocols that do not
carry encoding information.

For example, ftp, ssh/sftp, ... (samba does not have this problem because the
protocol provides encoding information)
Comment 6 kz 2005-09-11 18:36:51 UTC
gnome-vfs, in the long run, should provide an API to set encoding information
on the connection object. It's the only way to assure interoperability over the network.
Transfers with an implicit charset sometimes break applications,
for example when pango segfaults on illegal byte sequences.

W3C did provide a statement (I can't find it at the moment) encouraging people
to set the encoding properly over the network, in this era of mixed charsets.
Comment 7 Christian Neumair 2005-09-11 18:41:07 UTC
What does it effectively mean if a protocol does not have encoding information?
GnomeVFS just transfers escaped URIs.
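
To illustrate why escaped URIs alone leave the encoding open, here is a minimal Python sketch (not gnome-vfs code; the filename and the helper decodes_as_utf8 are made up for illustration). Percent-escaping only protects the raw bytes, so the same name escapes differently depending on which encoding produced those bytes, and a receiver that assumes UTF-8 cannot decode the EUC-KR form.

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical filename, chosen only for illustration.
name = "한글.txt"

# Escaping works on bytes, so the escaped form depends on the encoding
# that was used to turn the name into bytes.
escaped_euckr = quote(name.encode("euc-kr"))
escaped_utf8 = quote(name.encode("utf-8"))
print(escaped_euckr)
print(escaped_utf8)

# The receiver recovers the raw bytes exactly, but nothing in the
# escaped string says which charset those bytes are in.
raw = unquote_to_bytes(escaped_euckr)


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(raw))          # are the EUC-KR bytes valid UTF-8?
print(raw.decode("euc-kr") == name)  # decoding with the right charset
```

The escaped bytes survive the transfer unchanged; what is missing is the charset needed to turn them back into text.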
Comment 8 kz 2005-09-13 10:29:34 UTC
nautilus-connect-server creates a connection to a remote system.
This dialog does not provide an entry for encoding information.
I had a quick look and couldn't find any place to set the encoding.

GnomeVFSHandle or whatever could provide an encoding property
and promote use of that information during transfers.
E.g. when sending an EUC-KR filename to a UTF-8 system, the filename conversion
should be done by the transfer layer. And vice versa.
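
The idea of an encoding property on the handle could look roughly like the Python sketch below. RemoteConnection, remote_encoding, to_display and to_remote are hypothetical names invented for this illustration; nothing like them exists in gnome-vfs as far as this report shows.

```python
from dataclasses import dataclass


@dataclass
class RemoteConnection:
    """Hypothetical stand-in for a handle that knows the remote charset."""

    remote_encoding: str = "utf-8"

    def to_display(self, raw_name: bytes) -> str:
        """Convert a filename received from the server into text for the UI."""
        # errors="replace" turns undecodable bytes into U+FFFD instead of
        # handing an invalid byte sequence to the display layer.
        return raw_name.decode(self.remote_encoding, errors="replace")

    def to_remote(self, name: str) -> bytes:
        """Convert a filename typed by the user into the server's bytes."""
        return name.encode(self.remote_encoding)


# A connection to a server whose filenames are assumed to be EUC-KR.
conn = RemoteConnection(remote_encoding="euc-kr")
wire_name = conn.to_remote("한글.txt")
print(conn.to_display(wire_name))
```

Note that to_remote raises UnicodeEncodeError when a name cannot be represented in the remote charset; a real implementation would have to decide how to report that to the user.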
Comment 9 danny.milo 2005-09-13 17:10:55 UTC
It would probably be best to expose a combobox/radiobuttons where you can choose
the encoding in the connect server dialog when the selected protocol is
ftp[/sftp?]. The filenames then could be converted back and forth from that
encoding to utf-8 by gnome-vfs.

Not the most beautiful thing in the world, but a "good" workaround (the user
shouldn't need to select the encoding, but there is no sane way around that ...
sigh).

Note that the encoding mess is just that, a mess. 
Nobody guarantees that one does not traverse filesystem borders with the ftp
protocol, and then, the encoding could be even another one.

And there is no way to find whether a given encoding is "correct".

But I guess providing some means to select the encoding is better than none, but
best will be when all the world switched to unicode, of course :)

I'd suggest maintaining a file with locale -> legacy encoding mappings and at
least only provide that *one entry* (or maybe at max. 3) in the gui.

for example:

  LANG="de"

  Encoding
   [*] Unicode
   [ ] Latin 1 (this one is used most)
   [ ] Other [______]

at least for German, there is "just" one legacy used encoding. Other languages
have multiple ones (Japanese comes to mind, having Shift-Jis, Euc/Jp, Windows
Codepage 932, ISO-2022-JP):

  LANG="ja"

  Encoding
   [*] Unicode
   [ ] Shift Jis
   [ ] EUC Jp (this one is used most)
   [ ] Windows 932
   [ ] Other [______]

As you see, here the mess starts. Note that Shift Jis even reads weird on paper,
but hey, it's still used. Sigh.

Kang Jeong-Hee, how is it with Korean? Are there multiple legacy encodings in use?
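
A locale -> legacy encoding mapping of the kind suggested above might be sketched in Python as follows. The table holds only encodings mentioned in this report (spelled as Python codec names), the ordering is a guess, and LEGACY_ENCODINGS and encoding_choices are hypothetical names, so treat it as an illustration rather than a vetted list.

```python
import codecs
from typing import List

# Illustrative table only: encodings named in this bug report,
# written as Python codec names. Not an authoritative list.
LEGACY_ENCODINGS = {
    "de": ["iso-8859-1"],
    "ja": ["euc_jp", "shift_jis", "cp932", "iso2022_jp"],
    "ko": ["euc_kr", "cp949"],
}


def encoding_choices(lang: str, max_legacy: int = 3) -> List[str]:
    """Return the short list to offer in the dialog: Unicode first,
    then at most max_legacy legacy encodings for the language."""
    # Reduce values such as "ja_JP.UTF-8" to the bare language code "ja".
    language = lang.split("_")[0].split(".")[0].lower()
    legacy = LEGACY_ENCODINGS.get(language, [])
    return ["utf-8"] + legacy[:max_legacy]


for lang in ("de", "ja_JP.UTF-8", "ko_KR"):
    print(lang, encoding_choices(lang))
```

An "Other" entry with a free-text field would still be needed for anything outside the table.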
Comment 10 Alexander Larsson 2005-09-19 14:39:11 UTC
A connected server is in no way a "connection" in the protocol sense. It's more
like a bookmark or a shortcut. It's not visible at the i/o levels in gnome-vfs.
Comment 11 kz 2005-09-23 13:42:36 UTC
danny_milo:
Korean has EUC-KR, CP949 and some ISO thing in regular use.
And I strongly disagree with providing radio buttons. Just a combo box.

alexl:
Yeah, gnome-vfs may have no responsibility for encoding conversion.
But there are convenient 'display' APIs. They need to know the source and target encodings.

BTW, I've heard that one of the RFCs defines FTP with encoding information. :)
Comment 12 Eungkyu Song 2005-10-19 17:42:14 UTC
> BTW, I've heard that one of the RFCs defines FTP with encoding information. :)

But most ftp servers haven't implemented that feature. :(

to all:
The world of encodings is in chaos. Most ftp clients, ftp servers, blabla servers,
and blabla clients take no responsibility for the encoding problem if the protocol says
nothing about encoding. But the problem EXISTS.

If everyone in the world used Unicode, this problem would be fixed
automatically. Until this utopia arrives, however, something or someone should provide
a workaround. (I'm used to converting the encoding of transferred filenames manually.)
gnome-vfs is a very good place to insert the workaround.
Comment 13 kz 2005-12-20 14:29:57 UTC
Where does the encoding information belong?

NautilusFile provides a display name. For the local file system, g_filename_display_basename() works.
In this routine, the G_FILENAME_ENCODING environment variable certainly takes effect.

For remote files, GnomeVFSURI comes into play. But no encoding conversion occurs. Only the last piece
of the URI (the short name) is extracted from the full URI. That string should be converted.

But encoding conversion requires @from and @to encoding names. @to is always UTF-8.
@from may be determined from G_FILENAME_ENCODING, implicitly.

By "implicitly" I mean that G_FILENAME_ENCODING is just a list of fallback encodings.
There might be an encoding outside the list. That has to be handled at runtime, explicitly.
An environment variable is not a runtime matter.

The approach with G_FILENAME_ENCODING can only be a workaround.
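
The fallback-list behaviour described above can be sketched in Python roughly as follows. This is only loosely modelled on G_FILENAME_ENCODING (a comma-separated list in which "@locale" stands for the locale's charset); filename_charsets and display_name are hypothetical helpers written for this illustration, not GLib or gnome-vfs functions.

```python
import locale
from typing import List, Optional


def filename_charsets(env_value: str,
                      locale_charset: Optional[str] = None) -> List[str]:
    """Split a G_FILENAME_ENCODING-style value into charset names.
    The "@locale" token is replaced by the current locale's charset."""
    if locale_charset is None:
        locale_charset = locale.getpreferredencoding(False)
    names = [n.strip() for n in env_value.split(",") if n.strip()]
    return [locale_charset if n == "@locale" else n for n in names]


def display_name(raw: bytes, charsets: List[str]) -> str:
    """Convert raw filename bytes to text (@to is always Unicode here),
    trying each candidate @from charset in order."""
    for charset in charsets:
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown charset, try the next one
    # Nothing in the list matched: keep what is readable, mark the rest.
    return raw.decode("utf-8", errors="replace")


raw_name = "한글.txt".encode("euc-kr")
charsets = filename_charsets("UTF-8,@locale", locale_charset="EUC-KR")
print(display_name(raw_name, charsets))
```

The sketch also shows the limitation raised here: the list is fixed when the process starts, and a charset that merely happens to decode without error (any single-byte encoding will) wins even when it is the wrong one, so an explicit per-connection setting would still be needed.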
Comment 14 kz 2005-12-20 15:04:52 UTC
A gnome-keyring item type such as GNOME_KEYRING_ITEM_ENCODING, for example, might solve this problem in a simple way.
Comment 15 André Klapper 2008-09-06 19:07:51 UTC
gnome-vfs has been deprecated and superseded by gio/gvfs since GNOME 2.22, hence mass-closing many of the gnome-vfs requests/bug reports. This means that gnome-vfs is NOT actively maintained anymore, however patches are still welcome.

If your reported issue is still valid for gio/gvfs, please feel free to file a bug report against glib/gio or gvfs.

@Bugzilla mail recipients: query for gnome-vfs-mass-close to get rid of these notification emails all together.


General further information: http://en.wikipedia.org/wiki/GVFS 
Reasons behind this decision are listed at http://www.mail-archive.com/gnome-vfs-list@gnome.org/msg00899.html