GNOME Bugzilla – Bug 670117
gedit lacks locale-based encoding detection
Last modified: 2013-11-10 10:11:04 UTC
Created attachment 207609
Works with zh locales and leafpad but gedit (without extra configuration)

As I mentioned in earlier bug reports, many existing text files use a localized encoding specific to one country or region, e.g., the GB encoding family (GB2312/GBK/GB18030) for mainland China. The attachment is a simple example. gedit fails to handle such files by default.

Some people work around the issue with something like the following, which is excerpted from a Chinese Linux mailing list:

gsettings set org.gnome.gedit.preferences.encodings auto-detected "[ 'UTF-8','GB18030', 'GB2312', 'GBK', 'BIG5', 'CURRENT', 'UTF-16']"
gsettings set org.gnome.gedit.preferences.encodings shown-in-menu "[ 'UTF-8','GB18030', 'GB2312', 'GBK', 'BIG5', 'CURRENT', 'UTF-16']"

This is indeed a user preference, but at least in China it makes much more practical sense than the default settings. Some people Google it and apply it. Others may give up on gedit or the Linux desktop before learning about it :)

The author of leafpad ( http://tarot.freeshell.org/leafpad/ ) used the locale to work out such a preference automatically. Details can be found in encoding.c in the leafpad source; detect_charset() there should explain everything clearly. Some Chinese users recommend leafpad on mailing lists, forums and blogs, mainly because of its detection capability. leafpad's detection is not limited to Chinese, as is obvious from the source code, but I do not know much about other languages.

For gedit, some people may argue that such a mechanism is dirty, and I have already filed a bug asking for universal encoding detection: https://bugzilla.gnome.org/show_bug.cgi?id=669448

The locale-based approach should be much easier to introduce. I recommend introducing it passively: it is not enabled by default, but the user can enable it with a single check box click. Maybe it could be made a plugin.
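To make the locale-based idea concrete, here is a minimal Python sketch, not leafpad's or gedit's actual code. It assumes a small hand-written table (the hypothetical `LOCALE_ENCODINGS`) mapping a locale to legacy encodings, tries UTF-8 first, then the locale's encodings in order, and falls back to ISO-8859-1. The table contents and both function names are illustrative assumptions only.

```python
# Illustrative table only: an assumption for this sketch, not data taken
# from leafpad's encoding.c or from gedit.
LOCALE_ENCODINGS = {
    "zh_CN": ["GB18030"],
    "zh_TW": ["BIG5"],
    "zh_HK": ["BIG5-HKSCS"],
    "ja_JP": ["EUC-JP", "SHIFT_JIS"],
    "ko_KR": ["EUC-KR"],
}


def candidate_encodings(locale_name):
    """Return the encodings to try, in order, for a locale like 'zh_CN.UTF-8'."""
    # Strip the codeset ('.UTF-8') and modifier ('@euro') parts of the name.
    key = locale_name.split(".")[0].split("@")[0]
    candidates = ["UTF-8"]  # always try UTF-8 first
    for enc in LOCALE_ENCODINGS.get(key, []):
        if enc not in candidates:
            candidates.append(enc)
    return candidates


def detect_and_decode(data, locale_name):
    """Decode bytes with the first candidate encoding that accepts them."""
    for enc in candidate_encodings(locale_name):
        try:
            return enc, data.decode(enc)
        except UnicodeDecodeError:
            continue
    # ISO-8859-1 maps every byte to a character, so this never fails.
    return "ISO-8859-1", data.decode("latin-1")


if __name__ == "__main__":
    sample = "中文".encode("gb18030")
    print(detect_and_decode(sample, "zh_CN.UTF-8"))
```

Trying UTF-8 first is a deliberate choice in this sketch: legacy multi-byte text rarely happens to be valid UTF-8, so a successful strict UTF-8 decode is a strong signal, while the locale only decides which legacy encodings are worth trying afterwards.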
'GB2312' and 'GBK' are unnecessary; they are included in 'GB18030'. Alternatively, a codepage switcher like the one in Kate would also solve many problems.
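The superset claim can be spot-checked with Python's built-in codecs. This is only a sketch for a couple of sample strings, not a proof for every code point; the helper name `decodes_as_gb18030` is made up for illustration.

```python
def decodes_as_gb18030(text, legacy_encoding):
    """Encode text with a legacy GB codec, then check that GB18030
    decodes the resulting bytes back to the same text."""
    raw = text.encode(legacy_encoding)
    return raw.decode("gb18030") == text


if __name__ == "__main__":
    # Characters available in GB2312.
    print(decodes_as_gb18030("中文", "gb2312"))
    # '镕' is encodable in GBK; the name is used here as a GBK sample.
    print(decodes_as_gb18030("朱镕基", "gbk"))
```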
I strongly agree with you, but I wanted to show the original content. I filed a new bug for an encoding selection feature: https://bugzilla.gnome.org/show_bug.cgi?id=670495
This is implemented in gedit. See for example: https://git.gnome.org/browse/gedit/tree/po/zh_CN.po#n412 (search for "encoding" in the Chinese translation)
Many languages, including h_HW and zh_TW, do not have a translation for that entry, I guess. You should check against the localized versions of Windows, change all corresponding languages, and then declare this fixed. Or have you announced the existence of that entry to translators? Where?
s/h_HW/zh_HK
Guessing does not help, but translating it (or fixing the translation) does: https://l10n.gnome.org/vertimus/gedit/master/po/zh_HK https://l10n.gnome.org/vertimus/gedit/master/po/zh_TW And that's out of scope for this bug report, as the developers did everything correctly here.