Bug 670117 - gedit lacks locale-based encoding detection
Status: RESOLVED FIXED
Product: gedit
Classification: Applications
Component: general
Version: 3.3.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Gedit maintainers
Gedit maintainers
Depends on:
Blocks:
 
 
Reported: 2012-02-15 04:18 UTC by Ma Hsiao-chun
Modified: 2013-11-10 10:11 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Works with zh locales and leafpad but gedit (without extra configuration) (52 bytes, text/plain)
2012-02-15 04:18 UTC, Ma Hsiao-chun

Description Ma Hsiao-chun 2012-02-15 04:18:18 UTC
Created attachment 207609
Works with zh locales and leafpad but gedit (without extra configuration)

As I mentioned in earlier bug reports, many existing text files use a localized encoding specific to one country/region, e.g., the GB encoding family (GB2312/GBK/GB18030) for Mainland China. The attachment is a simple example.

gedit fails to handle such files by default. Some people work around the issue using something like the following, which is excerpted from a Chinese Linux mailing list.

gsettings set org.gnome.gedit.preferences.encodings auto-detected "[ 'UTF-8','GB18030', 'GB2312', 'GBK', 'BIG5', 'CURRENT', 'UTF-16']"
gsettings set org.gnome.gedit.preferences.encodings shown-in-menu "[ 'UTF-8','GB18030', 'GB2312', 'GBK', 'BIG5', 'CURRENT', 'UTF-16']"

This is indeed a user preference, but at least in China it makes much more practical sense than the default settings. Some people find it through Google and apply it. Others may give up on gedit or the Linux desktop before learning about it :)

The leafpad ( http://tarot.freeshell.org/leafpad/ ) author used the locale to figure out such a preference automatically. Details can be found in encoding.c of the leafpad source. The detect_charset() function in it should explain everything clearly. Some Chinese people recommend leafpad in mailing lists, forums and blogs, mainly because of its detection capability. leafpad's detection capability is not limited to Chinese, which is obvious from the source code, but I do not have much knowledge of other languages.
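
To illustrate the idea only, here is a minimal Python sketch of locale-based detection. It is not leafpad's detect_charset() and not gedit code. The table LOCALE_ENCODINGS and the functions candidate_encodings() and decode_with_locale() are hypothetical names made up for this example, and the locale-to-encoding mapping is an assumption rather than a copy of leafpad's table.

```python
# Hypothetical locale -> legacy encoding table (illustrative assumption,
# not taken from leafpad's encoding.c).
LOCALE_ENCODINGS = {
    "zh_CN": ["gb18030"],
    "zh_TW": ["big5"],
    "zh_HK": ["big5hkscs"],
    "ja_JP": ["euc_jp", "shift_jis"],
    "ko_KR": ["euc_kr"],
}


def candidate_encodings(locale_name):
    """Return encodings to try, in order, for a name like 'zh_CN.UTF-8'."""
    # Strip the codeset ('.UTF-8') and modifier ('@euro') parts.
    lang = (locale_name or "").split(".")[0].split("@")[0]
    # UTF-8 is always tried first; locale-specific encodings follow.
    return ["utf-8"] + LOCALE_ENCODINGS.get(lang, [])


def decode_with_locale(data, locale_name):
    """Decode bytes with the first candidate encoding that succeeds."""
    for enc in candidate_encodings(locale_name):
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 accepts every byte sequence.
    return data.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    sample = "中文".encode("gb18030")
    print(decode_with_locale(sample, "zh_CN.UTF-8"))
    print(decode_with_locale(sample, "en_US.UTF-8")[1])
```

In a real program the locale name could come from something like locale.getlocale() instead of being passed in by hand.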

For gedit, some people may argue that such a mechanism is dirty. I have already filed a bug asking for universal encoding detection:
https://bugzilla.gnome.org/show_bug.cgi?id=669448

The locale-based approach should be much easier to introduce. I recommend introducing it in a passive way: it is not enabled by default, but the user can enable it with a single checkbox click. Maybe it could be made a plugin.
Comment 1 chenzhipeter 2012-02-20 19:03:27 UTC
'GB2312' and 'GBK' are unnecessary. They are included in 'GB18030'.
Alternatively, a codepage switcher like the one in Kate would also solve many problems.
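
The point about GB18030 covering GB2312 and GBK can be checked with a quick Python sketch. It only tests two sample strings, so treat it as an illustration rather than a proof for every code point. decode_gb() is a hypothetical helper name used only here.

```python
def decode_gb(data: bytes) -> str:
    """Decode GB-family bytes using only the GB18030 codec."""
    return data.decode("gb18030")


# Text encoded with the two older encodings.
as_gb2312 = "中文".encode("gb2312")
as_gbk = "镕".encode("gbk")

if __name__ == "__main__":
    # Both byte strings decode correctly with the single GB18030 codec.
    print(decode_gb(as_gb2312))
    print(decode_gb(as_gbk))
    # For this sample, the GB2312 and GB18030 bytes are identical.
    print(as_gb2312 == "中文".encode("gb18030"))
```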
Comment 2 Ma Hsiao-chun 2012-02-21 02:05:10 UTC
I strongly agree with you. But I wanted to show the original content.

I filed a new bug for an encoding selection feature.
https://bugzilla.gnome.org/show_bug.cgi?id=670495
Comment 3 Sébastien Wilmet 2013-11-09 19:30:11 UTC
This is implemented in gedit. See for example:

https://git.gnome.org/browse/gedit/tree/po/zh_CN.po#n412

(search for "encoding" in the Chinese translation)
Comment 4 Ma Hsiao-chun 2013-11-10 03:53:45 UTC
Many languages, including h_HW and zh_TW, do not have a translation for that entry, I guess. You should check against the localized Windows versions, change all the corresponding languages, and then declare this fixed.

Or have you announced the existence of that entry to translators? Where?
Comment 5 Ma Hsiao-chun 2013-11-10 03:54:32 UTC
s/h_HW/zh_HK
Comment 6 André Klapper 2013-11-10 10:11:04 UTC
Guessing does not help, but translating it (or fixing the translation) does:
https://l10n.gnome.org/vertimus/gedit/master/po/zh_HK
https://l10n.gnome.org/vertimus/gedit/master/po/zh_TW

And that's out of scope for this bug report, as the developers did everything correctly here.