Bug 488463 – Wrong interpretation of charset encoding for synced Greek text



Summary:	Wrong interpretation of charset encoding for synced Greek text


Status:	RESOLVED WONTFIX

Product:	evolution
Classification:	Applications
Component:	Calendar
Version:	2.12.x (obsolete)
Hardware:	Other Linux

Importance:	Normal major
Target Milestone:	---
Assigned To:	evolution-calendar-maintainers
QA Contact:	Evolution QA team

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2007-10-20 00:43 UTC by Ioannis Karalis
Modified:	2012-06-17 20:37 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
screenshot of the problem (178.35 KB, image/png) 2007-10-20 15:50 UTC, Ioannis Karalis
screenshot of "locale" from the terminal (34.29 KB, image/png) 2007-10-20 15:52 UTC, Ioannis Karalis

Description Ioannis Karalis 2007-10-20 00:43:58 UTC

Incorrect translation
Application: Evolution

Incorrect text:
Actually this is not a "translation" issue, but the Greek characters in Evolution cannot be read.

Should be:
Support the Greek characters

Comment 1 André Klapper 2007-10-20 11:26:26 UTC

Evolution does support Greek. Please elaborate a lot more. What distro? What's your $LOCALE setting? What does "can not be read" mean?

Comment 2 Kostas Papadimas 2007-10-20 12:17:41 UTC

Gianni, everything seems OK to me in both Calendar and Address Book (Evolution 2.10.3, Debian etch). Please post the output of the "locale" command and a screenshot of the problem.

Comment 3 Ioannis Karalis 2007-10-20 15:50:45 UTC

Created attachment 97520 [details]
screenshot of the problem

This is what the Evolution contacts look like

Comment 4 Ioannis Karalis 2007-10-20 15:52:38 UTC

Created attachment 97521 [details]
screenshot of "locale" from the terminal

Comment 5 Ioannis Karalis 2007-10-20 15:53:25 UTC

Comment on attachment 97520 [details]
screenshot of the problem

I am new to Linux and have recently installed Ubuntu 7.04 and updated to 7.10. Until now I have used Outlook for my email, syncing with both my mobile phone and my Palm Tungsten T3 for contacts and calendar. I googled for ways to transfer the data from Outlook (Office 2007) to Evolution, but I found them rather complicated. So I decided to HotSync my Palm with Evolution, hoping that it would transfer all the valuable data from contacts and calendar to Evolution. It actually did, but the data cannot be read, as you can see in the screenshots. Any ideas? Or I'll just stick with webmail while in Linux...

Comment 6 André Klapper 2007-10-20 16:05:40 UTC

Ahaha, so you synced some stuff. Then it's probably not Evolution's fault. ;-)
If you enter a new (Greek) contact, does it work?

Comment 7 Kostas Papadimas 2007-10-20 16:32:38 UTC

You've imported your contacts with the wrong encoding (probably the old ISO-8859-7). You need to convert the contacts file (does Palm save them in a .csv file or something else?) to UTF-8 and then import it into Evolution.
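As an illustration of the conversion suggested above, here is a minimal Python sketch. It assumes the exported contacts file really is ISO-8859-7, which is only a guess in this thread; the function names are hypothetical and not part of Evolution or any Palm tool.

```python
from pathlib import Path


def reencode(data: bytes, source_encoding: str = "iso-8859-7") -> bytes:
    """Decode bytes from a legacy encoding and re-encode them as UTF-8."""
    # The default encoding is only the guess made in this thread. Decoding
    # with the wrong source encoding usually does not fail; it silently
    # produces the wrong characters.
    return data.decode(source_encoding).encode("utf-8")


def convert_file(src: str, dst: str, source_encoding: str = "iso-8859-7") -> None:
    """Write a UTF-8 copy of src to dst, leaving the original file untouched."""
    Path(dst).write_bytes(reencode(Path(src).read_bytes(), source_encoding))
```

On the command line, `iconv -f ISO-8859-7 -t UTF-8 contacts.csv > contacts-utf8.csv` performs the same conversion (the file names here are placeholders).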

Comment 8 Ioannis Karalis 2007-10-20 17:40:26 UTC

Thank you for your immediate response, both of you!

Comment 9 Simos Xenitellis 2007-11-01 13:27:15 UTC

(In reply to comment #3)
> Created an attachment (id=97520)
> screenshot of the problem
> 
> This is how evolution contacts look like
> 

This looks like one of the eternal issues of legacy (8-bit) encodings.

In the screenshot you can see that Evolution is trying to be smart and attempts to convert the 8-bit encoded text into Unicode. It treats the bytes as iso-8859-1 when converting to UTF-8, which is why the Greek text shows up as accented Latin characters.
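The effect described above can be reproduced in a few lines of Python. This is only a sketch of the general mechanism, assuming the synced data was ISO-8859-7; the sample name is made up and this is not Evolution's actual code path.

```python
original = "Γιάννης"

# What a device using a Greek legacy code page might store: one byte per letter.
legacy_bytes = original.encode("iso-8859-7")

# Decoding those bytes as ISO-8859-1 never fails, because every byte value
# maps to some Latin-1 character. The result is accented Latin letters.
garbled = legacy_bytes.decode("iso-8859-1")  # "ÃéÜííçò"

# As long as nothing else has altered the text, the mistake is reversible:
# recover the original bytes, then decode them with the correct encoding.
repaired = garbled.encode("iso-8859-1").decode("iso-8859-7")
```

Here `repaired` equals `original` again, which shows that the data itself is intact and only the interpretation was wrong.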

I am not sure whether Evolution has special code that checks the LANG variable (en_US? Then probably convert from iso-8859-1 to UTF-8. el_GR? Then probably from iso-8859-7 to UTF-8). Even if it did, it would not help in your case, because the en_US.UTF-8 locale is being used.

What to do? Evolution could ask the user what language the contacts are written in, then use the corresponding legacy encoding to convert from. Or there could be a wizard that shows, in a dialog box, a contact list entry converted from different legacy encodings and asks the user which one is correct (so there is no need to ask for the language, or to maintain a list of language->legacy encoding).
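The core of such a wizard could look roughly like the following Python sketch. The candidate list and the function name are assumptions chosen for illustration; a real dialog would display the previews and let the user pick one.

```python
# Hypothetical shortlist; a real wizard would offer many more encodings.
CANDIDATES = ("utf-8", "iso-8859-1", "iso-8859-7", "cp1253")


def preview_decodings(raw: bytes, candidates=CANDIDATES) -> dict:
    """Return {encoding: decoded text} for each candidate that can decode raw."""
    previews = {}
    for encoding in candidates:
        try:
            previews[encoding] = raw.decode(encoding)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding, so do not offer it.
            continue
    return previews


# Example: one contact name as it might arrive from the device.
sample = "Γιάννης".encode("iso-8859-7")
for encoding, text in preview_decodings(sample).items():
    print(f"{encoding:12} {text}")
```

For this sample the UTF-8 candidate is dropped because the bytes are not valid UTF-8, and the user would choose between the remaining previews by eye.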

Alternatively, Evolution could try to autoconvert by guessing from the LANG variable (assuming the user chose the Greek locale) and by attempting to figure out the encoding from the frequencies of characters in words.
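A rough Python sketch of both guesses follows. The language table is a hypothetical two-entry example, and the scoring is a much cruder stand-in for real character-frequency analysis: it only measures how many letters of a decoding are Greek, so it is meaningful only when the text is already expected to be Greek.

```python
import unicodedata

# Hypothetical mapping from language code to a likely legacy encoding.
LEGACY_BY_LANGUAGE = {"el": "iso-8859-7", "en": "iso-8859-1"}


def encoding_from_lang(lang_value: str):
    """Guess a legacy encoding from a LANG value such as 'el_GR.UTF-8'."""
    language = lang_value.split(".")[0].split("_")[0].lower()
    return LEGACY_BY_LANGUAGE.get(language)  # None if the language is unknown


def greek_ratio(text: str) -> float:
    """Fraction of the alphabetic characters in text that are Greek letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    greek = sum(1 for ch in letters if "GREEK" in unicodedata.name(ch, ""))
    return greek / len(letters)


def most_greek_decoding(raw: bytes, candidates=("iso-8859-7", "iso-8859-1")):
    """Return the candidate encoding whose decoding of raw looks most Greek."""
    best, best_score = None, -1.0
    for encoding in candidates:
        try:
            score = greek_ratio(raw.decode(encoding))
        except UnicodeDecodeError:
            continue
        if score > best_score:  # ties keep the earlier candidate
            best, best_score = encoding, score
    return best
```

With `LANG=en_US.UTF-8`, `encoding_from_lang` returns `iso-8859-1`, which illustrates why the locale alone would not have helped in this report.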

In any case, such code is useful in several other projects, including

* Importing ID3 tags from songs (vast majority of ID3 tags are of legacy encoding)
* Figuring out the encoding of filenames in ZIP files (ZIP format does not specify encoding for filenames)

Some more links on this at
https://blueprints.launchpad.net/unzip/+spec/unzip-detect-filename-encoding

The wizard solution appears easy to implement; the autoconversion with heuristics would require someone to undertake it as a pet project to create a generic library to be used by other projects.

Comment 10 Tobias Mueller 2008-03-24 01:58:02 UTC

I'm setting this to NEW due to Simos' proposal.

I'd say to either implement this or close as WONTFIX. Maybe along with a discussion on e-h.

Comment 11 David Ayers 2010-05-20 21:20:44 UTC

FWIW: The most practical GUI experience with this issue I've seen is the CSV import of OO.o Calc... it asks the user how to interpret the file with a popup and immediately displays a preview as visual confirmation of the correct encoding. (Just in case someone wants to tackle this).

Comment 12 André Klapper 2012-06-17 20:37:07 UTC

Crossing fingers that most applications nowadays finally use UTF-8 for syncing. Currently no plans to make Evolution code more complicated to work around this.