Bug 569321 - Mail is in the wrong encoding
Status: RESOLVED FIXED
Product: evolution-mapi
Classification: Applications
Component: Mail
Version: unspecified
Hardware: Other All
Importance: Normal normal
Target Milestone: 0.28
Assigned To: evolution-mapi-maint
QA Contact: evolution-mapi-maint
Duplicates: 572144 577323 586525 591060
Depends on:
Blocks:
 
 
Reported: 2009-01-27 12:04 UTC by Mattias Eriksson
Modified: 2010-02-02 12:57 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
A sample mail with encoding problems (4.16 KB, application/octet-stream)
2009-03-16 10:42 UTC, Mattias Eriksson
body encoding fix (1.97 KB, text/plain)
2009-03-19 19:23 UTC, Milan Crha
A image showing a correctly encoded mail. (122.70 KB, image/png)
2009-03-20 07:53 UTC, Mattias Eriksson
Wrong encoded mail (111.53 KB, image/png)
2009-03-20 07:54 UTC, Mattias Eriksson
The sample mail used in the images (530 bytes, application/x-gzip)
2009-03-20 07:55 UTC, Mattias Eriksson
test.eml (21.72 KB, text/plain)
2009-04-06 12:40 UTC, Milan Crha
test.eml downloaded through IMAP (22.10 KB, text/plain)
2009-04-06 13:02 UTC, Milan Crha
proposed ema patch (5.31 KB, patch)
2009-04-06 13:27 UTC, Milan Crha
Image showing my mailbox with the patch (87.89 KB, image/png)
2009-05-08 10:03 UTC, Mattias Eriksson
proposed ema patch ][ (5.08 KB, patch)
2009-06-15 09:52 UTC, Milan Crha
Fix that converts calendar events to utf8 (1.27 KB, patch)
2009-06-17 15:46 UTC, Mattias Eriksson
proposed ema patch ]I[ (5.09 KB, patch)
2009-08-10 17:14 UTC, Milan Crha
proposed ema patch IV (8.76 KB, patch)
2009-08-11 18:42 UTC, Milan Crha, committed
example of one message not displayed in Evolution (4.13 KB, text/plain)
2009-11-27 08:21 UTC, Valent Turkovic
UTF-8 issues in subject (332.88 KB, image/png)
2009-12-28 09:30 UTC, Valent Turkovic
ema patch/hack (2.40 KB, patch)
2010-01-15 16:59 UTC, Milan Crha, committed
Fix problem on openchange side (25.49 KB, patch)
2010-02-01 23:02 UTC, Julien Kerihuel

Description Mattias Eriksson 2009-01-27 12:04:01 UTC
Please describe the problem:
Using version evolution-mapi-0.25.6~svn80 provided by jelmer on Ubuntu.

When I look at the list of mails, I see that names of people containing Swedish characters are wrong.
The name Engström is shown as Engstr÷m in the mail summary list.
It looks OK in the header in the preview pane.
But in the message body in the preview pane the Swedish characters are simply gone. This may be fixed by selecting encoding ISO-8859-15.
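
The ÷ in the summary list is consistent with one specific mix-up, offered here as an assumption rather than a confirmed diagnosis: the byte that encodes ö in ISO-8859-1 (0xF6) is the division sign in the DOS code page CP850. A minimal Python sketch of that reading, using only the name quoted above:

```python
# Assumption: the server delivers ISO-8859-1 bytes and some layer reads them as CP850.
name = "Engström"
raw = name.encode("iso8859_1")   # ö becomes the single byte 0xF6
misread = raw.decode("cp850")    # CP850 maps 0xF6 to the division sign
print(misread)
```

If this reading is right, the bytes themselves are intact and only the code page used to interpret them is wrong.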

Steps to reproduce:


Actual results:


Expected results:


Does this happen every time?


Other information:
Comment 1 Mattias Eriksson 2009-02-20 15:20:22 UTC
Note that my comment "This may be fixed by selecting encoding ISO-8859-15" only refers to the message body. The mail list is still displayed wrong, so this is quite a bad bug for all users with non-ascii characters in their language.
Comment 2 Vincent Bossier 2009-02-21 12:49:10 UTC
I face the same problems. I would however split this bug in two parts:

1. The mail summary list encoding problem, which seems a duplicate of bug 572144. In this case, diacritics are incorrectly displayed.
2. The mail body encoding problem (preview pane). Sometimes, I need to manually select ISO-8859 to see diacritics. If not selected, the diacritics do not appear at all.
Comment 3 Milan Crha 2009-03-16 10:06:56 UTC
Hi, would you mind attaching a sample mail here for testing? Could you also test whether importing such an email into a local folder (under On This Computer) shows the same issue as when it is displayed under the MAPI mail folder? Thanks in advance.
Comment 4 Mattias Eriksson 2009-03-16 10:42:32 UTC
Created attachment 130733 [details]
A sample mail with encoding problems

This has a problem with the sender name, which should be displayed as "Malmö Aviation".
Also, the mail body is not shown properly: words like världsklass are shown as vrldsklass.
Comment 5 Milan Crha 2009-03-19 11:34:04 UTC
OK, I managed to transfer the message to the MAPI server over IMAP. The two show the message differently, both in the message list and in the preview. I think I see the problem for the message preview, but I'm not sure about the message list yet.
Comment 6 Milan Crha 2009-03-19 19:23:34 UTC
Created attachment 130984 [details]
body encoding fix

for evolution-mapi;

The subject/from/... I do not know yet, it returns those values in some strange encoding. Note: patch marked as a text intentionally.
Comment 7 Mattias Eriksson 2009-03-20 07:53:41 UTC
Created attachment 131011 [details]
A image showing a correctly encoded mail.

This image is just a reference to show the correctly encoded mail
Comment 8 Mattias Eriksson 2009-03-20 07:54:21 UTC
Created attachment 131013 [details]
Wrong encoded mail

This image is of the same mail as previous but with the bad encoding
Comment 9 Mattias Eriksson 2009-03-20 07:55:57 UTC
Created attachment 131014 [details]
The sample mail used in the images

The sample mail used in the images
Comment 10 Mattias Eriksson 2009-03-20 08:00:22 UTC
And steps to reproduce this... quite simple using a Google mail account.
Make sure the Google settings for the account use the default encoding. Then change the encoding in Firefox to ISO-LATIN-15 (or ISO-LATIN-1).
Then change your account name to include the word "Räksmörgås" (Swedish for shrimp sandwich).
Then I used the subject "Sample mail: åäö and a räksmörgås"
And in the body: "åäö and a räksmörgås"
(cut and paste from this comment should work if you happen not to have a Swedish keyboard).

Then send the mail to your mapi account.

//Mattias
Comment 11 Milan Crha 2009-03-20 11:03:32 UTC
Mattias, thanks, I've just two things: a) Please use only bugzilla for communication, it's good, enough, and I will appreciate it.
b) the fix I attached yesterday should fix the message preview encoding as you requested. Not the message list yet. The reason for marking it as a text is that I would like to look on the encoding problem in the message-list too, but it'll take a bit longer. And this fix, I hope, is more valuable as is, than waiting longer for complete list.
Comment 12 Mattias Eriksson 2009-03-27 11:30:59 UTC
Tested with 0.26.0.1 and I still have the same problem: wrong encoding in the message list and the message preview.

I also have problems creating MAPI accounts, since it crashes. So I created the profile database using:

mapiprofile --create -C Swedish -P <my username> --default -f .evolution/mapi-profiles.ldb -u <my username> -p <my password> -D <the domain> -I <the servername>

The first time I reported the bug I hadn't used the Swedish part, but that didn't make any difference.

Comment 13 Mattias Eriksson 2009-03-27 12:26:11 UTC
I tested your patch and that fixes the problems I have in the preview! great!

still having encoding issues in the mail list.
Comment 14 Milan Crha 2009-04-06 12:40:57 UTC
Created attachment 132185 [details]
test.eml

It doesn't work with some of the characters even with the patch below, but as I was told, it's just a question of improving libmapi/utf8_convert.l
Comment 15 Milan Crha 2009-04-06 13:02:45 UTC
Created attachment 132187 [details]
test.eml downloaded through IMAP
Comment 16 Julien Kerihuel 2009-04-06 13:05:14 UTC
(In reply to comment #13)
> I tested your patch and that fixes the problems I have in the preview! great!
> 
> still having encoding issues in the mail list.
> 

I know this may look a bit experimental, but could you paste here, or point me to, the list of special Swedish characters with accents?
Comment 17 Milan Crha 2009-04-06 13:27:24 UTC
Created attachment 132189 [details] [review]
proposed ema patch

for evolution-mapi;

This is the patch for this issue. As I understood my chat with Kerihuel on IRC, the rest comes either to libmapi itself or the server itself (where the server might return some letters wrong for properties evolution is using). Please correct me if I'm wrong.

Nonetheless, it'll be great if you can help Kerihuel with those letters in Swedish.
Comment 18 Julien Kerihuel 2009-04-06 13:46:52 UTC
I've sent an email from Outlook where the body (PR_HTML) included the string
below:

"OOOěščřžýáíéúůOOOOOOěščřžýáíéúůOOO"

At the wire level, it looks like (starting at 0x41A and ending at 0x43B):

[0410] 6C 79 3A 41 72 69 61 6C   27 3E 4F 4F 4F EC B9 E8   ly:Arial '>OOO...
[0420] F8 BE FD E1 ED E9 FA F9   4F 4F 4F 4F 4F 4F EC B9   ........ OOOOOO..
[0430] E8 F8 BE FD E1 ED E9 FA   F9 4F 4F 4F 3C 6F 3A 70   ........ .OOO<o:p

Special characters are generally available within UNICODE strings only and are
a set of 2 up to 3 bytes. In the example above, if we look at the first part of
the string: "ěščřžýáíéúů", we have 11 characters. However, we also have 11 bytes
in the wire dump, which means this needs to be interpreted by the client
depending on the charset in use (Outlook, by the way, offers this option).

The same output is retrieved with PR_BODY (utf8).

However when we look at the unicode version of the body (PR_BODY_UNICODE), we
have the - truncated to be relevant - output below:

                            00   4F 00 4F 00 4F 00 1B 01   ..uOOO.. O.O.O...
[00F0] 61 01 0D 01 59 01 7E 01   FD 00 E1 00 ED 00 E9 00   a...Y.~. ........
[0100] FA 00 6F 01 4F 00 4F 00   4F 00 00 00 00

In this case we indeed have 22 bytes used to represent the special char
sequence:

\x1b\x01 for ě
etc.

Conclusion:
- Depending on the property evolution-mapi uses, this problem may occur again
- However, if evo-mapi only uses, or forces usage of, UNICODE strings and libmapi
completes its windows to utf8 conversion table, we can probably get this fixed.
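
The two dumps above can be checked against each other with a short Python sketch. The byte values are copied from the dumps. The codec names are assumptions: the comment does not name the 8-bit charset, and ISO-8859-2 is simply the one under which the 11 bytes yield the 11 characters that were sent.

```python
# Bytes copied from the PR_HTML dump (8-bit) and the PR_BODY_UNICODE dump.
wire_8bit = bytes.fromhex("EC B9 E8 F8 BE FD E1 ED E9 FA F9")
wire_utf16 = bytes.fromhex("1B01 6101 0D01 5901 7E01 FD00 E100 ED00 E900 FA00 6F01")

# Assumption: the 8-bit body is ISO-8859-2; the Unicode body is UTF-16 little endian.
from_8bit = wire_8bit.decode("iso8859_2")
from_utf16 = wire_utf16.decode("utf-16-le")

# Both decode to the same text, but only the UTF-16 form is self-describing;
# the 8-bit form is ambiguous until the client knows which charset applies.
utf8 = from_utf16.encode("utf-8")
print(from_8bit, from_utf16, len(wire_8bit), len(wire_utf16), len(utf8))
```

This illustrates the conclusion rather than libmapi's own conversion code: the Unicode property can be converted to UTF-8 without guessing a code page, while the 8-bit property cannot.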
Comment 19 Mattias Eriksson 2009-04-06 14:07:24 UTC
The swedish special chars are åäö. 

Comment 20 Julien Kerihuel 2009-04-06 16:07:42 UTC
(In reply to comment #19)
> The swedish special chars are åäö. 
> 

I just did a test with openchangeclient + a small patch calling windows_to_utf8 over UNICODE properties and they show up nicely in my gnome terminal (configured to use "UNICODE (UTF-8)" Terminal Character Encoding).


MAILBOX (23 messages)
+-------------------------------------+
message id: <40A1E755DBCDFD448C89379A56FA17920B55AE@exch2k3.openchange2003.local>
subject: Special chars ÄÅöä The bad one is å
From: Julien Kerihuel
To:  Julien Kerihuel


Comment 21 Mattias Eriksson 2009-04-17 14:04:04 UTC
Tested the latest patch, and the preview pane and the body look good now. But the mail list pane still shows the wrong encoding, as in the screenshots I attached.
Comment 22 Mattias Eriksson 2009-04-17 14:05:16 UTC
Also.. I have it packaged for jaunty here:
https://launchpad.net/~snaggen/+archive/ppa if anybody would like to test it.
Comment 23 Mattias Eriksson 2009-05-05 07:59:19 UTC
Can I assist you in any way to get the last encoding problems in the mail-summary list fixed? 

Just let me know if you would like for me to help out/provide more info.
Comment 24 Milan Crha 2009-05-05 12:28:37 UTC
Hi Mattias, I had a chat with Julien and he said that with any libmapi of version 0.8 (from my point of view, the later the better), and with the attached patch applied, it should work fine. The only other step is to refetch your summary, which means closing Evolution and deleting the folders.db file for the MAPI account, somewhere in ~/.evolution/mail/mapi/<account-url>/folders.db
It will fetch all the information again on the next start (which may take some time).
Comment 25 Mattias Eriksson 2009-05-05 12:55:53 UTC
I retested this again. Did:
evolution --force-shutdown
rm -rf .evolution/mail/mapi
evolution

Evolution then fetched the summaries again, but I still have the encoding issue in the summary pane, both for names and subjects. 
However, the patch fixes the problem in the preview pane, both in the headers and body. 

Using libmapi from Ubuntu Jaunty and evolution-mapi with the patch:
libmapi0       1:0.8-2ubuntu1
evolution-mapi                0.26.0.1-0ubuntu3~snaggen1

//Mattias
Comment 26 Milan Crha 2009-05-07 17:02:31 UTC
Hmm. I've still your test mail in my Inbox, and clearing ~/.evolution/mail/mapi and re-fetching the mail, I see in the "From" column "Malmö Aviation".
There was some strange sign instead of that ö before.
Comment 27 Mattias Eriksson 2009-05-08 10:03:00 UTC
Created attachment 134248 [details]
Image showing my mailbox with the patch

I have tested to remove the .evolution/mail/mapi folder. I have then refetched the mailbox again. I also repeated the procedure with evolution and evolution-data-server running with LANG=C but with no change. 

So this screenshot shows how it looks for me with the patch applied.
Comment 28 Johnny Jacob 2009-05-14 08:34:54 UTC
(In reply to comment #17)
> Created an attachment (id=132189) [edit]
> proposed ema patch
> 

Milan, can you test this scenario also : http://bugzilla.gnome.org/show_bug.cgi?id=579150#c1 ?
Comment 29 Milan Crha 2009-05-19 11:32:32 UTC
This wasn't my case, I used only utf8 characters in my subject, no '@' nor '$' or such. As we talked about this on IRC with jony, Mattias, it's possible the issues you see are related also to bug #578287. Could you try with both this and the test patch from there please? I'm interested to know, as I said, it seems to work fine for me. Thanks in advance (and I'm sorry for all the testing requests).
Comment 30 Mattias Eriksson 2009-05-22 13:21:28 UTC
Tested with the patch in 578287 as well, but no luck. The summary still looks the same. 
Comment 31 Mattias Eriksson 2009-06-12 13:51:23 UTC
Ok, I did some digging in the code and these are my guesses... Note that I don't know anything about camel and the other parts, so these are just uneducated guesses.

If we look at the patch that fixes the encoding issues in the preview pane, it does not convert what we get from the server to UTF-8. It simply adds the correct charset tag to the content type.
This tells me that we know for sure that the content from the server is in some special code page.

I looked in the evolution-data-server code and there are a few functions handling charsets, but I can't see them used in camel-mapi-summary.c.
So my guess is that camel-mapi-summary is taking some shortcut that makes it end up storing the raw summary in the database instead of utf-8.

However, I don't know how this should work, so I really can't see what is wrong and fix it..

I might try to dig deeper into this later on... but I guess it is a lot quicker if done by someone who knows what they are doing.

//Mattias
Comment 32 Milan Crha 2009-06-12 15:13:48 UTC
Thanks for all the updates and the effort you are putting in here. My tests work fine with the patch from comment #17, though, and it does exactly what you are requesting here (note that 'utf8tolinux' is a misleading name, as it converts to utf8).
Hmm, I just got an idea: maybe your server returns those UNICODE equivalents, which are supposed to be converted too? Let me try to ask Kerihuel.
Comment 33 Mattias Eriksson 2009-06-14 09:00:52 UTC
Ok, let me say that you guys rock, and that I don't rock so much...

For some reason it seems that my latest tests were performed with an obsolete patch. When I redid my tests with the patch from comment #17, it worked great!

Sorry for confusing the discussion by not using the correct patches. Anyway, as I said above, it works now and I'm happy that you were able to solve it.

//Mattias
Comment 34 Mattias Eriksson 2009-06-14 09:11:18 UTC
When using the correct patch from comment #17 I received a meeting invite where the description contained åäö. This caused Evolution to say (translated from Swedish):
The message claims to contain a calendar, but the calendar is not a valid iCalendar.

So I guess the patch needs to be extended to convert calendars to utf8 as well.

//Mattias
Comment 35 Milan Crha 2009-06-15 09:52:05 UTC
Created attachment 136616 [details] [review]
proposed ema patch ][

for evolution-mapi;

Please try with this updated patch.
By the way, you scared me a bit, as I wasn't able to understand why it doesn't work for you but works for me. Good that you found the issue :)
Comment 36 Mattias Eriksson 2009-06-15 11:06:10 UTC
With the patch I see the charset=... in the Content-Type part. It still complains about the calendar not being a valid icalendar. So I guess that the charset doesn't have any effect. When I fetch the calendar event using imap I get it with the content-type text/calendar; charset="utf-8";

This is the mapi calendar event (well, the beginning of it) using this patch...
--=-f9b8fluhTjQTqdcd36o9
Content-Transfer-Encoding: 8bit
Content-Type: text/calendar; charset="CP28591"

BEGIN:VCALENDAR
CALSCALE:GREGORIAN
PRODID:-//Ximian//NONSGML Evolution Calendar//EN
VERSION:2.0
METHOD:REQUEST
BEGIN:VEVENT
UID:0000000000000000
DTSTAMP:20090612T142344Z
CREATED:20090615T105339Z
LAST-MODIFIED:20090612T142344Z
SUMMARY:MS2
DESCRIPTION:Nõr: den 16 juni 2009 15:00-16:00 (GMT+01:00) Amsterdam\, 
 Berlin\, Bern\, Rom\, Stockholm\, Wien.\r\nVar: Konf. rum plan 

Comment 37 Mattias Eriksson 2009-06-15 11:31:30 UTC
More strangeness... I just got a new event from the same guy, this event was recognized but still has the same encoding issues. I will look at the source to see if I can find any difference that may cause one to fail and one to not fail.

I then tried to send an invite from google calendar with the following set:
Place: Närmare Jul
Description: Äta en räksmörgås

This causes the mapi backend to crash. However this crash might not be related at all, it might be caused by the conflict detection or anything. 

//Mattias
Comment 38 Mattias Eriksson 2009-06-15 11:38:07 UTC
Ok, looking at both raw messages I see the following important difference:
Working message ends with:
CLASS:PUBLIC
BEGIN:VALARM
X-EVOLUTION-ALARM-UID:20090615T112310Z-26147-1000-25374-10@saphira
ACTION:DISPLAY
TRIGGER;VALUE=DURATION;RELATED=START:PT0S
END:VALARM
END:VEVENT
END:VCALE
--=-jjvNiykRJ5F9irzFAFKj--


Non working ends with:
ORGANIZER;CN=Daniel Wik:MAILTO:DWik@vizrt.com
PRIORITY:1
CLASS:PUBLIC
BEGIN:VALARM
X-EVOLUTION-ALARM-UID:20090615T112152Z-26147-1000-25374-4@saphira
ACTION:DISPLAY
TRIGGER;VALUE=DURATION;RELATED=START:PT0S
END:VALARM
E
--=-0LXmfX2AaDdn+LIkx3SS--

So I guess the first one works since the last END part is OK, and the parser really doesn't care that the VCALENDAR part is truncated. However, the non-working message is even more truncated, so the END part of VEVENT is cut off. So I guess there is some general bug in multipart attachments or something similar, not related to the encodings.

So summary:
The encoding is still wrong for the calendar but it isn't that causing the event to be invalid.
The attachment is truncated causing the event to be invalid.
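
The difference between the two message tails can be made explicit by checking that every BEGIN: line has a matching END: line. A minimal Python sketch follows; `unbalanced_components` is a hypothetical helper written for illustration and is not how Evolution or its iCalendar parser validates events.

```python
def unbalanced_components(ical_text):
    """Return the names of BEGIN: components that never get a matching END:."""
    stack = []
    for line in ical_text.splitlines():
        line = line.strip()
        if line.startswith("BEGIN:"):
            stack.append(line[len("BEGIN:"):])
        elif line.startswith("END:"):
            name = line[len("END:"):]
            # Only a complete, matching END: closes the innermost component.
            if stack and stack[-1] == name:
                stack.pop()
    return stack


# Tails modelled on the two messages above (component lines only).
working = "BEGIN:VCALENDAR\nBEGIN:VEVENT\nBEGIN:VALARM\nEND:VALARM\nEND:VEVENT\nEND:VCALE"
broken = "BEGIN:VCALENDAR\nBEGIN:VEVENT\nBEGIN:VALARM\nEND:VALARM\nE"

print(unbalanced_components(working))
print(unbalanced_components(broken))
```

In this model the working message leaves only the outer VCALENDAR open, while the failing one also leaves VEVENT open, which matches the guess that a missing END:VEVENT is what makes the event invalid.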
Comment 39 Milan Crha 2009-06-15 12:03:06 UTC
OK, let's open a new bug for this please, as it's really unrelated to encoding (though probably near to changes in the code lines here).

From my point of view:
a) UID:0000000000000000 is kinda suspicious
b) Both mails should end with END:VCALENDAR, same as they BEGIN:VCALENDAR

Could you try with a change in bug #585835 too, it uses there some already freed memory and maybe, will help. Though probably not. Just clear your message cache (not the folder summary this time), so the message will be fetched again. Thanks.
Comment 40 Mattias Eriksson 2009-06-15 12:14:24 UTC
I'm thinking it may be something related to the encoding causing some length calculation to go wrong, e.g. one that assumes one char is one byte.

//Mattias
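
A minimal Python sketch of the kind of miscalculation suspected here. It is purely illustrative: the thread does not establish that this is what the code does, and the text is the description from the Google Calendar invite in comment 37.

```python
text = "Äta en räksmörgås"
encoded = text.encode("utf-8")

n_chars = len(text)      # number of characters
n_bytes = len(encoded)   # number of bytes; each of Ä, ä, ö, å takes two in UTF-8

# Hypothetical bug: a buffer sized by character count but filled with bytes.
truncated = encoded[:n_chars].decode("utf-8", errors="ignore")
print(n_chars, n_bytes, truncated)
```

With four non-ASCII characters the byte length exceeds the character count by four, so a copy sized by characters loses the end of the text.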
Comment 41 Mattias Eriksson 2009-06-15 12:22:50 UTC
The patch from  bug #585835 didn't apply to 2.26.0.1 that I use...

//Mattias
Comment 42 Milan Crha 2009-06-15 13:06:47 UTC
(In reply to comment #40)
> I'm thinking if it may be something related to the encoding causing some length
> calculation go wrong that does some assumption like one char one byte. 

I like the idea, though I'm unable to reproduce it myself at the moment. I do something incorrectly for sure. Anyway, please open new bug report for the calendar issue and let's investigate there (if you can, please CC me there too). Thanks in advance.

(In reply to comment #41)
> The patch from  bug #585835 didn't apply to 2.26.0.1 that I use... 

Oh, pity, then skip this.
Comment 43 Mattias Eriksson 2009-06-15 14:53:20 UTC
Tested the following patch and that seemed to solve the encoding issue of the attachment by making it utf8. I'd hoped that it would fix the truncation issue since g_utf8_strlen would actually work on a utf8 string. However, that issue still exists, but the message was truncated a few chars later (well, it could be the same length since there were other changes, see below)

But it is interesting that when I fixed the encoding, the attachment that was working stopped working.

The part 

DESCRIPTION:Nõr: den 16 juni 2009 15:00-15:30 (GMT+01:00) Amsterdam\, 
 Berlin\, Bern\, Rom\, Stockholm\, Wien.\r\nVar: The Kitchen - floor

now looks like

DESCRIPTION:När: den 16 juni 2009 15:00-15:30 (GMT+01:00) Amsterdam\ 
 Berlin\ Bern\ Rom\ Stockholm\ Wien.\r\nVar: The Kitchen - floor

So utf8tolinux seems to remove the , sign...

The ugly test patch I used just to try to see what might have been wrong.

diff -uBbr evolution-mapi-0.26.0.1/src/camel/camel-mapi-folder.c evolution-mapi-0.26.0.1.tests/src/camel/camel-mapi-folder.c
--- evolution-mapi-0.26.0.1/src/camel/camel-mapi-folder.c	2009-06-15 16:45:37.000000000 +0200
+++ evolution-mapi-0.26.0.1.tests/src/camel/camel-mapi-folder.c	2009-06-15 16:34:37.000000000 +0200
@@ -947,10 +947,13 @@
 	}
 
 	if (g_str_has_prefix (msg_class, IPM_SCHEDULE_MEETING_PREFIX)) {
-		guint8 *appointment_body_str = (guint8 *) exchange_mapi_cal_util_camel_helper (item_data->properties, 
+		guint8 *tmp_appointment_body_str = (guint8 *) exchange_mapi_cal_util_camel_helper (item_data->properties, 
 									     item_data->streams, 
 									     item_data->recipients, item_data->attachments);
 
+		guint8 *appointment_body_str = (guint8 *)  utf8tolinux((const gchar *) tmp_appointment_body_str);
+		        item->header.cpid = NULL;
+
 		body = g_new0(ExchangeMAPIStream, 1);
 		body->proptag = PR_BODY;
 		body->value = g_byte_array_new ();
@@ -1079,11 +1082,14 @@
 	if (body) { 
 		char *buff = NULL;
 
-		if (item->is_cal)
+		if (item->is_cal) {
 			type = "text/calendar";
-		else
+		    buff = g_strdup_printf ("%s; charset=\"utf-8\"; method=REQUEST", type);
+			type = buff;
+		} else {
 			type = (body->proptag == PR_BODY || body->proptag == PR_BODY_UNICODE) ? 
 				"text/plain" : "text/html";
+		}
 
 		if (item->header.cpid) {
 			buff = g_strdup_printf ("%s; charset=\"CP%d\"", type, item->header.cpid);
Comment 44 Mattias Eriksson 2009-06-17 15:43:43 UTC
I have fixed the truncation issue; there is a patch attached in the other bug...
I have also managed to get this last calendar encoding issue to work by converting the event to utf8. However, that required a fix in libmapi's utf8tolinux function so that it does not drop important characters like ,;@
so I changed libmapi/utf8_convert.l:
From:
chars [0-9A-za-z\_\'\.\"/\+\-=\{\}:]
to
chars [0-9A-za-z\_\'\.\,\"/\+\-=\{\}:\;\@]

Note that I have no clue about flex; I just changed it on a hunch and it works, I don't know why :)

Will attach the diff to evolution-mapi
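
Why the change helps can be modelled in a few lines of Python, under a loudly stated assumption: that the converter keeps only characters matched by the `chars` class and drops the rest. The real flex lexer has other rules (for whitespace and for the converted non-ASCII characters, for instance), so a space is added to both classes here purely to keep the model readable. `keep_matching` is a hypothetical helper.

```python
import re

# Character classes transcribed from the flex rule quoted above.
# Assumption: the trailing space is added only for this model.
OLD_CLASS = r"""[0-9A-za-z_'."/+\-={}: ]"""
NEW_CLASS = r"""[0-9A-za-z_'.,"/+\-={}:;@ ]"""


def keep_matching(text, char_class):
    """Hypothetical model: drop every character the class does not match."""
    return "".join(ch for ch in text if re.fullmatch(char_class, ch))


sample = "Amsterdam\\, Berlin"
print(keep_matching(sample, OLD_CLASS))            # comma is lost
print(keep_matching(sample, NEW_CLASS))            # comma survives
print(keep_matching("DWik@vizrt.com", OLD_CLASS))  # @ is lost
```

The range `A-z` spans ASCII 0x41 to 0x7A, which includes the backslash. That would explain why `\,` in the calendar text became a lone `\` in comment 43: the backslash matched the old class and the comma did not.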
Comment 45 Mattias Eriksson 2009-06-17 15:46:45 UTC
Created attachment 136844 [details] [review]
Fix that converts calendar events to utf8

This patch is an addon to the patch from comment #17. 
It also depends on the fix to libmapi/utf8_convert.l without this fix you will not end up with a valid calendar attachment.
Comment 46 Mattias Eriksson 2009-06-17 15:55:14 UTC
Note that with this patch I still lose chars like 'é' due to utf8tolinux stripping away chars. So I suggest a general fix for that problem to make it not remove all unknown chars. But I don't know anything about flex, so I hope someone else can fix that.
Comment 47 Mattias Eriksson 2009-06-17 19:51:28 UTC
Hmmm... I realize that some problems still remain from the original core problem.

I have received a mail with content type:
Content-Type: text/html; charset="CP28591"

The body looks something like
<br><font size=2 face="sans-serif">Hej Mattias,</font>
<br><font size=2 face="sans-serif">Vänliga hälsningar<br></font>

However, it renders as

Hej Mattias
[?]

since the second font tag contained non-ascii characters.

Another mail I get which the patch from comment #17 solved was:
Content-Type: text/plain; charset="CP65001"

where the body contains Swedish characters that are displayed just fine. (I guess this guy runs Linux; most Windows users send HTML mails, so I do not have any text mail from a Windows user to compare with.)

//Mattias

Comment 48 Milan Crha 2009-06-18 09:36:20 UTC
(In reply to comment #47)
> ...
> Content-Type: text/html; charset="CP28591"
> ...
> Content-Type: text/plain; charset="CP65001"
> ...

Such big numbers for a charset are incorrect; they should be only 4 digits long. Something seems to be working incorrectly. When you run Evolution from a console, you should also notice a warning message there about an unknown code page and that it falls back to UTF-8. I noticed this myself too. Maybe 65001 is the code for UTF-8, but I do not know what the other one is.
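
For reference, the five-digit values are Windows code page identifiers: 28591 is ISO-8859-1, 28592 is ISO-8859-2 and 65001 is UTF-8. A converter that does not recognise labels such as "CP28591" needs them translated to a charset name it knows. A minimal Python sketch of such a translation follows; `charset_for` is a hypothetical helper, the table only covers identifiers seen in this thread, and this is not the approach of any patch attached here.

```python
# Windows code page identifiers seen in this thread -> Python codec names.
CODE_PAGES = {
    1252: "cp1252",
    28591: "iso8859_1",
    28592: "iso8859_2",
    65001: "utf_8",
}


def charset_for(label):
    """Map a label such as 'CP28591' to a codec name Python can open."""
    digits = label[2:] if label.upper().startswith("CP") else label
    return CODE_PAGES[int(digits)]


# Body text from comment 47, encoded the way a CP28591 part would carry it.
body = b"V\xe4nliga h\xe4lsningar"
print(body.decode(charset_for("CP28591")))
```

An unknown identifier raises `KeyError` in this sketch; a real client would need to decide on a fallback instead.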
Comment 49 Mattias Eriksson 2009-07-13 14:05:15 UTC
Would you guys like me to do some more investigation? If so, just let me know. I really want this bug to be fixed so I can start using the Exchange calendar with Evolution.
Comment 50 Milan Crha 2009-08-10 17:14:35 UTC
Created attachment 140351 [details] [review]
proposed ema patch ]I[

for evolution-mapi;

This is a similar patch to the one above, only the information about the code page in use is taken from PR_MESSAGE_CODEPAGE, not PR_INTERNET_CPID. It seems to work fine for your test message. The rest, about incorrectly treated characters in the decode function, is out of scope of this bug.
Comment 51 Milan Crha 2009-08-10 17:15:54 UTC
Jony, I believe there will be some followup patches and/or bugs, but please take care of this so long pending and then fix the rest. Ema is so big...
Comment 52 Milan Crha 2009-08-10 17:51:13 UTC
errr, messages sent by evo through exchange imap are reported with CP1252 for me, but inside is nice utf-8 encoding. By the way, the CP65001 is supposed to be UTF-8, maybe that internet encoding is the proper thing. Let me try slightly more tomorrow.
Comment 53 Milan Crha 2009-08-11 18:42:16 UTC
Created attachment 140469 [details] [review]
proposed ema patch IV

for evolution-mapi;

This is the final patch, which works fine for me on things I tried. The subject in summary is also shown properly now, for both messages from evo and that test message. Same for the message body. This should be ready to live in sources.
Comment 54 Mattias Eriksson 2009-08-12 06:38:24 UTC
Have you tested the patch with calendar attachments containing non-ascii chars?
Comment 55 Milan Crha 2009-08-12 07:40:35 UTC
(In reply to comment #54)
> Have you tested the patch with calendar attachments containing non-ascii chars?

Nope, this is for mails only. I thought it's covered by your patch from comment #45, isn't it?
Comment 56 Johnny Jacob 2009-08-12 13:01:40 UTC
*** Bug 577323 has been marked as a duplicate of this bug. ***
Comment 57 Johnny Jacob 2009-08-12 13:02:40 UTC
*** Bug 591060 has been marked as a duplicate of this bug. ***
Comment 58 Johnny Jacob 2009-08-12 13:03:18 UTC
*** Bug 572144 has been marked as a duplicate of this bug. ***
Comment 59 Johnny Jacob 2009-08-12 13:05:12 UTC
*** Bug 586525 has been marked as a duplicate of this bug. ***
Comment 60 Mattias Eriksson 2009-08-12 18:26:49 UTC
My patch from Comment #45, is not usable without fixing the libmapi/utf8convert fix from Comment #44. So this must be fixed at the same time I guess, if this solution with converting the attachments to utf8 should be used. 
But I'm not sure that the conversion fix in libmapi is good, since I don't know flex as I said in my Comments. I'm not sure that this is the "correct" way to solve it, I guess it would be better if evolution could work by setting the charset for the attachment... 

//Mattias
Comment 61 Johnny Jacob 2009-08-13 06:53:58 UTC
(In reply to comment #53)
> Created an attachment (id=140469) [edit]
> proposed ema patch IV
> 
> for evolution-mapi;
> 

Please commit. 

And feel free to use http://www.gnome.org/~jjohnny/tmp/569321-updated.patch I resolved some patch conflicts with master.

Thanks
Comment 62 Milan Crha 2009-08-13 07:59:45 UTC
Created commit 361893b in ema master (0.27.91+)

Thanks for the patch, interestingly my git creates almost the same version :)
Comment 63 Milan Crha 2009-08-13 08:39:41 UTC
I just tried to create a test meeting with couple of UTF-8 characters in OWA and I see all of them correctly, on a MAPI master after the above commit. I guess it's related to changes Johnny mentioned above (patch conflicts with master).
Though I see charset="CP28592" in the mail, which is strange, but works and I do not see any warning on the console with this code page.

(I'm closing this bug, it's awfully long and hard to read, if you encounter any similar issue, please open a new bug. Thank you.)
Comment 64 Mattias Eriksson 2009-10-19 13:52:06 UTC
I just upgraded to Ubuntu Karmic beta just to be able to test this. The mail body and summary seem to be OK!

However, the calendar attachment still seems to be in the wrong encoding. The encoding problems seem to be pretty similar, with the same kind of bad encoding as in the mail summary. (I can create a meeting invite with åäö and have you as participants so you get the invite if you need something to play with). I was going to attach the mail but it crashed before I could get it pasted here and now it keeps crashing...
Anyway, the calendar with the bad encoding has the following Content-Type.
Content-Type: text/calendar; charset="CP28591"


Also, it seems that all mails containing åäö get filtered somehow so that they lose the @ character in the summary pane. If you look at the mail source it is still there, so it is just a presentation thing

And as always, just let me know what you need me to do for you to get the info you need.
Comment 65 Mattias Eriksson 2009-10-19 14:02:29 UTC
Also, and it seems related: I got a mail containing attachments and some text that contains åäö. It is only displayed as

[?]

but the attachment seems to work.

The relevant part of the mail seems to be the following (from the last line of the mail header to the beginning of the attachment). This sample is my salary specification, so I will not include the full mail here :)
Note that the text Lnespecifikation 2009-10-23 really should be Lönespecifikation 2009-10-23.

Content-Type: multipart/related; type="multipart/alternative"; boundary="=-F0Q0iqmCOg83APDYvpOW"


--=-F0Q0iqmCOg83APDYvpOW
Content-Transfer-Encoding: 8bit
Content-Type: text/html; charset="CP28591"

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 08.00.0681.000">
<TITLE></TITLE>
</HEAD>
<BODY>
<!-- Converted from text/rtf format -->

<P><FONT FACE="Calibri">Lnespecifikation 2009-10-23</FONT>
</P>

</BODY>
</HTML>
--=-F0Q0iqmCOg83APDYvpOW
Content-Disposition: attachment; filename*=ISO-8859-1''mer%F1vizrt.com_2009-10-23.pdf
Content-Type: application/octet-stream
Content-ID: <1255959288.7490.2.camel@saphira>

%PDF-1.3

%1279

1 0 obj

<<

/Type /Catalog
Comment 66 Milan Crha 2009-10-20 14:25:25 UTC
> I can create a meeting invite with åäö and have you as
> participants so you get the invite if you need something to play with

yes, please send me one on my bugzilla email for testing.
Comment 67 Mattias Eriksson 2009-10-21 06:39:23 UTC
an invite has been sent.
Comment 68 Milan Crha 2009-10-21 16:08:40 UTC
(In reply to comment #64)
> Also, it seems that all mails containing åäö get filtered somehow so they
> lose the @ character in the summary pane. If you look at the mail source it
> is still there, so it is just a presentation thing

Correct, the MAPI text parser is dropping them, but Kerihuel is on that and will fix it (in the openchange/mapi library).

(In reply to comment #67)
> an invite has been sent.

Correct. I realized that evo's MAPI is not composing messages from MAPI tags and properties quite right, and I know jony was talking about an OCXMAIL implementation in evo's MAPI, which will replace the current implementation, and the rest should work fine after that. Unfortunately that's quite a lot of work, at least from my point of view as a not very knowledgeable reader of MAPI code.
Comment 69 Valent Turkovic 2009-11-27 08:20:28 UTC
Hi, I'm new to this discussion. It looks like the same bug bit me. We are a Croatian ISP and we use Exchange Mail Server 2007.

I have set up Evolution and it connects without problems, but I can't read half of my email messages. I'm running the latest Evolution 2.28 on Fedora 12.

This is the message I get in the console:
camel-WARNING **: Could not open converter for 'CP28592' to 'UTF-8' charset

I have one example message that I can't open and will add it as an attachment.

Is there a workaround so that I can use Evolution and read all messages, or do I have to wait for a bugfix?
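
For what it's worth, the 'CP28592' in the warning above looks like the Windows code page number for ISO-8859-2 (the 2859x code pages correspond to ISO-8859-x as far as I know), a name the converter apparently doesn't recognize. A rough sketch of how such a name could be rewritten into one a converter is more likely to know; normalize_charset is a made-up helper for illustration, not an existing camel or openchange function:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrites "CP28591".."CP28599" as
 * "ISO-8859-1".."ISO-8859-9". Returns 0 on success, -1 when the name
 * is not of that form or the output buffer is too small. */
int normalize_charset(const char *name, char *out, size_t outlen)
{
    assert(name != NULL && out != NULL);

    if (strlen(name) == 7 &&
        toupper((unsigned char)name[0]) == 'C' &&
        toupper((unsigned char)name[1]) == 'P' &&
        strncmp(name + 2, "2859", 4) == 0 &&
        name[6] >= '1' && name[6] <= '9') {
        int n = snprintf(out, outlen, "ISO-8859-%c", name[6]);

        /* snprintf reports the length it wanted; reject truncation. */
        return (n > 0 && (size_t)n < outlen) ? 0 : -1;
    }

    return -1;
}
```

Called with "CP28592" this gives "ISO-8859-2"; any other name is rejected so the caller can keep using it unchanged.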
Comment 70 Valent Turkovic 2009-11-27 08:21:20 UTC
Created attachment 148576 [details]
example of one message not displayed in Evolution

example of one message not displayed in Evolution
Comment 71 Valent Turkovic 2009-12-10 12:43:26 UTC
As per this Fedora bug:
https://bugzilla.redhat.com/show_bug.cgi?id=496594

I can just confirm that with evolution-mapi-2.28.1 this bug is still present, at least in Fedora 12 :(
Comment 72 Valent Turkovic 2009-12-28 09:30:04 UTC
Created attachment 150487 [details]
UTF-8 issues in subject

I'm attaching an example of problems displaying UTF-8 characters in the Subject. I'm not sure if this is the same bug, so please tell me if I need to open a new one.

In this attachment you will see the header showing the sender as "eljko Kristek <jogijware.xx>" instead of "Željko Kristek <jogi@jware.xx>", which is correctly displayed in the email body (just see the screenshot).

Thank you in advance.
Comment 73 Valent Turkovic 2009-12-28 09:38:40 UTC
I sent one mail to my boss and she couldn't read it because all the Croatian letters got messed up! I looked in my sent mail and saw why.

word "Ra=C4=8Dun" instead of "Račun"
"Sva=C4=8Di=C4=87a" instead of "Svačića"
"uop=C4=87e" instead of "uopće"
"=C5=A1kola" instead of "škola"
"povr=C5=A1inu" instead of "površinu"


Do you need more info? Is this the same bug, or do I need to open a new bug for the sending issues?
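
The "=C4=8D" style sequences above are quoted-printable escapes of the UTF-8 bytes, so the text itself looks correctly encoded and just isn't being decoded for display. A minimal decoder sketch to check this by hand; qp_decode and hex_value are names made up for this example and are not taken from Evolution or openchange:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: decodes quoted-printable text into raw bytes.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or (size_t)-1 if the output buffer is too small. */
size_t qp_decode(const char *in, char *out, size_t outlen)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (outlen == 0)
        return (size_t)-1;

    while (*in != '\0') {
        unsigned char byte;

        if (in[0] == '=' && hex_value(in[1]) >= 0 && hex_value(in[2]) >= 0) {
            /* "=XX" is one escaped byte. */
            byte = (unsigned char)(hex_value(in[1]) * 16 + hex_value(in[2]));
            in += 3;
        } else if (in[0] == '=' && in[1] == '\r' && in[2] == '\n') {
            in += 3;            /* soft line break (CRLF) */
            continue;
        } else if (in[0] == '=' && in[1] == '\n') {
            in += 2;            /* soft line break (bare LF) */
            continue;
        } else {
            byte = (unsigned char)*in++;   /* literal character */
        }

        if (n + 1 >= outlen)    /* keep room for the NUL */
            return (size_t)-1;
        out[n++] = (char)byte;
    }

    out[n] = '\0';
    return n;
}
```

Decoding "Ra=C4=8Dun" this way gives the bytes R, a, 0xC4, 0x8D, u, n, which is "Račun" in UTF-8.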
Comment 74 Valent Turkovic 2009-12-28 09:39:18 UTC
This is all on Fedora 12 running Evolution 2.28.2
Comment 75 Milan Crha 2010-01-04 19:28:23 UTC
The issue with receiving UTF-8 characters is about windows_to_utf8, which drops some characters, like "()@," and a few others, so they can get lost from the address. The conversion of some code to "Ž" fails for some reason; it's hard to tell why. The preview panel uses the transport headers, which may not always be available and which are usually received in UTF-8 already, so the preview panel doesn't suffer from this. There should be a fix in a recent openchange, but I failed to find it in their ChangeLog.

The issue with sending is something different: the letters are properly encoded in quoted-printable, but it seems they are not correctly transformed into the MAPI structure, so they fail to show properly on the receiving side and/or in the Sent folder in Evolution.
Comment 76 Valent Turkovic 2010-01-05 08:56:12 UTC
Should I open a new bug report for either of these two issues, or is this the right place to track them?

Do you need any more information from me so that you can fix this faster?
Comment 77 Mattias Eriksson 2010-01-05 09:23:02 UTC
Milan, about the fix in openchange... I did a minor fix mentioned in comment #44, but that was never sent upstream (since it was just a kludge to show the problem... I do not know what the correct fix is).
But hopefully you are referring to some real fix in openchange.

//Mattias
Comment 78 Valent Turkovic 2010-01-15 08:02:54 UTC
Milan: Do you know which version of openchange, and which package versions
in Fedora, this corresponds to? I would like to try these new versions and
report back.
Comment 79 Milan Crha 2010-01-15 13:15:57 UTC
Thanks for the ping, I tend to forget about this. I managed to catch an upstream openchange developer, and the fix for the conversion function (not dropping the letters) didn't make it into 0.9, so there is no fix for this yet. We decided to do a workaround in evolution-mapi until it is fixed in openchange itself.
Comment 80 Milan Crha 2010-01-15 16:59:22 UTC
Created attachment 151485 [details] [review]
ema patch/hack

for evolution-mapi;

It took some investigation, but I finally managed to create this. It contains a hack in the function, which admittedly doesn't always work, but should be better than nothing. I also tried to find another way of decoding the returned string from the Windows encoding to UTF-8, but nothing known to me worked (locale -m), so it's some encoding other than those installed on my system. Nonetheless, I finally realized that the summary fetch doesn't request the UNICODE versions of strings, which explains why that "Ž" got lost there but was kept in the message preview. I do not know why I didn't realize that earlier. Newly fetched messages should be shown better in the summary part (message list) with this patch applied.

Valent, this doesn't cover the issue with quoted-printable encoding in the message body. Do you have a new bug report for that already? I would prefer to deal with it somewhere else than here. Thanks.
Comment 81 Milan Crha 2010-01-15 17:05:52 UTC
Created commit fa61217 in ema master (0.29.6+)
Created commit 6de81b0 in ema gnome-2-28 (0.28.3+)
Comment 82 Milan Crha 2010-01-15 17:10:03 UTC
Can we open new bug reports for the remaining items here, and maybe add references to them here, as this bug report is really long and already pretty hard to read?

From what I see, we have two left:
- Mattias' calendar attachments (comment #64)

- Valent's quoted-printable encoding issue in message body
  not recognized (comment #73)

If I overlooked any, feel free to add.
Comment 83 Valent Turkovic 2010-01-28 09:55:03 UTC
new bug opened from comment #73:
https://bugzilla.gnome.org/show_bug.cgi?id=608320
Comment 84 Julien Kerihuel 2010-02-01 23:02:13 UTC
Created attachment 152782 [details] [review]
Fix problem on openchange side

Apply it on latest openchange rev (r1695 or above)
Comment 85 Julien Kerihuel 2010-02-01 23:04:06 UTC
Hi all,

I have a pending patch in OpenChange that should solve the whole problem for any properties (as long as you use the _UNICODE version of the tag). My patch also gets rid of the ugly windows_to_utf8 and utf8_lexer we introduced a long time ago.

You now receive a UTF-8 string that you can convert to any specific encoding as needed.

I have tested in a Linux console and it properly reads an email with:
1. subject in Russian
2. part1 of the body (plain text) in Russian
3. part2 in French
4. part3 quote in Spanish

I would appreciate any feedback on this prior to merging this patch to trunk.

Please apply the following patch to the latest openchange trunk revision. 

https://bugzilla.gnome.org/attachment.cgi?id=152782


To test it, just use the "openchangeclient --fetchmail" command. If your Linux console is configured with the UTF-8 charset encoding, you should notice the difference.

Cheers
Comment 86 Milan Crha 2010-02-02 05:20:50 UTC
+++ utils/openchange-tools.c	(working copy)
@@ -56,13 +56,13 @@
 
 	if (((proptag & 0xFFFF) == PT_STRING8) ||
 	    ((proptag & 0xFFFF) == PT_UNICODE)) {
+		proptag = (proptag & 0xFFFF0000) | PT_UNICODE;
+		str = (const char *) find_SPropValue_data(aRow, proptag);
+		return (void *)str;
+
> 		proptag = (proptag & 0xFFFF0000) | PT_STRING8;
> 		str = (const char *) find_SPropValue_data(aRow, proptag);
> 		if (str) return (void *)str;
-
-		proptag = (proptag & 0xFFFF0000) | PT_UNICODE;
-		str = (const char *) find_SPropValue_data(aRow, proptag);
-		return (void *)str;
 	} 
 
 	return (void *)find_SPropValue_data(aRow, proptag);


The highlighted part above will never be reached; is that intended?

And what about using PR_COMMENT_UNICODE? Maybe there are more, but this one was shown in the patch.
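
To illustrate the point about the unreachable lines: if the intent is to prefer the PT_UNICODE variant and fall back to PT_STRING8 only when it is missing (that is my assumption, not something the patch states), the lookup could look roughly like the sketch below. The prop_entry table, find_prop, with_type and lookup_string are hypothetical stand-ins so the example compiles on its own; they are not the openchange API. Only the type codes PT_STRING8 (0x001E) and PT_UNICODE (0x001F) are the standard MAPI values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PT_STRING8 0x001Eu   /* standard MAPI type code */
#define PT_UNICODE 0x001Fu   /* standard MAPI type code */

/* Hypothetical stand-in for a fetched property row. */
struct prop_entry {
    uint32_t tag;
    const char *value;
};

/* Hypothetical stand-in for the real property lookup. */
static const char *find_prop(const struct prop_entry *props, size_t count,
                             uint32_t tag)
{
    for (size_t i = 0; i < count; i++)
        if (props[i].tag == tag)
            return props[i].value;
    return NULL;
}

/* Keeps the property ID (upper 16 bits) and swaps the type (lower 16). */
uint32_t with_type(uint32_t proptag, uint32_t type)
{
    return (proptag & 0xFFFF0000u) | type;
}

/* For string tags, try the PT_UNICODE variant first and fall back to
 * PT_STRING8 only when it is absent, so both branches are reachable. */
const char *lookup_string(const struct prop_entry *props, size_t count,
                          uint32_t proptag)
{
    uint32_t type = proptag & 0xFFFFu;
    const char *str;

    assert(props != NULL || count == 0);

    if (type != PT_STRING8 && type != PT_UNICODE)
        return find_prop(props, count, proptag);

    str = find_prop(props, count, with_type(proptag, PT_UNICODE));
    if (str != NULL)
        return str;

    return find_prop(props, count, with_type(proptag, PT_STRING8));
}
```

The only difference from the hunk above is the "if (str != NULL)" guard on the first return, which is what keeps the second lookup from being dead code.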
Comment 87 Julien Kerihuel 2010-02-02 11:13:41 UTC
Indeed,

For the first one, it is more of a cut-and-paste typo.

I may have missed this one, will fix it when committing the code.
Comment 88 Milan Crha 2010-02-02 12:57:21 UTC
OK, I tested the patch and it seems to work fine, with openchangeclient --fetchmail and in evolution-mapi on PT_UNICODE string read (which worked even before) and write, which is a very nice improvement (there can be almost anything in the Subject with revision 1695+; I have 1698 at the moment). The only thing is that it seems not to convert PT_STRING8 strings to UTF-8, or it does the conversion with the wrong code page. Not sure yet (we are discussing it on IRC at the moment, I just wanted to make a record).