GNOME Bugzilla – Bug 517440
Gmail's IMAP server corrupts some TNEF attachments
Last modified: 2016-05-04 16:01:17 UTC
Please describe the problem:
Some TNEF attachments aren't correctly decoded (by the TNEF Attachment Decoder plugin). In those cases a corrupted version of the attachment is used.

Steps to reproduce:
1. Open a mail with a (base64 encoded) TNEF attachment.

Actual results:
The attachment isn't decoded properly. View / Message Source shows a message (supposedly base64 encoded) that is part plain text, part (corrupted?) binary data.

Expected results:
The attachment is decoded properly. View / Message Source should show the base64 encoded source of the TNEF attachment.

Does this happen every time?
No: only with some TNEF attachments, under some circumstances.

Other information:
A few observations:
- these are TNEF attachments that are actually only containers for the real attachments (say: two PDF files);
- corruption happens when using an IMAP account;
- no corruption occurs (and the enclosed "real" attachments are properly displayed) when importing these same emails locally (using a copy of these mails from the IMAP server, converted to MBOX format).

Not sure what happens here. I'll have to investigate further.
can you attach an example message here (please remove any confidential data before)?
Removing confidential information from these messages wouldn't leave much to analyze, so I won't attach an example message. I did however do some further testing. This testing suggests it is either a problem with the remote IMAP server or a problem in the communication between Evolution and that server: I set up a local IMAP server (on a NAS, which makes that _really_ easy) and the TNEF Attachment Decoder had no problems with the (almost identical) TNEF attachments to messages on my local IMAP server. So the next step will be using another mail client to read the messages with the TNEF attachments on the remote IMAP server and see what happens. To be continued ...
First test: Thunderbird 2.0.0.9 (using an rpm from the updates repository for Fedora 8) retrieved the message from the remote IMAP server just fine (when looking at the message's source; I wouldn't know whether Thunderbird actually handles TNEF attachments).

Second test: renamed the folder for this IMAP account in .evolution/mail/imap, disabled the TNEF Attachment Decoder plugin, restarted Evolution. Viewed the source of one of the offending messages (which seems to force Evolution to regenerate the attachment data) and the same corruption occurred.

So the TNEF Attachment Decoder plugin is not at fault: it seems to have to handle garbage (which it of course can't, but at least it doesn't crash). Somehow this attachment gets corrupted when it's retrieved and/or written to disk. Non-standard IMAP server? Evolution bug? I'm not sure how to proceed. Any suggestions?
re comment #1: would there be anyone with access to a Microsoft mail client (or a Microsoft mail server?) that happens to generate these TNEF attachments (named "winmail.dat")? That might help in creating some example files and/or debugging.
I've done some further analysis. The remote IMAP server seems to know about TNEF: it seems not to send a (base64 encoded) TNEF encoded file named "winmail.dat" but instead sends (unencoded) the files that are actually included in the TNEF attachment. CAMEL_DEBUG="imap,[...]" output is (heavily edited):

Thread b4c74b90 >
Folder get message '115'
folder info ->
Subject: [...]
To: Paul Bolle <pebolle@tiscali.nl>
Cc: (null)
mailing list: (null)
From: [...]
UID: 115
Flags: 10038
< b4c74b90 >
Setting part content type to 'text/plain; charset=us-ascii'
contentinfo type is 'text/plain; CHARSET=us-ascii'
Setting part content type to 'application/ms-tnef; name=winmail.dat'
contentinfo type is 'application/ms-tnef; NAME=winmail.dat'
Setting message content type to 'multipart/mixed; boundary="[...]"'
contentinfo type is 'multipart/MIXED'
Literal: -->date: Tue Feb 19 12:01:26 PST 2008
content-type: application/msword
filename: [...]
date-modified: Tue Feb 19 12:01:26 PST 2008
date-created: Tue Feb 19 12:01:26 PST 2008
content-length: 33280
[Valid MS Word Document]date: Tue Feb 19 12:01:26 PST 2008
content-type: application/msword
filename: [...]
date-modified: Tue Feb 19 12:01:26 PST 2008
date-created: Tue Feb 19 12:01:26 PST 2008
content-length: 242176
[Another MS Word Document]<--

Evolution treats this data (two headers and two MS Word documents) as a base64 encoded TNEF attachment. The TNEF decoder cannot handle it, which isn't very surprising. Please note that I cannot yet say whether Evolution could know that it is not being sent a base64 encoded attachment.
you probably need to check what the BODYSTRUCTURE response from the server was. If it declared the content to be base64 encoded, then when evolution requests the part, it better be base64 encoded or the server is broken :) either that or the message is broken...
Created attachment 105690: CAMEL_VERBOSE_DEBUG output

In response to comment #6, I've attached the output of

env CAMEL_VERBOSE_DEBUG=1 evolution 2>&1 | grep -v gtkhtml-WARNING

Output for (roughly) the same sequence of events as in comment #5 was selected, with the same sort of personal stuff edited out. Please note that two debugging streams seem to be mixed. I did not try to separate them more cleanly.
("APPLICATION" "MS-TNEF" ("NAME" "winmail.dat") NIL NIL "BASE64" 275913) the server says the ms-tnef part is base64 encoded, but it isn't. this /probably/ means that the raw message is broken... but it's /possible/ that the server is decoding it on the fly when giving us the ms-tnef part content. the only way to know for sure is probably to view the raw message content on the server.
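The mismatch described above (BODYSTRUCTURE declares "BASE64", but the fetched part is not base64) could be detected with a check along these lines. This is only a minimal Python sketch, not Camel code: the helper names `looks_like_base64` and `encoding_matches` are hypothetical, and it assumes the declared encoding and the raw part data have already been obtained from the server.

```python
import base64
import binascii
import re

# Hypothetical helpers for illustration; not part of Evolution or Camel.
_BASE64_LINE = re.compile(rb"[A-Za-z0-9+/]*={0,2}")


def looks_like_base64(payload: bytes) -> bool:
    """Return True if every non-empty line uses only the base64 alphabet
    and the joined lines decode without error."""
    lines = [line.strip() for line in payload.splitlines()]
    lines = [line for line in lines if line]
    if not lines:
        return False
    if not all(_BASE64_LINE.fullmatch(line) for line in lines):
        return False
    try:
        base64.b64decode(b"".join(lines), validate=True)
    except binascii.Error:
        return False
    return True


def encoding_matches(declared: str, payload: bytes) -> bool:
    """Compare the encoding declared in BODYSTRUCTURE with the fetched data."""
    if declared.upper() == "BASE64":
        return looks_like_base64(payload)
    return True  # other encodings are not checked in this sketch
```

Data such as the "date: ... content-type: application/msword ..." literal from comment #5 fails this check immediately, because the colons and spaces are outside the base64 alphabet.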
I'm going to close this as "NOTGNOME", since this must be a bug in Gmail's IMAP server. (I'll try to update the summary to reflect this.)

A couple of observations, just to archive them somewhere:
- the problem doesn't occur with Gmail over POP (the TNEF attachment is base64 encoded when downloaded over POP);
- the documents sent in the not-base64-encoded attachment aren't actually valid (comment #5 was incorrect): their lengths differ from the "content-length" in the "headers" and they seem to be corrupted (I couldn't open them in Abiword, Evince, etc.);
- Thunderbird downloads the entire message when "viewing source". This entire message is correct (i.e. has a base64 encoded TNEF attachment). When you save the attachment, you end up with the same problem as described in this bug;
- decoding TNEF attachments isn't actually a bad idea. TNEF attachments seem to be just containers for the real attachments and provide little benefit over more common ways to add attachments.

I'll have to prod Gmail to fix this on their IMAP server(s).
Thunderbird's behavior (downloading a correct entire message) got me digging through Evolution's code again. In camel/providers/imap/camel-imap-store.c (from evolution-data-server) I stumbled on the cute CAMEL_IMAP_BRAINDAMAGED environment variable. Restarting Evolution with that environment variable set (after blowing away the Gmail IMAP folder on disk) showed that the troublesome TNEF attachments can now be decoded (because Gmail sends the entire message with a correctly encoded TNEF attachment).

So, we have a workaround. Besides, there might be more to do:
- add Gmail to the IMAP hall of shame (in camel/providers/imap/camel-imap-store.c); or
- if a TNEF attachment from Gmail cannot be decoded, fetch the entire message and try to decode the attachment again (not sure if that's doable within a reasonably sized patch); or
- something in between.

Reopening for further discussion.
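To illustrate the "fetch the entire message and try again" idea, here is a rough Python sketch (not Camel code, and not a proposed patch). It assumes the raw full message has already been fetched as bytes; the function name `extract_tnef` is hypothetical. It uses Python's standard `email` package to locate the application/ms-tnef part, decode it according to its declared transfer encoding, and check for the TNEF signature (0x223E9F78, stored little-endian).

```python
import email
from email import policy

# TNEF streams start with the signature 0x223E9F78, stored little-endian.
TNEF_SIGNATURE = bytes.fromhex("789f3e22")


def extract_tnef(raw_message: bytes):
    """Return the decoded application/ms-tnef payload from a full message,
    or None if no part decodes to something carrying the TNEF signature.

    Hypothetical helper for illustration only.
    """
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() != "application/ms-tnef":
            continue
        # decode=True applies the part's Content-Transfer-Encoding (e.g. base64).
        data = part.get_payload(decode=True)
        if data and data.startswith(TNEF_SIGNATURE):
            return data
    return None
```

A client could call something like this only after decoding the separately fetched part has failed, so that the cost of downloading the whole message is paid only for the problematic messages.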
I would just vote for adding gmail to the imap hall of shame - a lot easier to implement ;)
(In reply to comment #11)
> I would just vote for adding gmail to the imap hall of shame - a lot easier to
> implement ;)

0) I once again rebuilt Evolution with the TNEF Attachment Decoder plugin enabled, using current Fedora Rawhide's evolution (evolution-2.28.0-2.fc12.i686). Turns out this is still an issue with Gmail IMAP. (I thought this was actually resolved at their end. I must have been mistaken.)
1) Can somebody confirm this?
2) What is needed to add Gmail to the hall of shame (I haven't looked at the relevant code in e-d-s in quite some time)?
I know this was a long time ago. These days I think of bug #761096 for IMAPx, where some Gmail messages could be downloaded corrupted when using multi-fetch. IMAPx certainly wasn't in use at the time of this bug report; I cannot speak about multi-fetch, though. It would be nice to retest with 3.20.1 or any later version. I would do that myself, but unfortunately I could not find any message with TNEF attachments to test with. I'm closing this, but feel free to retest and report back here.