Bug 302991 – RFC2047 subject decoding of outlook emails (?iso-8859-...)



Summary:	RFC2047 subject decoding of outlook emails (?iso-8859-...)


Status:	RESOLVED FIXED

Product:	evolution
Classification:	Applications
Component:	Mailer
Version:	2.2.x (obsolete)
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	parthasarathi susarla
QA Contact:	Evolution QA team

URL:
Whiteboard:

Duplicates:	317083 325290 372986 (view as bug list)
Depends on:
Blocks:

Reported:	2005-05-04 13:37 UTC by Sebastien Bacher
Modified:	2012-01-31 12:49 UTC

See Also:
GNOME target:	---
GNOME version:	Unversioned Enhancement

Attachments
Screenshot of the "From" field in the message list. (47.89 KB, image/png) 2008-04-06 17:19 UTC, ljuksi
evolution bug i message list (273.42 KB, image/png) 2008-09-12 14:51 UTC, Mikolaj Mackowiak

Description Sebastien Bacher 2005-05-04 13:37:52 UTC

This bug has been opened here: https://bugzilla.ubuntu.com/9078

"evolution does not display € the right way in the listview, they are displayed
the right way in the mail display part.
...
from line looks like this:
From: =?iso-8859-1?Q?=80?= <emal@example.com> "

from IRC:

<kjetilho> the correct name is CP1252, I think.
<kjetilho> yep, http://www.microsoft.com/globaldev/reference/sbcs/1252.htm
...
<seb128> kjetilho: is evolution supposed to handle that correctly? ie: is that a
bug?
<kjetilho> seb128: it's not a bug in Evolution
<seb128> hum, there is http://bugzilla.gnome.org/show_bug.cgi?id=259292 about this
<kjetilho> but of course Evolution could incorporate a hack to support it
<seb128> kjetilho: apparently it has a hack somewhere, since the mail is
correct, only the mail list displays it bugged
...
<kjetilho> seb128: the mail is _not_ correct
<kjetilho> it claims to be iso-8859-1, but it's not.
<seb128> kjetilho: how come then it's displayed correctly by the preview pane?
<NotZed> there is a hack in the display code to check for windows charsets and
remap them to the correct one
<seb128> k
<seb128> that explains it
<kjetilho> NotZed: heh.  inconsistent handling _is_ a bug ;)

Comment 1 Not Zed 2005-07-29 08:54:21 UTC

ask the sender to fix their mailer

Comment 2 Sebastien Bacher 2005-08-03 19:22:22 UTC

Try asking Microsoft to fix their mailers: that will take some time, and some
users still run outdated versions. For the moment it is Evolution that seems
broken to users, since it handles the same subject differently in two
different places, and Mozilla, for example, has no issue with these mails. As
described in the original comment, it would be nice to work around it.

Comment 3 André Klapper 2006-01-01 14:36:35 UTC

*** Bug 325290 has been marked as a duplicate of this bug. ***

Comment 4 André Klapper 2006-01-01 14:37:35 UTC

rephrasing subject

Comment 5 André Klapper 2006-01-03 13:17:47 UTC

*** Bug 317083 has been marked as a duplicate of this bug. ***

Comment 6 Kjetil Torgrim Homme 2006-01-03 14:14:52 UTC

the IRC transcript quotes rather damning comments from me, but I don't actually think a workaround for this would be a horrible thing.  I copy a little from my comments to bug 317083:

"the bug is about Evolution displaying the octet value 0x80 as the glyph

  |00|
  |80|

rather than the euro sign (€), even though the charset is declared to be ISO
8859-1.  in other words, the broken client sends out characters from the
CP-1252 coded character set, but claims the characters are in ISO 8859-1.  the
RFCs don't have anything to say about that...

this bug doesn't really touch on RFC 2047 decoding at all.  it's just a
request to be a bit lenient and special case the octet value 0x80, so that it
maps to the Unicode U+20AC.  personally, I don't see a great harm in that."

I don't know if there are other characters in CP-1252 which could do with similar remapping.  I'm pretty sure the euro character is the most noticeable one.

BTW, the subject for this bug should be changed back so that it is less misleading.
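
The remapping discussed in this comment can be sketched in a few lines. This is only an illustration in Python, not Evolution's code: the function name `decode_declared_latin1` and the alias set are assumptions made up for the example. It treats any octet in the range 0x80-0x9F (control codes in ISO 8859-1, printable characters in CP-1252) as a hint that the sender really meant CP-1252.

```python
# Hypothetical helper, not taken from Evolution: guess CP-1252 when text
# declared as ISO 8859-1 contains octets from the C1 control range.
LATIN1_ALIASES = {"iso-8859-1", "iso8859-1", "latin-1", "latin1"}


def decode_declared_latin1(raw: bytes, declared_charset: str) -> str:
    charset = declared_charset.lower()
    if charset in LATIN1_ALIASES and any(0x80 <= b <= 0x9F for b in raw):
        try:
            # 0x80 becomes U+20AC (euro sign) under CP-1252.
            return raw.decode("cp1252")
        except UnicodeDecodeError:
            # A few octets (e.g. 0x81) are undefined in CP-1252;
            # fall back to the declared charset in that case.
            pass
    return raw.decode(charset, errors="replace")


print(decode_declared_latin1(b"\x80", "iso-8859-1"))
```

Remapping the whole 0x80-0x9F range rather than only 0x80 is a design choice of this sketch; it also covers the CP-1252 curly quotes and dashes, which may or may not be wanted.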

Comment 7 oa 2006-01-03 15:07:36 UTC

At least bug 325290 is about a slightly different topic of what is the proper delimitation of the =??Q??= encoding. RFC 2047 states that it should always be separated by white space from other header words, but in practice a lot of clients, Outlook in particular, don't respect this when encoding the headers. It's a case of "be strict when you encode, and lenient when you decode".

My interpretation of bug 317083 is that it's not about the charset either (at least primarily), but the same "what's an encoded-word" decision. Quote from the RFC:

IMPORTANT: 'encoded-word's are designed to be recognized as 'atom's
by an RFC 822 parser. As a consequence, unencoded white space
characters (such as SPACE and HTAB) are FORBIDDEN within an
'encoded-word'. For example, the character sequence

=?iso-8859-1?q?this is some text?=

would be parsed as four 'atom's, rather than as a single 'atom' (by
an RFC 822 parser) or 'encoded-word' (by a parser which understands
'encoded-words'). The correct way to encode the string "this is some
text" is to encode the SPACE characters as well, e.g.

=?iso-8859-1?q?this=20is=20some=20text?=

The characters which may appear in 'encoded-text' are further
restricted by the rules in section 5.

I can't find a good quote about the "whitespace must precede =? and trail ?=" part, but it's implied by how whitespace should be treated when two encoded-words follow each other. Anyhow - in most of the cases quoted in the linked bugs, the encoded-words violate the quoted paragraphs above, but as they are commonly occurring cases in real-world email, Evolution should treat them "as they were meant".
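
As a rough illustration of "lenient when you decode", here is a minimal Python sketch of a decoder that accepts encoded-words without surrounding whitespace and with raw spaces in the payload. The names `decode_header_lenient`, `_decode_q` and `_ENCODED_WORD` are invented for this example, and the code is not how Evolution or Camel implements it.

```python
import base64
import re

# Hypothetical lenient pattern: no surrounding whitespace is required and the
# payload may contain raw spaces. Trailing whitespace is consumed only when
# another "=?" follows, because whitespace between encoded-words is dropped.
_ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?]*)\?=(?:\s+(?==\?))?")


def _decode_q(payload: str) -> bytes:
    """Undo Q encoding: "_" is a space, "=XX" is one hex-encoded octet."""
    data = payload.replace("_", " ").encode("ascii")
    return re.sub(rb"=([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), data)


def decode_header_lenient(value: str) -> str:
    def _replace(match: re.Match) -> str:
        charset, encoding, payload = match.group(1, 2, 3)
        try:
            if encoding.upper() == "B":
                raw = base64.b64decode(payload)
            else:
                raw = _decode_q(payload)
            return raw.decode(charset)
        except (LookupError, ValueError):
            # Unknown charset or undecodable payload: keep the raw text.
            return match.group(0)

    return _ENCODED_WORD.sub(_replace, value)


print(decode_header_lenient("=?iso-8859-1?q?this is some text?="))
```

A strict decoder would leave the input above untouched, since it is four atoms rather than one encoded-word; the sketch decodes it anyway, which is the trade-off under discussion.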

Comment 8 parthasarathi susarla 2006-01-03 16:05:54 UTC

I really think this should be a WONTFIX.

Hmm... users would *not* really like that though.

Comment 9 oa 2006-01-03 17:39:52 UTC

WONTFIX would be the simple choice, yes, but what is most important, standards compliance, security, or interoperability? I would say that without the latter, the former two are meaningless.

Comment 10 Kjetil Torgrim Homme 2006-01-03 22:30:51 UTC

oa, which bug are you talking about?

anyway, interoperability is defined by standards compliance.  only when the standard is incompletely specified, leading to ambiguities, should you consider mimicking other implementations to enhance interoperability.

Comment 11 oa 2006-01-04 08:39:11 UTC

My apologies - my comment is relevant for Bug 325290 and perhaps Bug 317083, but not this one, as those two are NOT duplicates of this.

A comment relevant to this bug: it seems that for some encodings, Evolution does not display the properly decoded form in the listview although it does in the message display. Two From: headers:

From: Aapo =?iso-8859-1?Q?Kyr=F6l=E4?=  <email.deleted>
From: =?ISO-8859-1?Q?Aapo_Kyr=F6l=E4?= <email.deleted>

Both shown correctly in the message view, but the first one is not decoded in the listview. In this case, there certainly is no problem with the character sets.

The one displaying the error is much older, first read with an older version of Evo. Should I clear some cache for this test? Which one and how?

Comment 12 Jeffrey Stedfast 2006-01-04 16:35:00 UTC

yes, you'll prob need to clear your cache. what you need to do is delete .ev-summary file for the mailbox in question.

Comment 13 oa 2006-01-05 15:39:11 UTC

I deleted all cache directories and summary files under .evolution/mail/imap4, but the problem persists. Messages containing the first From header display incorrectly (in listview only), messages with the second header are OK. Only my four precreated (empty) local folders have .ev-summary.

Comment 14 Kjetil Torgrim Homme 2006-01-05 16:07:00 UTC

(In reply to comment #13)
> I deleted all cache directories and summary files under .evolution/mail/imap4,
> but the problem persists. Messages containing the first From header display
> incorrectly (in listview only), messages with the second header are OK. Only my
> four precreated (empty) local folders have .ev-summary.

both headers work perfectly here (2.4.1), both in listview and preview.  I sent the messages by telnet to my SMTP server, with the values:

From: Aapo =?iso-8859-1?Q?Kyr=F6l=E4?=  <email@deleted.com>
From: =?ISO-8859-1?Q?Aapo_Kyr=F6l=E4?= <email@deleted.com>
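
For anyone wanting to check the two test headers outside Evolution, a small Python sketch using the standard library's `email.header` module follows. The helper name `displayed_name` is made up for this example; it only shows what a reference decoder yields for both forms and says nothing about Evolution's listview code.

```python
from email.header import decode_header, make_header


def displayed_name(header_value: str) -> str:
    """Decode any RFC 2047 encoded-words in a header value to Unicode."""
    return str(make_header(decode_header(header_value)))


# The two From: values from this comment (address as given in comment 11).
partially_encoded = "Aapo =?iso-8859-1?Q?Kyr=F6l=E4?=  <email.deleted>"
fully_encoded = "=?ISO-8859-1?Q?Aapo_Kyr=F6l=E4?= <email.deleted>"

for value in (partially_encoded, fully_encoded):
    print(displayed_name(value))
```

Both values should show the same name once decoded, so a difference between listview and preview points at the client rather than at the headers.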

Comment 15 Jeffrey Stedfast 2006-01-05 18:04:14 UTC

you probably deleted the wrong cache, imap4 is experimental - you're probably using imap instead.

Comment 16 oa 2006-01-06 16:45:48 UTC

Been using imap4 since Evo 2.4's release. I verified, there were no caches left anywhere in the evo directory tree (which is why I wiped all the caches, and not just those of the folder in question).

Comment 17 oa 2006-01-06 16:53:49 UTC

imap backend doesn't suffer from the same problem, based on a test of adding the same account a second time. I was under the impression that imap4 was considered the preferred backend type, though. Perhaps I should switch back...

Comment 18 Jeffrey Stedfast 2006-01-06 16:57:19 UTC

no, imap4 is experimental and is not the preferred backend. probably gonna be removed from the tree since I left the project nearly a year ago (why am I even looking at bugzilla? I have no idea... :p)

Comment 19 Jeffrey Stedfast 2007-12-26 00:27:15 UTC

fixed in svn

Comment 20 André Klapper 2008-02-29 06:52:18 UTC

*** Bug 519323 has been marked as a duplicate of this bug. ***

Comment 21 ljuksi 2008-04-06 17:19:08 UTC

Not fixed yet in Evo 2.22

Comment 22 ljuksi 2008-04-06 17:19:52 UTC

Created attachment 108722
Screenshot of the "From" field in the message list.

Comment 23 Leonardo Ferreira Fontenelle 2008-04-06 18:35:00 UTC

Same here.

Comment 24 Jeffrey Stedfast 2008-04-06 20:27:03 UTC

it's fixed in 2.23

Comment 25 Mikolaj Mackowiak 2008-09-12 14:51:58 UTC

Created attachment 118591
evolution bug i message list

it isn't fixed in 2.23.92

Comment 26 Kjetil Torgrim Homme 2008-09-12 15:59:01 UTC

Mikolaj, you really need to include the raw header (copied from View Source) for developers to make a judgement of whether Evolution should add another exception to the code.

I think it is important to stress that the bug is *not* in Evolution, although Evolution tries to undo some forms of braindamage in other e-mail clients.

Comment 27 Jeffrey Stedfast 2008-09-12 16:22:20 UTC

his examples are also not of the subject header, they are address headers.

anyways, Mikolaj is running into the "gmail doesn't properly encode address headers" bug which is filed as bug #536457

Comment 28 André Klapper 2012-01-31 12:49:01 UTC

*** Bug 372986 has been marked as a duplicate of this bug. ***