GNOME Bugzilla – Bug 623838
Evolution does not use the character encoding that is hinted at in the subject line
Last modified: 2021-05-19 11:12:07 UTC
(Forwarded from Launchpad Bug #596849: https://bugs.launchpad.net/ubuntu/+source/evolution/+bug/596849)

When receiving e-mails that do not correctly declare the character encoding used, Evolution does not display the e-mail correctly, even when the character encoding is revealed in the subject line.

From looking at the raw e-mail, I presume the correct way to specify the character encoding is the "Content-Type" MIME header. However, if the subject of the e-mail is encoded with a specific character encoding (using an Encoded-Word, see http://en.wikipedia.org/wiki/MIME#Encoded-Word), wouldn't it be safe to assume that the e-mail body - if it does not specifically specify a character encoding - uses the same character encoding as the subject line? Are there any examples where this might not be the case?

Example: I receive an e-mail with the following content:

Subject: =?iso-8859-1?Q?din_bestilling_-_ordre_nr_123456?=
Sender: "=?iso-8859-1?Q?someone@somewhere=2Edk?=" <someone@somewhere.dk>
From: "=?iso-8859-1?Q?someone@somewhere=2Edk?=" <someone@somewhere.dk>
Date: Mon, 21 Jun 2010 00:47:48 +0200
To: "=?iso-8859-1?Q?mymailaddress@mail=2Ecom?=" <mymailaddress@mail.com>
X-Priority: 3
X-MSMail-Priority: Normal
MIME-Version: 1.0
X-Mailer: JMail 4.4 by Dimac
Content-Type: text/html
Content-Transfer-Encoding: 8bit

The "Content-Type" header does not state that the character encoding used is ISO-8859-1, but the subject line uses an Encoded-Word to specify this information (and Evolution correctly displays the subject line).
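To make the "charset hint in the subject line" concrete, here is a minimal sketch in Python (not Evolution's code) that lists the charsets named by the Encoded-Words in a header value. It relies on the standard library's `email.header.decode_header`; the helper name `header_charsets` is made up for this illustration.

```python
from email.header import decode_header
from typing import List


def header_charsets(raw_header: str) -> List[str]:
    """Return the charsets named by Encoded-Words in one header value.

    Hypothetical helper for illustration; order of first appearance is kept.
    """
    charsets: List[str] = []
    for _payload, charset in decode_header(raw_header):
        # charset is None for the parts that were not Encoded-Words
        if charset and charset.lower() not in charsets:
            charsets.append(charset.lower())
    return charsets


subject = "=?iso-8859-1?Q?din_bestilling_-_ordre_nr_123456?="
print(header_charsets(subject))
```

For the Subject header in the example above, this yields a single hint, iso-8859-1. A header value may contain several Encoded-Words with different charsets, in which case the list has more than one entry.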
> wouldn't it be safe to assume that the e-mail body - if it does not specifically specify a character encoding - uses the same character encoding as the subject line?

Unfortunately, it is unsafe to make that assumption.

> Are there any examples where this might not be the case?

Quite a few :-(

A lot of mailers will use UTF-8 in the headers, for example, and the user's locale charset in the body. Japanese mailers, as another example, will often use iso-2022-jp in the headers but Shift_JIS in the body. Chinese, Korean, Russian, etc. mailers all do similar things.

Evolution used to (at least back when I was the maintainer) have a menu option to override the charset used to display the message.

The way I remember Evolution working was, in order of preference:

1. For text/html mails, use the charset from the Content-Type property embedded in the HTML content
2. Use the charset from the Content-Type header
3. If the content is valid UTF-8, use UTF-8
4. Use the user's locale charset

Maybe the current maintainers can try adding a fallback that attempts to use whatever charsets it finds in the headers (but be warned: not only might each header use a different charset, but each encoded-word token in each header might also use a different charset).
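The order of preference described above, plus the suggested header-charset fallback, could be sketched roughly as follows in Python. This is an illustration only, not Evolution's implementation: the function and parameter names are hypothetical, the `<meta>` detection is a deliberately crude regular expression, and the position of the header fallback (after the UTF-8 check, before the locale charset) is an assumption, since the comment does not say where it should go.

```python
import locale
import re
from typing import Iterable, List, Optional, Tuple

# Crude, illustrative match for a charset declared in an HTML <meta> tag.
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)


def _try_decode(body: bytes, charset: Optional[str]) -> Optional[str]:
    """Decode body with charset; return None if it is unknown or does not fit."""
    if not charset:
        return None
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        return None


def decode_body(
    body: bytes,
    is_html: bool,
    content_type_charset: Optional[str] = None,
    header_charsets: Iterable[str] = (),
    locale_charset: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (text, charset used), trying candidates in order of preference."""
    candidates: List[Optional[str]] = []
    if is_html:
        match = _META_CHARSET.search(body)
        if match:
            candidates.append(match.group(1).decode("ascii"))  # 1. HTML meta
    candidates.append(content_type_charset)                    # 2. Content-Type
    candidates.append("utf-8")                                 # 3. valid UTF-8
    candidates.extend(header_charsets)       # assumed slot for header hints
    candidates.append(locale_charset or locale.getpreferredencoding(False))  # 4.
    for charset in candidates:
        text = _try_decode(body, charset)
        if text is not None:
            return text, charset
    # Nothing fit: decode as UTF-8 with replacement characters.
    return body.decode("utf-8", errors="replace"), "utf-8"
```

The header hints are tried before the locale charset because single-byte locale charsets such as ISO-8859-1 accept any byte sequence, so a fallback placed after them would never be reached. The same property means a wrong header hint can also "succeed" silently, which is the risk the comment warns about.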
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a long time (resources are unfortunately quite limited, so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/Community/GettingInTouch/BugReportingGuidelines and create a new enhancement request ticket at https://gitlab.gnome.org/GNOME/evolution/-/issues/

Thank you for your understanding and your help.