Bug 582678 - Evolution could do a better job when displaying characters that are in windows-1252 but not iso-8859-1
Status: RESOLVED OBSOLETE
Product: evolution
Classification: Applications
Component: Mailer
Version: 2.26.x (obsolete)
Hardware: Other All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: evolution-mail-maintainers
QA Contact: Evolution QA team
Depends on:
Blocks:
 
 
Reported: 2009-05-14 21:26 UTC by Mike Crowe
Modified: 2021-05-19 11:09 UTC
See Also:
GNOME target: ---
GNOME version: 2.25/2.26



Description Mike Crowe 2009-05-14 21:26:30 UTC
Please describe the problem:
Ubuntu 9.04. evolution 2.26.1-0ubuntu1

I'm using the Exchange connector but I doubt that this affects Evolution's behaviour.

I received a multi-part HTML email from another user of the same Exchange server. Despite the HTML part starting with

 Content-Type: text/html; charset="iso-8859-1"
 Content-Transfer-Encoding: quoted-printable

it contained the quoted-printable sequence "=92" (byte 0x92), which is the windows-1252 code for a right curly quote.

This appeared in Evolution as a "character not available in font" box containing "0092".

The same problem occurred with a hand-crafted plain text email.

This is clearly the fault of Exchange or Outlook for declaring iso-8859-1 and then using windows-1252 characters. Even so, it would seem to be quite easy for Evolution to work around this and show the expected glyph. Bytes in the range 0x80 to 0x9f are control characters in iso-8859-1, so they could be automatically mapped to the Unicode code points that windows-1252 assigns to them. See http://en.wikipedia.org/wiki/Windows-1252#Codepage_layout .
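The mapping suggested above could look roughly like the following Python sketch. It is not Evolution code; the names `C1_TO_CP1252` and `decode_latin1_lenient` are made up for illustration. It assumes the message body has already been decoded from quoted-printable into raw bytes, and it relies on Python's `cp1252` codec, which leaves five bytes in the range undefined (those stay as control characters here).

```python
# Build a translation table from the C1 control code points (U+0080..U+009F)
# to the characters windows-1252 assigns to the same byte values.
C1_TO_CP1252 = {}
for byte in range(0x80, 0xA0):
    try:
        C1_TO_CP1252[byte] = bytes([byte]).decode("cp1252")
    except UnicodeDecodeError:
        # Python's cp1252 codec has no mapping for this byte;
        # leave it as the iso-8859-1 control character.
        pass


def decode_latin1_lenient(data: bytes) -> str:
    """Decode bytes labelled iso-8859-1, remapping 0x80-0x9f via windows-1252."""
    # iso-8859-1 maps every byte to the code point of the same value,
    # so the C1 range can be fixed up afterwards with str.translate.
    return data.decode("iso-8859-1").translate(C1_TO_CP1252)


text = decode_latin1_lenient(b"There\x92s a strange character here.")
print(text)
```

Bytes outside 0x80 to 0x9f are untouched, so correctly labelled iso-8859-1 mail decodes exactly as before.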

Steps to reproduce:
Here's how I reproduced without involving Exchange or Outlook:

1. Enter the following in an SMTP session (changing the addresses as appropriate):

mail from: Me <me@here>
rcpt to: Me <me@there>
data
From: Me <me@here>
To: Me <me@there>
Subject: 1252 test
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

There=92s a strange character here.
.


2. View the email in Evolution.
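The decoding side of these steps can also be checked without an SMTP server or Evolution. The sketch below feeds the same test message to Python's standard `email` module and decodes the body twice, once with the declared charset and once as windows-1252. It only illustrates what the bytes mean under each charset; it says nothing about how Evolution itself decodes mail.

```python
import email

# The test message from step 1, as raw bytes.
RAW = (
    b"From: Me <me@here>\r\n"
    b"To: Me <me@there>\r\n"
    b"Subject: 1252 test\r\n"
    b'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"There=92s a strange character here.\r\n"
)

msg = email.message_from_bytes(RAW)
payload = msg.get_payload(decode=True)   # body bytes with quoted-printable undone
declared = msg.get_content_charset()     # charset named in the Content-Type header

as_declared = payload.decode(declared)   # what the header asks for
as_cp1252 = payload.decode("cp1252")     # what the sender apparently meant

# Show the code point of the sixth character under each interpretation.
print(declared, hex(ord(as_declared[5])), hex(ord(as_cp1252[5])))
```

Under the declared charset the sixth character is the control character U+0092, which matches the "0092" box; under windows-1252 it is U+2019.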



Actual results:
The character represented by =92 appears as a box containing 0092.

Expected results:
The character represented by =92 appears as the right curly quote character (Unicode code point U+2019).

Does this happen every time?
Yes.

Other information:
All characters in the range 0x80 to 0x9f behave the same and would benefit from the same workaround.

This bug was originally entered at https://bugs.launchpad.net/ubuntu/+source/gtkhtml3.14/+bug/373325
Comment 1 Paul Bolle 2009-08-26 19:59:58 UTC
0) Confirming. As suggested in comment #0, the Exchange Connector angle is irrelevant.

1) The Wikipedia page contains a link to (a version of) the HTML 5 Draft Recommendation, which includes this interesting passage:

"When a user agent would otherwise use an encoding given in the first column of the following table to either convert content to Unicode characters or convert Unicode characters to bytes, it must instead use the encoding given in the cell in the second column of the same row. When a byte or sequence of bytes is treated differently due to this encoding aliasing, it is said to have been misinterpreted for compatibility.

Character encoding overrides
Input encoding  Replacement encoding  References
[...]
ISO-8859-1      windows-1252          [RFC1345] [WIN1252]
[...]

Note: The requirement to treat certain encodings as other encodings according to the table above is a willful violation of the W3C Character Model specification, motivated by a desire for compatibility with legacy content."

2) Haven't looked at any code yet. Can't say whether evolution can be said to "convert" as described in that phrase.

3) Anyway, one question that comes up is whether evolution should treat iso-8859-1 as windows-1252:
- for all characters; or
- for characters in the range 0x80 to 0x9f only.
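As a rough way to compare the two options in 3), the Python sketch below implements both and runs them over every possible byte value. The function names and the one-row `OVERRIDES` table are illustrative only (the table holds just the row quoted above), and undefined bytes are replaced with U+FFFD, which is a choice of this sketch rather than anything taken from the draft.

```python
# Only the single row quoted from the draft's override table.
OVERRIDES = {"iso-8859-1": "windows-1252"}


def decode_all(data: bytes, label: str) -> str:
    """Option 1: swap the whole encoding according to the override table."""
    name = label.strip().lower()
    return data.decode(OVERRIDES.get(name, name), errors="replace")


def decode_c1_only(data: bytes) -> str:
    """Option 2: decode as iso-8859-1, remapping only bytes 0x80-0x9f."""
    return "".join(
        bytes([b]).decode("cp1252", errors="replace") if 0x80 <= b <= 0x9F else chr(b)
        for b in data
    )


every_byte = bytes(range(256))
whole = decode_all(every_byte, "ISO-8859-1")
partial = decode_c1_only(every_byte)

# Byte values for which the two options disagree.
disagree = [b for b in range(256) if whole[b] != partial[b]]
# Byte values Python's windows-1252 codec cannot map at all.
undefined = [b for b in range(256) if whole[b] == "\ufffd"]

print(disagree, [hex(b) for b in undefined])
```

The two options give the same result for every byte, because windows-1252 and iso-8859-1 are identical outside 0x80 to 0x9f. The only real decision is what to do with the few bytes in that range that Python's windows-1252 codec leaves undefined.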
Comment 2 André Klapper 2021-05-19 11:09:37 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. 
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/Community/GettingInTouch/BugReportingGuidelines
and create a new enhancement request ticket at
  https://gitlab.gnome.org/GNOME/evolution/-/issues/

Thank you for your understanding and your help.