Bug 200921 – charset override command for MailDisplay



Summary:	charset override command for MailDisplay


Status:	RESOLVED FIXED

Product:	evolution
Classification:	Applications
Component:	Mailer
Version:	unspecified
Hardware:	Other All

Importance:	Normal major
Target Milestone:	---
Assigned To:	Jeffrey Stedfast
QA Contact:	Evolution QA team

URL:
Whiteboard:	evolution[MIME]

Duplicates:	206448 208474 209759
Depends on:	202933
Blocks:

Reported:	2000-11-29 14:59 UTC by Dan Winship
Modified:	2013-09-10 13:59 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Dan Winship 2000-11-29 14:59:20 UTC

Apparently some mailers send text with explicit, but incorrect, charset
encodings.
http://lists.helixcode.com/archives/public/evolution-hackers/2000-November/001060.html

Also, we currently have no way to correctly display text sent
with no charset encoding where the real encoding is not iso-8859-1.

So, we want some sort of command to try an alternate encoding.

Emacs appears to have some ability to automagically detect the encoding of
a buffer, although it's not 100% reliable (it can't distinguish iso-8859-1
from iso-8859-2 for instance). We might want to look into that.

Comment 1 Not Zed 2001-07-17 01:36:51 UTC

Ok, I see 3 major scenarios with this:

1. the charset specified exists, we convert it to utf8 inside the
message.  When the user asks for a different charset, we convert it
from utf8 back to the charset specified in the message, then we
convert from that, treating it as the user's chosen charset, back to
utf8, for display.
2. the charset specified doesn't exist or we don't know about it.  In
this case we'll have the raw (unconverted) data stored in the message,
we just convert that to utf8 using the charset the user asks for.
3. the charset specified exists but the content contains characters
outside of that charset.  In this case the conversion to utf8 may be
incomplete, so conversion back may also be incomplete.  Not sure what
we do about this case.
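
Scenarios 1 and 2 can be sketched in a few lines of Python. This is only
an illustration of the round trip described above, not the Evolution/Camel
code; the function names override_charset and decode_raw are hypothetical,
and the sample text and charsets are assumptions chosen for the example.

```python
def override_charset(displayed: str, declared: str, chosen: str) -> str:
    """Scenario 1: text was already converted using the declared charset."""
    # Undo the original conversion to get the bytes that were in the message.
    raw = displayed.encode(declared)
    # Decode those same bytes as the charset the user asked for.
    return raw.decode(chosen)


def decode_raw(raw: bytes, chosen: str) -> str:
    """Scenario 2: the message still holds unconverted bytes."""
    # Undecodable bytes become U+FFFD instead of aborting the display.
    return raw.decode(chosen, errors="replace")


# Assumed example: Russian text sent as KOI8-R but labelled iso-8859-1.
raw_bytes = "Привет".encode("koi8_r")
mislabelled = raw_bytes.decode("iso-8859-1")  # what is shown at first
fixed = override_charset(mislabelled, "iso-8859-1", "koi8_r")
```

The round trip in override_charset only recovers the original bytes when
the first conversion was lossless, which is exactly what scenario 3 puts
in doubt.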

For conversion we can just use a stream-filter to write to, with the
right filters attached, as we write the stream to memory for
processing in the display code.  So it's probably not all that hard to
handle, and maybe add a charset parameter/member to the messagedisplay
api, or something like that.

It's probably not worth adding the automatic charset detector, although
I guess it could be used to prompt the user.

Comment 2 Dan Winship 2001-07-17 14:14:50 UTC

iconv() tells you when it either can't translate, or badly translates
characters. So the code can detect when case #3 is happening and bail
out and fall back to case #2.
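
A rough Python analogue of that idea, for illustration only: a strict
decode raises an error on bytes that are invalid in the declared charset,
much as iconv() reports an invalid input sequence, and the code then falls
back to another charset. The function name decode_with_fallback and the
choice of fallback charset are assumptions, not anything from Evolution.

```python
from typing import Tuple


def decode_with_fallback(raw: bytes, declared: str, fallback: str) -> Tuple[str, str]:
    """Return (text, charset actually used)."""
    try:
        # Strict mode: any byte sequence invalid in `declared` raises.
        return raw.decode(declared, errors="strict"), declared
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: content does not fit the declared charset (case #3).
        # LookupError: the declared charset is unknown to us (case #2).
        return raw.decode(fallback, errors="replace"), fallback


# Assumed example: 8-bit content labelled as plain ASCII.
text, used = decode_with_fallback(b"caf\xe9", "ascii", "iso-8859-1")
```

Returning the charset that was actually used lets the caller tell the user
that the declared charset was ignored.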

For #1, there's the possibility that foo -> UTF8 -> foo isn't an
identity, because there could be multiple ways of representing
the same characters (by decomposing accents or whatever). This
might be enough of an edge case that we don't care though.
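
To make the edge case concrete, here is a small Python sketch using
Unicode normalization forms. It shows that the same accented character
has two different Unicode representations and that only one of them
converts back to iso-8859-1. Whether any real conversion path in
Evolution produces the decomposed form is not established in this
report; the helper name encodes_to is hypothetical.

```python
import unicodedata

composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# Both normalize to the same text, but their UTF-8 bytes differ.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
bytes_differ = composed.encode("utf-8") != decomposed.encode("utf-8")


def encodes_to(text: str, charset: str) -> bool:
    """True if `text` can be converted back to `charset` without loss."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Only the composed form survives the trip back to iso-8859-1.
composed_ok = encodes_to(composed, "iso-8859-1")
decomposed_ok = encodes_to(decomposed, "iso-8859-1")
```

Normalizing to NFC before converting back would avoid this particular
mismatch.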

Comment 3 Jeffrey Stedfast 2001-08-06 17:56:09 UTC

*** bug 206448 has been marked as a duplicate of this bug. ***

Comment 4 Dan Winship 2001-09-14 19:12:32 UTC

*** bug 209759 has been marked as a duplicate of this bug. ***

Comment 5 Jeffrey Stedfast 2001-09-25 20:44:27 UTC

I've just implemented this in CVS - if the data wrapper's contents are
in raw form, then it assumes the charset is the user's preferred
charset encoding and so uses that when converting to UTF8 before
sending it off to GtkHTML.

Comment 6 Vlad 2001-10-07 12:21:35 UTC

No, the bug is not fixed. In raw form, all non-ASCII text is displayed as ??? (if the message is raw 8bit) or just as QP or base64!!!
Anyway, even if it weren't, it's VERY BACKWARDS to have to tweak the 'preferred encoding for sending mail' to affect the display of the current message!
And what about mails that contain attachments - they will also be visible (in raw form) in raw view!!!
And think about html mails - the user will see machine-generated HTML code instead of message text!
A 'Message charset' menu item is needed in the 'View' menu, whose subitems are radio menu items - the charset names known to Evo.

Comment 7 Jeffrey Stedfast 2001-10-07 21:40:47 UTC

the bug *is* fixed. This fix is not supposed to affect viewing in raw
form. Raw form is exactly that - RAW. This means it does NO charset
conversions, no base64 decoding, no QP decoding...NOTHING.

Comment 8 Vlad 2001-10-08 12:40:32 UTC

I understand that the fix didn't do anything about fixing raw view.
As I understand your comment, upon receipt of a message with an incorrectly specified charset, the user has to switch to raw message display, then go to options and select another charset, right?
If yes, it's completely broken at best. Add to that that raw view shows non-ASCII as ??? and that it doesn't decode content transfer encodings like base64 or QP (not decoding is OK for raw view, I agree). This means that the user won't be able to read anything at all. And what about replying?
Reopened the bug.

Comment 9 Jeffrey Stedfast 2001-10-08 18:37:53 UTC

You got it totally wrong. A data-wrapper in raw form is NOT anything
that the user sees. That is the backend representation of the data.

Comment 10 Vlad 2001-10-08 18:53:22 UTC

OK, please explain how the user should override the charset of the message then -
which menu item? Are any means for this available in 0.15.99? I have the Oct 4
snapshot and I didn't find any means for this.

Comment 11 Vlad 2001-10-11 16:08:59 UTC

Haven't received any answer to my question. Reopening.

Comment 12 Vlad 2001-10-11 16:58:28 UTC

Now actually reopening..

Comment 13 Jeffrey Stedfast 2001-10-11 17:05:48 UTC

*** bug 208474 has been marked as a duplicate of this bug. ***

Comment 14 Jeffrey Stedfast 2001-10-15 22:25:54 UTC

View -> Character Encoding -> *** charset ***

Comment 15 Vlad 2001-10-26 12:29:14 UTC

Reopening. It seems not to work at all. When one displays a mail, say in Russian, and selects a different Russian or even latin-1 charset from View->Encoding, NOTHING CHANGES. The text of the letter should change regardless of whether the charset is specified in the mail headers and regardless of the content transfer encoding. Tested with Evo-0.16 from current ximian gnome (not a snapshot).

Comment 16 Jeffrey Stedfast 2001-10-26 19:32:05 UTC

I discovered a bug recently (a week or 2 ago?) so it may not have
made it into the 0.16 release (yaneti was the one who let me know of
the bug).

anyways, you still can't override a message's charset if it was
transformed to UTF-8 without problems - and I don't see why you'd want
to anyway? If it was validly transformed to UTF-8 given the charset
that the message claimed it to be in, then changing it to another
charset is more likely to make it render badly than it is to render
correctly.

Comment 17 Vlad 2001-10-27 07:16:48 UTC

You are totally wrong - one needs the ability to override the charset of the message regardless of whether it was correctly transformed to utf8 or not. For example, when one writes a mail through a broken www interface, e.g. usa.net, the mail gets its charset marked as iso8859-1. So Russian is displayed as a mess of latin1 characters, most of them with umlauts. Properly implemented recoding would allow one to read such mail, and mature MUAs like Mozilla and Outlook Express have this functionality implemented and working correctly.
Reopening..

Comment 18 Jeffrey Stedfast 2001-10-27 23:31:23 UTC

fine fine, implemented.

for future reference, can you hit return when you're typing in
bugzilla? otherwise the text scrolls forever to the right and I have
to manually scroll to read what you wrote which is a royal pain for
me.

Comment 19 Vlad 2001-10-30 11:38:18 UTC

Thank you very much for fixing it (though I didn't check it yet)!!
I'm very sorry for bad formatting of my comments - I'm using lynx
and they look pretty reasonable in lynx. I will format my comments
properly next time.

Comment 20 Jeffrey Stedfast 2001-10-30 20:09:09 UTC

much appreciated :-)