GNOME Bugzilla – Bug 108557
Encoded-words are not decoded when they contain '.', ':', ...
Last modified: 2004-12-22 21:47:04 UTC
Pan 0.13.91. When an encoded-word (RFC 2047) contains characters such as '.', ':' or ',' (possibly others), it is not decoded. Sample:

Subject: 8bits et =?iso-8859-1?Q?=E9=E9=E9:=E0=E0=E0?=

(Some newsreaders, such as OE, encode the "Re:" in the subject.)
Reassigning to fejj, gmime's author. Fejj: this is a duplicate of bug #102361. I haven't closed it, so you can decide whether you want to support invalid RFC 2047 encodings or not.
It is not always invalid; it depends on the context. For an unstructured text field like Subject, RFC 2047 allows these characters in an encoded-word:

Subject: =?iso-8859-1?Q?=E9=E9=E9:=E0=E0=E0?=
Subject: =?us-ascii?Q?"Patrick"?=

are *valid* RFC 2047 encodings. But, for example:

From: =?us-ascii?Q?"Patrick"?=

is not valid.

----
RFC 2047:

2 - Syntax
[...]
encoded-text = 1*<Any printable ASCII character other than "?" or SPACE>
               ; (but see "Use of encoded-words in message
               ; headers", section 5)

4 - Encodings
[...]
For example, an 'encoded-word' in a 'phrase' preceding an address in a From header field may not contain any of the "specials" defined in RFC 822. Finally, certain other characters are disallowed in some contexts, to ensure reliability for messages that pass through internetwork mail gateways.

The "B" encoding automatically meets these requirements. The "Q" encoding allows a wide range of printable characters to be used in non-critical locations in the message header (e.g., Subject), with fewer characters available for use in other locations.
----

[Sorry for the duplicate report]
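
To make the reading above concrete, here is a minimal Python sketch of a decoder that takes the section 2 grammar at face value for unstructured fields. It is not Pan or gmime code; `ENCODED_WORD` and `decode_unstructured` are names made up for illustration, and the sketch ignores details such as dropping whitespace between adjacent encoded-words or handling unknown charsets.

```python
import base64
import quopri
import re

# encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
# encoded-text = any printable ASCII (0x21-0x7E) other than "?" (and SPACE).
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([QqBb])\?([!->@-~]+)\?=")


def decode_unstructured(value: str) -> str:
    """Decode encoded-words in an unstructured header value (e.g. Subject)."""

    def _decode(match: re.Match) -> str:
        charset, encoding, text = match.groups()
        raw = text.encode("ascii")
        if encoding.lower() == "q":
            # header=True also maps "_" to SPACE, as the "Q" encoding requires.
            data = quopri.decodestring(raw, header=True)
        else:
            data = base64.b64decode(raw)
        return data.decode(charset, errors="replace")

    return ENCODED_WORD.sub(_decode, value)


if __name__ == "__main__":
    print(decode_unstructured("8bits et =?iso-8859-1?Q?=E9=E9=E9:=E0=E0=E0?="))
    print(decode_unstructured('=?us-ascii?Q?"Patrick"?='))
```

With this reading, the ':' and '"' characters inside the encoded-text are simply carried through to the decoded Subject.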
they are never valid. you are just misinterpreting the rfc (it's easy to do...). rfc2047 encoded-word tokens MUST ALWAYS be parsable as rfc822 atoms. if you look up the definition of an rfc822 atom, you'll find that an atom disallows ':' etc.
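
The atom argument can be checked mechanically. The following Python sketch tests whether a token is an RFC 822 atom, i.e. one or more ASCII characters excluding the specials, SPACE and control characters. `RFC822_SPECIALS` and `is_rfc822_atom` are hypothetical names used only for this illustration; this is not gmime's parser.

```python
# RFC 822 specials: ( ) < > @ , ; : \ " . [ ]
RFC822_SPECIALS = set('()<>@,;:\\".[]')


def is_rfc822_atom(token: str) -> bool:
    """Return True if token is 1*<any CHAR except specials, SPACE and CTLs>."""
    return bool(token) and all(
        # 0x21-0x7E excludes SPACE (0x20), CTLs (0x00-0x1F) and DEL (0x7F).
        0x20 < ord(ch) < 0x7F and ch not in RFC822_SPECIALS
        for ch in token
    )


if __name__ == "__main__":
    for word in (
        "=?iso-8859-1?Q?=E9=E9=E9?=",
        "=?iso-8859-1?Q?=E9=E9=E9:=E0=E0=E0?=",
        '=?us-ascii?Q?"Patrick"?=',
    ):
        print(word, is_rfc822_atom(word))
```

Under this check the first token is an atom, while the two containing ':' or '"' are not.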
g_mime_utils_header_decode_text() replaces g_mime_8bit_header_decode() in the gmime-2.1.x development series (aka cvs trunk) and now allows for this. it is much less strict than the old code. it still does not accept stuff like:

=?charset?q?foo?=, bar

ie, gmime still wants lwsp around encoded-words. might eventually try working around that too tho. besides, I think that is covered in another similar bug report.
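
A rough Python sketch of what "wants lwsp around encoded-words" means in practice: split the value on whitespace and decode only those tokens that are, in their entirety, an encoded-word. This is an assumption-laden illustration, not gmime's implementation; `WORD` and `decode_lwsp_delimited` are invented names, and only the "Q" encoding is handled.

```python
import quopri
import re

# A whole token must match; trailing punctuation such as "," breaks the match.
WORD = re.compile(r"=\?([^?\s]+)\?[Qq]\?([!->@-~]+)\?=")


def decode_lwsp_delimited(value: str) -> str:
    """Decode only encoded-words that are delimited by whitespace."""
    out = []
    # The capturing group keeps the whitespace runs so they can be re-joined.
    for token in re.split(r"(\s+)", value):
        match = WORD.fullmatch(token)
        if match:
            charset, text = match.groups()
            raw = quopri.decodestring(text.encode("ascii"), header=True)
            token = raw.decode(charset, errors="replace")
        out.append(token)
    return "".join(out)


if __name__ == "__main__":
    print(decode_lwsp_delimited("=?iso-8859-1?q?foo?= bar"))   # decoded
    print(decode_lwsp_delimited("=?iso-8859-1?q?foo?=, bar"))  # left as-is
```

The second call returns its input unchanged because the token "=?iso-8859-1?q?foo?=," is not a bare encoded-word.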