GNOME Bugzilla – Bug 629235
get_headers function changes FROM field of mail message
Last modified: 2010-11-21 17:06:50 UTC
The get_headers function changes the 'FROM' field of a mail message. The get_header function does not change the 'FROM' field.

[ CASE #1 ] get_headers
INPUT)  From: =?EUC-KR?B?xtvHw726xbg=?= <backup@3355music.com>
OUTPUT) From: =?iso-2022-kr?q?=1B=24=29C=0EF=5BGC==3AE8=0F?= <backup@3355music.com>

[ CASE #2 ] get_header
INPUT)  From: =?EUC-KR?B?xtvHw726xbg=?= <backup@3355music.com>
OUTPUT) From: =?EUC-KR?B?xtvHw726xbg=?= <backup@3355music.com>

============================================

CASE #2 is the expected behaviour, but in CASE #1 the encoding changes from EUC-KR to iso-2022-kr. My problem is that the re-encoded iso-2022-kr value does not display correctly as Korean characters.

P.S. I'm not good at English. If anything is unclear, please let me know. Thank you.
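For reference, the two encoded-words in the report can be compared at the byte level. The following is a minimal sketch using only Python's standard library (it does not use GMime); it assumes Python's `iso2022_kr` codec is an acceptable stand-in for the ISO-2022-KR encoding GMime produced. It shows that the 7-bit payload in CASE #1 is the EUC-KR byte sequence with the high bit cleared, wrapped in shift-out/shift-in. One of those 7-bit bytes is a literal `=`, which appears unescaped in the CASE #1 output (`==3A`); RFC 2047 requires `=3D` there, so this may be related to the display problem.

```python
from email.header import decode_header

# The original encoded-word from CASE #1 / CASE #2 (EUC-KR, base64).
original = "=?EUC-KR?B?xtvHw726xbg=?="
raw, charset = decode_header(original)[0]   # raw is bytes, charset is "euc-kr"
text = raw.decode(charset)

# ISO-2022-KR carries the same KS X 1001 code points as 7-bit bytes,
# i.e. each EUC-KR byte with the high bit cleared, between SO and SI.
seven_bit = bytes(b & 0x7F for b in raw)
iso = text.encode("iso2022_kr")

print(raw.hex())         # EUC-KR bytes of the display name
print(seven_bit)         # the 7-bit form; note the "=" byte in the middle
print(seven_bit in iso)  # whether the codec output contains that 7-bit run
```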
What is your locale set to? If it is set to KO, then gmime will prefer euc-kr; otherwise it prefers iso-2022-kr. Also, simply calling get_header() will not change the encoding. You must also be changing one of the other headers, adding a top-level mime part to the message, or doing something similar (which clobbers the cached message header and thus forces a re-encoding). I've got some ideas on how to fix this (I think my current charset picker logic is wrong), but it'd help to know what exactly you are doing.
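The locale preference described in this comment can be modelled roughly as below. This is only an illustrative sketch of the stated rule, not GMime's actual charset picker; the function name `preferred_korean_charset` and the way the locale string is parsed are assumptions made for the example.

```python
def preferred_korean_charset(locale_name):
    """Hypothetical model of the rule above: a Korean locale prefers euc-kr,
    anything else (including an unset locale) prefers iso-2022-kr."""
    # Take the language part of names such as "ko_KR.eucKR" or "en_US.UTF-8".
    language = (locale_name or "").split(".")[0].split("_")[0].lower()
    return "euc-kr" if language == "ko" else "iso-2022-kr"


print(preferred_korean_charset("ko_KR.eucKR"))
print(preferred_korean_charset(""))
```

Under this model an empty locale value falls through to iso-2022-kr, which is the re-encoding seen in the report.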
ok, so part of the reason it works the way it does is because of the fix for bug #138218. I think I'll need to rethink that fix a bit...
Thank you for the comments. These are my locale environment variables:

LANG=ko_KR.eucKR
LC_CTYPE="ko_KR.eucKR"
LC_NUMERIC="ko_KR.eucKR"
LC_TIME="ko_KR.eucKR"
LC_COLLATE="ko_KR.eucKR"
LC_MONETARY="ko_KR.eucKR"
LC_MESSAGES="ko_KR.eucKR"
LC_PAPER="ko_KR.eucKR"
LC_NAME="ko_KR.eucKR"
LC_ADDRESS="ko_KR.eucKR"
LC_TELEPHONE="ko_KR.eucKR"
LC_MEASUREMENT="ko_KR.eucKR"
LC_IDENTIFICATION="ko_KR.eucKR"
LC_ALL=

I use dbmail-2.3.6 ( http://www.dbmail.org ), and dbmail uses gmime-2.4.

My mail system = qmail + vpopmail + dbmail + MySQL

An email message moves through the following flow:

SENDER -> (A)qmail -> (B)vpopmail -> (C)dbmail-deliver -> (D)MySQL -> (E)dbmail-imap -> RECEIVER

At step (C), 'dbmail-deliver' uses gmime's get_headers function.
Before step (C), the message's FROM field is unchanged (euc-kr).
After step (C), the message's FROM field has changed (iso-2022-kr).
Okay... as a temporary workaround until I can figure out what the right fix for this is, try setting LC_ALL to "ko_KR.eucKR" and see if that solves the problem (it should afaict).
I've done some digging, and according to RFC 1557 it is suggested that euc-kr be used for header encodings, because a lot of Korean email clients cannot handle base64 or quoted-printable encoded iso-2022-kr (the RFC does suggest iso-2022-kr as the charset for the contents, however). Based on this, I'm thinking of simply dropping iso-2022-kr as a charset that GMime will even try for headers. Of course, simply dropping iso-2022-kr from the encoding tables in gmime won't fix this bug (it'll probably end up encoding to UTF-8 instead, since your LC_ALL was null), so I still need to rethink how charsets are chosen for encoding and/or find better ways of preserving cached headers. I haven't yet looked at the dbmail code, but they must be doing something that clobbers gmime's cached headers (when gmime parses a message, it caches the original headers so that when you write the message back out, it uses those original headers rather than re-encoding them). Right now, adding a new header, removing a header, or changing a header will clobber the entire cache of preserved originals, so writing the message out again is forced to re-encode all of them (which is the cause of your bug).
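The cache behaviour described in this comment can be illustrated with a toy model. This is a sketch in Python, not GMime code: the class `CachedHeaders`, its methods, and the `reencode` helper are all hypothetical names made up for the example. It only demonstrates the stated rule that read-only access keeps the original header text, while any modification discards the whole cache and forces every header to be re-encoded on output.

```python
class CachedHeaders:
    """Toy model: keeps the raw header block as parsed until something changes."""

    def __init__(self, raw_block, headers):
        self.raw_block = raw_block     # original header text, exactly as parsed
        self.headers = list(headers)   # (name, decoded value) pairs

    def get(self, name):
        # Read-only lookup: the cached raw block is left alone.
        for key, value in self.headers:
            if key.lower() == name.lower():
                return value
        return None

    def set(self, name, value):
        # Any change drops the entire cache, not just the affected header.
        self.headers = [(k, v) for k, v in self.headers if k.lower() != name.lower()]
        self.headers.append((name, value))
        self.raw_block = None

    def write(self, encode):
        # Use the preserved originals if available, otherwise re-encode everything.
        if self.raw_block is not None:
            return self.raw_block
        return "".join(f"{k}: {encode(v)}\r\n" for k, v in self.headers)


def reencode(value):
    # Stand-in for a real header encoder (hypothetical).
    return f"<re-encoded {value!r}>"


raw = "From: =?EUC-KR?B?xtvHw726xbg=?= <backup@3355music.com>\r\n"
msg = CachedHeaders(raw, [("From", "(decoded name) <backup@3355music.com>")])

msg.get("From")
unchanged = msg.write(reencode)   # cache intact, original text is written

msg.set("X-Example", "1")
rewritten = msg.write(reencode)   # cache gone, From is re-encoded as well

print(unchanged == raw, rewritten == raw)   # True False
```

In this model, touching an unrelated header is enough to change how From is written out, which matches the difference between CASE #1 and CASE #2 in the report.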
I just filed a new bug about the cached header stream clobbering. The other parts of this bug have now been fixed. *** This bug has been marked as a duplicate of bug 635445 ***