GNOME Bugzilla – Bug 730985
problem with BOM in header
Last modified: 2014-05-30 10:22:44 UTC
I have some spam mails which have a UTF-16 byte order mark (BOM), U+FEFF, as the first character in one of their "Received:" lines, before the "From" and other header lines. When I call g_mime_object_get_header (GMIME_OBJECT(message),"From"), I simply get the string "(null)" instead of the actual "From" field. I suspect that gmime chokes on parsing the BOM and considers the remainder of the message to be part of the body.

Is this the expected behavior, or is this a bug in gmime? Maybe gmime could simply ignore BOMs, regardless of whether they appear in the header or the body of a message?

Attached are a sample mail and a minimal program which demonstrates the problem. The program tries to read the "Received" and the "From" header from the supplied email. However, it only succeeds in obtaining the "Received" header (which is before the BOM) and not the "From" header (which is after it). The expected behavior would be that both headers are successfully parsed.
Created attachment 277522 [details] mail which demonstrates the problem
Created attachment 277523 [details] minimal program which demonstrates bug (adapted from gmime/examples/basic-example.c)
The problem isn't the BOM, the problem is the blank line before the Received header. A blank line terminates the header block.
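To illustrate why a stray blank line hides the later headers, here is a minimal sketch in plain C. It does not use gmime and is not how gmime's parser is implemented; header_block_end and has_header are hypothetical helper names made up for this example. It only shows the rule itself: the header block ends at the first empty line, so any "From:" line after that point counts as body text.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first byte after the
 * header block, i.e. just past the first empty line ("\n" or "\r\n").
 * Returns len if the message contains no empty line. */
static size_t header_block_end(const char *msg, size_t len)
{
    size_t line_start = 0;

    while (line_start < len) {
        const char *nl = memchr(msg + line_start, '\n', len - line_start);
        size_t line_end = nl ? (size_t)(nl - msg) : len;
        size_t content_len = line_end - line_start;

        /* Treat a trailing CR as part of the line terminator. */
        if (content_len > 0 && msg[line_end - 1] == '\r')
            content_len--;

        /* An empty line terminates the header block. */
        if (content_len == 0 && nl != NULL)
            return line_end + 1;

        if (nl == NULL)
            break;
        line_start = line_end + 1;
    }
    return len;
}

/* Hypothetical helper: return 1 if a line starting with "<name>:" occurs
 * inside the header block of the NUL-terminated message, 0 otherwise.
 * The name is compared case-insensitively. */
static int has_header(const char *msg, const char *name)
{
    size_t end = header_block_end(msg, strlen(msg));
    size_t nlen = strlen(name);
    size_t pos = 0;

    while (pos < end) {
        if (pos + nlen < end && msg[pos + nlen] == ':') {
            size_t i = 0;
            while (i < nlen &&
                   tolower((unsigned char)msg[pos + i]) ==
                   tolower((unsigned char)name[i]))
                i++;
            if (i == nlen)
                return 1;
        }

        /* Advance to the start of the next line within the header block. */
        const char *nl = memchr(msg + pos, '\n', end - pos);
        if (nl == NULL)
            break;
        pos = (size_t)(nl - msg) + 1;
    }
    return 0;
}
```

With a made-up message such as "Received: a\n\n" followed by a BOM-prefixed "Received: b\nFrom: x\n\nbody\n", has_header(msg, "Received") finds the first header, while has_header(msg, "From") returns 0, because the empty second line already ended the header block. Removing that empty line makes the "From" lookup succeed, independent of the BOM.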