GNOME Bugzilla – Bug 714339
subject contains garbled UTF-8 characters
Last modified: 2014-02-20 23:18:01 UTC
---- Reported by adam@yorba.org 2012-08-24 10:32:00 -0700 ----

Original Redmine bug id: 5711
Original URL: http://redmine.yorba.org/issues/5711
Searchable id: yorba-bug-5711
Original author: Adam Dingle

Original description:

I just received a message whose subject in Geary looks like this:

=?UTF-8?Q?Build_failed_in_Jenkins:_goodhope-ci_=C2=BB_precise, amd64_#141?=

In Gmail, the subject looks like this:

Build failed in Jenkins: goodhope-ci » precise, amd64 #141

I'll forward the message to the team for investigation.

This looks a lot like 4400, which we closed a while back. I'm opening a new ticket for this, but I'll mark that one as related.

Related issues:
related to geary - 4400: subjects with non-Latin characters sometimes appear garbled (Fixed)
related to geary - 5712: in conversations list, subject is not truncated properly (Fixed)
related to geary - 5977: invalid subjects are propagated in replies (Open)
related to geary - 7158: Garbled email addresses in conversation viewer (Open)

---- Additional Comments From geary-maint@gnome.bugs 2013-09-04 12:11:00 -0700 ----

### History

#### #1 Updated by Adam Dingle about 1 year ago

Aha - forwarding the message would not be useful, since it would just forward the corrupted subject rather than the source that produces it. In any case, I think Jim (who is most likely to investigate this) is on the mailing list which received this message, so he can probably find it.

#### #2 Updated by Charles Lindsay about 1 year ago

I received the same message, along with three other very similar ones. Interestingly, I'm seeing slightly different behavior than Adam here. The four messages divided into two conversations with two messages each. The conversation with the subject Adam lists above shows up in the conversations list garbled same as Adam's. However, only the second email in that conversation has the garbled subject in the conversation view.
Gmail's "show original" reports the subjects as:

Subject: =?UTF-8?Q?Build_failed_in_Jenkins:_goodhope-ci_=C2=BB_precise, amd64_#141?=
Subject: =?UTF-8?Q?Build_failed_in_Jenkins:_goodhope-ci_=C2=BB_precise, amd64_#142?=

Geary's view source reports them as:

Subject: =?UTF-8?Q?Build_failed_in_Jenkins:_goodhope-ci_=C2=BB_precise, amd64_#141?=
Subject: =?UTF-8?Q?Build_failed_in_Jenkins:_goodhope-ci_=C2=BB_precise, amd64_#142?=

They both look identical to me except the final digit before the terminating ?=.

Adam reports that both conversations have garbled subjects in the list, and all four subject lines look garbled to him in the conversation view.

#### #3 Updated by Adam Dingle about 1 year ago

Note that I'm running Ubuntu Quantal, and Charles is on Precise.

#### #4 Updated by Eric Gregory about 1 year ago

Note that Apple Mail and iOS Mail both show the same garbled subject lines as Geary.

#### #5 Updated by Jim Nelson about 1 year ago

* **Category** set to _client_
* **Assignee** set to _Jim Nelson_

#### #6 Updated by Charles Lindsay about 1 year ago

Now that I'm thinking about it, the "header" flavor of quoted printable requires that each atom is a fully valid quoted printable string. In other words, you can break the string for newlines only if you first terminate the quoted printable encoding block. See http://www.faqs.org/rfcs/rfc2047.html.

So, if I'm reading correctly, these headers aren't technically valid. To be valid, the subject line would have to be something like this:

Subject: =?UTF-8?Q?Build_failed_in_Jenkins:_goodhope-ci_=C2=BB_precis?= =?UTF-8?Q?e,_amd64_#141?=

Not sure how we really want to handle invalid input, but it seems like it should be consistent regardless.
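For illustration, the rule from comment #6 can be sketched in Python. This is not Geary's code: `q_decode` and `decode_strict` are hypothetical names, only the Q encoding is handled, and runs of whitespace are collapsed to a single space for simplicity. The sketch treats a token as an encoded-word only if it is a complete `=?charset?Q?payload?=` with no embedded whitespace, so the subject from the original report is left untouched while the split form proposed above decodes cleanly.

```python
import re

# A complete encoded-word: no whitespace and no '?' inside charset or payload.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[Qq]\?([^?\s]*)\?=")


def q_decode(payload: str, charset: str) -> str:
    """Decode a Q-encoded payload: '_' is a space, '=XX' is a hex byte."""
    raw = bytearray()
    i = 0
    while i < len(payload):
        ch = payload[i]
        if ch == "_":
            raw.append(0x20)
            i += 1
        elif ch == "=":
            raw.append(int(payload[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(ch))
            i += 1
    return raw.decode(charset)


def decode_strict(value: str) -> str:
    """Decode only whitespace-delimited, complete encoded-words."""
    out = []
    prev_encoded = False
    for token in value.split():
        match = ENCODED_WORD.fullmatch(token)
        if match:
            # Whitespace between two adjacent encoded-words is dropped.
            if out and not prev_encoded:
                out.append(" ")
            out.append(q_decode(match.group(2), match.group(1)))
            prev_encoded = True
        else:
            if out:
                out.append(" ")
            out.append(token)
            prev_encoded = False
    return "".join(out)


reported = "=?UTF-8?Q?Build_failed_in_Jenkins:_goodhope-ci_=C2=BB_precise, amd64_#141?="
proposed = "=?UTF-8?Q?Build_failed_in_Jenkins:_goodhope-ci_=C2=BB_precis?= =?UTF-8?Q?e,_amd64_#141?="

print(decode_strict(reported))  # unchanged: neither token is a complete encoded-word
print(decode_strict(proposed))  # Build failed in Jenkins: goodhope-ci » precise, amd64 #141
```

A strict decoder like this reproduces the garbled display described in the report; whether a mail client should be this strict is the open question raised in comment #6.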
#### #7 Updated by Adam Dingle about 1 year ago

* **Assignee** deleted (<strike>_Jim Nelson_</strike>)

#### #8 Updated by Adam Dingle about 1 year ago

* **Target version** deleted (<strike>_0.2_</strike>)

#### #9 Updated by Jim Nelson 10 months ago

* **Target version** set to _0.3.0_

#### #10 Updated by Jim Nelson 9 months ago

* **Category** changed from _client_ to _charset-encoding_

#### #11 Updated by Eric Gregory 9 months ago

Here's another one, the subject header is:

Subject: ERIC, =?UTF-8?Q?=e2=98=85?=New offers waiting for you now

Geary prints the subject as-is rather than decoding it. Apple and iOS Mail display a star character in place of the Unicode sequence.

#### #12 Updated by Charles Lindsay 9 months ago

Here too, it's encoded incorrectly. If they wanted a star with no space before it, the correct form would be =?UTF-8?Q?=e2=98=85New?= ... (I believe). The way they did it violates the RFC.

#### #13 Updated by Jim Nelson 8 months ago

* **Target version** changed from _0.3.0_ to _0.4.0_

#### #14 Updated by Jim Nelson 3 months ago

* **Target version** changed from _0.4.0_ to _0.5.0_

--- Bug imported by chaz@yorba.org 2013-11-21 20:26 UTC ---

This bug was previously known as _bug_ 5711 at http://redmine.yorba.org/show_bug.cgi?id=5711

Unknown milestone "unknown in product geary. Setting to default milestone for this product, "---".
Setting qa contact to the default for this product. This bug either had no qa contact or an invalid one.
Resolution set on an open status. Dropping resolution
Interestingly, I'm seeing this bug only in the message viewer on the following subject:

Subject: [Bug 722273] New: Should replace [1]=?UTF-8?Q?=20with=20=C2=B9?=

In the preview/message list, it shows up correctly with a ¹ instead of that block of encoded UTF-8, but in the viewer itself, it comes out literally like it is up there.

Spec lawyer alert: From RFC 2047 (http://www.faqs.org/rfcs/rfc2047.html) section 5.(1):

> ...an 'encoded-word' that appears in a header field defined as '*text' [which Subject is] MUST be separated from any adjacent 'encoded-word' or 'text' by 'linear-white-space'

Because there's no space between the ] and =?..., it's technically not a valid encoded-word.
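As a rough sketch of the difference between the two displays, the Python below contrasts a lenient decoder, which decodes an encoded-word wherever it appears, with a check for the whitespace rule quoted above. Both function names are hypothetical and this is not how Geary, Gmail, or Apple Mail implement it. Only the Q encoding is handled, and the lenient decoder does not drop whitespace between adjacent encoded-words as a full RFC 2047 decoder would.

```python
import re

# A complete encoded-word with Q encoding, matched anywhere in the string.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[Qq]\?([^?\s]*)\?=")


def _q_to_text(match: re.Match) -> str:
    """Turn one matched encoded-word into text."""
    charset, payload = match.group(1), match.group(2)
    data = payload.replace("_", " ").encode("ascii")
    # Replace each '=XX' escape with the byte it names.
    raw = re.sub(rb"=([0-9A-Fa-f]{2})",
                 lambda m: bytes([int(m.group(1), 16)]),
                 data)
    return raw.decode(charset, errors="replace")


def decode_lenient(value: str) -> str:
    """Decode every encoded-word, even one glued to neighbouring text."""
    return ENCODED_WORD.sub(_q_to_text, value)


def has_unseparated_encoded_word(value: str) -> bool:
    """True if some encoded-word touches adjacent text without whitespace."""
    for m in ENCODED_WORD.finditer(value):
        before_ok = m.start() == 0 or value[m.start() - 1].isspace()
        after_ok = m.end() == len(value) or value[m.end()].isspace()
        if not (before_ok and after_ok):
            return True
    return False


subject = "[Bug 722273] New: Should replace [1]=?UTF-8?Q?=20with=20=C2=B9?="
print(has_unseparated_encoded_word(subject))  # True: nothing separates ']' from '=?'
print(decode_lenient(subject))  # [Bug 722273] New: Should replace [1] with ¹

star = "ERIC, =?UTF-8?Q?=e2=98=85?=New offers waiting for you now"
print(decode_lenient(star))  # ERIC, ★New offers waiting for you now
```

The lenient result matches what the preview/message list shows, and the undecoded input matches what the viewer shows, which suggests the two code paths apply the separation rule differently.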
The proposed patch for bug #713060 also fixes this issue.