GNOME Bugzilla – Bug 713990
blank lines are removed in plain-text messages
Last modified: 2013-09-23 08:55:00 UTC
---- Reported by geary-maint@gnome.bugs 2013-09-14 11:26:00 -0700 ---- Original Redmine bug id: 7488 Original URL: http://redmine.yorba.org/issues/7488 Searchable id: yorba-bug-7488 Original author: Andrea Corbellini Original description: How to reproduce: 1. `gsettings set org.yorba.geary compose-as-html false` 2. Send a new message with the following body: This is a test. A very nice test. 3. Look at the actual body of the email sent: This is a test. A very nice test. Related issues: related to geary - 7530: Some HTML displayed in plain-text mode composer (Open) related to geary - 6990: HTML/CSS processing (Open) duplicated by geary - 7515: newlines stripped in message body (Duplicate) ---- Additional Comments From geary-maint@gnome.bugs 2013-09-23 13:55:00 -0700 ---- ### History #### #1 Updated by Jim Nelson 2 months ago * **Category** set to _composer_ * **Target version** set to _0.4.0_ Yep, I can repro this. Robert, is there any chance you can take a look at this one? #### #2 Updated by Robert Schroll 2 months ago * **File** 0001-Always-save-drafts-in-HTML-format.patch added * **Status** changed from _Open_ to _Review_ * **Assignee** set to _Robert Schroll_ The problem commit is 1f859dea, which closed #7357. This made changes to the html_to_flowed_text() function, which relies on WebKit's Node.get_inner_text() method, which is (too) sensitive to whether an element is in the document or not. Note, for example, that we have to go through the list of blockquotes backwards to ensure that we process them while they're still in the document, for example. My guess is that the cloned body, not being part of the document, does line spacing differently. Before we were saving drafts, the DOM manipulation wasn't a problem, since it only happened after the user was done composing. But saving drafts happens in the middle. However, I wonder why we're saving drafts in plain text mode? One of the lessons of my previous mucking about in the composer was that we should always treat emails as HTML when they're inside Geary. And it seems perfectly reasonable to treat the Drafts folder as "inside Geary," even if it's technically not. (I doubt that any mail client decides whether to compose in HTML or plain text based on the format of the message in the Drafts folder.) If we always save our drafts as HTML, we can revert 1f859dea, since we don't hit html_to_flowed_text() until we actually send the message. That's what the attached patch does, and in brief testing it seems to work. Maybe we should add some comment to html_to_flowed_text() to remind people not to use this while the composer is open. We could take out the re-assembly of the DOM to make this obvious. (I don't recall why we did that in the first place.) #### #3 Updated by Jim Nelson 2 months ago * **Status** changed from _Review_ to _Open_ Thanks for the patch, Robert. Alas, while this does fix this ticket's problem, it re-introduces the problem in #7357. I think we need to keep Eric's code of saving a copy of the Document to work on in html_to_flowed_text(). I'm unfamiliar with this code, however; maybe there's another way? #### #4 Updated by Robert Schroll 2 months ago * **File** 0001-Always-save-drafts-in-HTML-format.patch added * **Status** changed from _Open_ to _Review_ Duh - the get_text part gets called each time, regardless. For Geary's purpose, we don't need the text part for the drafts, so one way to do this would be to only produce the html part for drafts. That's what the attached patch does. (And I actually tested this one a bit more.) But this has the potential to mess up other mail clients that expect drafts to have a text part. (Not a big risk, probably, but one nonetheless.) One thing we could do us use the copy while saving drafts and end up with slightly malformed text parts in the draft. As long as we always save drafts as HTML, that won't affect Geary. Another way would be to figure out exactly where those missing line breaks are supposed to come from, and put them in by hand. A third possibility is to build our own HTML -> flowed text converter that doesn't rely on WebKit. This might be a good idea in the long run, but it'll be rather complicated, since we can get arbitrary HTML from the composer through stuff in replies. We can't just assume we're getting the HTML produced by WebKit in edit mode. I've set this back to review, but I definitely recognize that another approach may be desired. Alternatively, we could use this patch in the short term and open another issue for a better long-term solution. #### #5 Updated by Oliver Giles 2 months ago As I semi-religiously send only plain text email, your comment that in the composer Geary treats all messages as HTML piqued my interest. I subsequently noticed, if I reply to an HTML email in what I intended to be plain text, the original message included in the reply is sent along in HTML. Is this intended behaviour? Trying the same thing in Thunderbird, the original HTML message is stripped to text (or they use the text/plain part) so that the reply I send is pure text/plain. With the usual caveat of my not being familiar with the code, it strikes me that a potential longer-term solution could be to more aggressively split the composer between an HTML composer and a plain-text composer based on a user preference. For plain-text preferences, this would eliminate the convert-html- to-text issue, and for html emails that would make it more logical to store only the HTML version in drafts. Sorry for the digression #### #6 Updated by Robert Schroll about 1 month ago Oliver Giles wrote: > As I semi-religiously send only plain text email, your comment that in the composer Geary treats all messages as HTML piqued my interest. This is really an implementation detail, and shouldn't be visible to the user at all. Basically, "plain text" emails are usually actually flowed text, which resembles plain text but with characters at the beginning and ends of lines giving clues about line wrapping and quote levels. Rather than decode this each time we need to do something with the text (it's more complicated than you might think) or come up with our own markup to describe such text, we decided to just use (a small subset of) HTML to describe these messages. But anyplace where this leaks through to the user is a bug. > I subsequently noticed, if I reply to an HTML email in what I intended to be plain text, the original message included in the reply is sent along in HTML. Is this intended behaviour? No! I just tried to replicate this but couldn't. If you see it happen again, please open another bug so we can track this down. If you find an original message that triggers this, it'd be a great help. > With the usual caveat of my not being familiar with the code, it strikes me that a potential longer-term solution could be to more aggressively split the composer between an HTML composer and a plain-text composer based on a user preference. For plain-text preferences, this would eliminate the convert-html- to-text issue, and for html emails that would make it more logical to store only the HTML version in drafts. The nice thing about the current setup is that it allows you to switch from HTML to plain text and back without losing anything in the message you're composing. But after using this (or rather, not using this) feature for some time, I'm not convinced that it's particularly useful. A dedicated flowed text composer would also allow us to quote the plain text version of emails, rather than the HTML version munged to plain text, when that's available. However, carrying two composers in the code base could prove to be quite a burden. #### #7 Updated by Oliver Giles about 1 month ago Thanks for your detailed explanation, it's very informative. After testing it again, I realised what I said isn't quite accurate: a forwarded message in a plain text message is indeed included as text (munged from the HTML) **when it's sent**. However in the composer before sending, the original message appears at least partially in HTML: font colours and sizes are preserved, some sections have background colours etc. This is disturbing because, like the original "blank line" problem, the email you think you sent differs from the one you actually sent. In terms of a forwarded message or a reply, in this case it's not as big an issue, but it would be elegantly fixed if you were to introduce a second composer, and (depending on your inheritance models) most of its code would be shared. Anyway I'll be interested to see what geary goes with in the end #### #8 Updated by Robert Schroll about 1 month ago Oliver Giles wrote: > However in the composer before sending, the original message appears at least partially in HTML: font colours and sizes are preserved, some sections have background colours etc. Ah, yes. I've seen this occasionally, and it looks like I never filed a bug. So this problem is now #7530. As for this bug: The problem we're having is ultimately due to the fact that we're replacing blockquoted text with tokens before the conversion, so that we can replace those tokens with quoted text later. I wonder if it's really necessary to use these tokens. Could we instead just use the text of the blockquotes as tokens? The immediate problem I see is if the same text appears both as a blockquote and somewhere else -- we could end up replacing the wrong thing with a quote. This would be a rather rare occurrence, but would it be rare enough to ignore? #### #9 Updated by Jim Nelson about 1 month ago * **Priority** changed from _Normal_ to _High_ Taking a look at some recent messages I've sent out on the mailing lists, it appears this is simply broken now, regardless if Rich Text is on or off (i.e. the plain text portion of the email is not flowed). It's not obvious to readers who accept HTML mail, but things like mailing lists (which often strip HTML mail, as ours do) it's a problem. Maybe this was obvious to everyone else, but I thought this was only an issue when Rich Text was turned off, so I'm elevating this. #### #10 Updated by Robert Schroll about 1 month ago * **Status** changed from _Review_ to _Fixed_ Applied in changeset 8634bd81ba6f9c942f3ab6a1e1c29a6732040f9b. #### #11 Updated by Jim Nelson about 1 month ago I went ahead and pushed your second patch, Robert. We really need to get this fixed now and it's obvious that a long-term solution is not something we're going to have developed in the next two weeks. We really need to come up with a better DOM solution. It's obvious that we're fighting side-effects and limitations of the WebKit DOM, and worse, putting a lot of key manipulations in the client when they realize should be in the engine. I've bumped up #6990 for consideration for the 0.5 cycle. --- Bug imported by chaz@yorba.org 2013-11-21 20:24 UTC --- This bug was previously known as _bug_ 7488 at http://redmine.yorba.org/show_bug.cgi?id=7488 Imported an attachment (id=260829) Imported an attachment (id=260830) Unknown milestone "unknown in product geary. Setting to default milestone for this product, "---". Setting qa contact to the default for this product. This bug either had no qa contact or an invalid one.