After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 713990 - blank lines are removed in plain-text messages
blank lines are removed in plain-text messages
Status: RESOLVED FIXED
Product: geary
Classification: Other
Component: composer
0.4.0
Other All
: High normal
: ---
Assigned To: Robert Schroll
Geary Maintainers
Depends on:
Blocks:
 
 
Reported: 2013-09-14 06:26 UTC by Geary Maintainers
Modified: 2013-09-23 08:55 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
0001-Always-save-drafts-in-HTML-format.patch (3.06 KB, patch)
2013-09-16 20:16 UTC, Robert Schroll
none Details | Review
new patch to use only HTML format for drafts (3.23 KB, patch)
2013-09-18 00:39 UTC, Robert Schroll
none Details | Review

Description Charles Lindsay 2013-11-21 20:23:55 UTC


---- Reported by geary-maint@gnome.bugs 2013-09-14 11:26:00 -0700 ----

Original Redmine bug id: 7488
Original URL: http://redmine.yorba.org/issues/7488
Searchable id: yorba-bug-7488
Original author: Andrea Corbellini
Original description:

How to reproduce:

1. `gsettings set org.yorba.geary compose-as-html false`

2. Send a new message with the following body:

    
    
    This is a test.
    
    A very nice test.
    

3. Look at the actual body of the email sent:

    
    
    This is a test.
    A very nice test.

Related issues:
related to geary - 7530: Some HTML displayed in plain-text mode composer (Open)
related to geary - 6990: HTML/CSS processing (Open)
duplicated by geary - 7515: newlines stripped in message body (Duplicate)



---- Additional Comments From geary-maint@gnome.bugs 2013-09-23 13:55:00 -0700 ----

### History

####

#1

Updated by Jim Nelson 2 months ago

  * **Category** set to _composer_
  * **Target version** set to _0.4.0_

Yep, I can repro this. Robert, is there any chance you can take a look at this
one?

####

#2

Updated by Robert Schroll 2 months ago

  * **File** 0001-Always-save-drafts-in-HTML-format.patch added
  * **Status** changed from _Open_ to _Review_
  * **Assignee** set to _Robert Schroll_

The problem commit is 1f859dea, which closed #7357. This made changes to the
html_to_flowed_text() function, which relies on WebKit's Node.get_inner_text()
method, which is (too) sensitive to whether an element is in the document or
not. Note, for example, that we have to go through the list of blockquotes
backwards to ensure that we process them while they're still in the document,
for example. My guess is that the cloned body, not being part of the document,
does line spacing differently.

Before we were saving drafts, the DOM manipulation wasn't a problem, since it
only happened after the user was done composing. But saving drafts happens in
the middle. However, I wonder why we're saving drafts in plain text mode? One
of the lessons of my previous mucking about in the composer was that we should
always treat emails as HTML when they're inside Geary. And it seems perfectly
reasonable to treat the Drafts folder as "inside Geary," even if it's
technically not. (I doubt that any mail client decides whether to compose in
HTML or plain text based on the format of the message in the Drafts folder.)
If we always save our drafts as HTML, we can revert 1f859dea, since we don't
hit html_to_flowed_text() until we actually send the message. That's what the
attached patch does, and in brief testing it seems to work.

Maybe we should add some comment to html_to_flowed_text() to remind people not
to use this while the composer is open. We could take out the re-assembly of
the DOM to make this obvious. (I don't recall why we did that in the first
place.)

####

#3

Updated by Jim Nelson 2 months ago

  * **Status** changed from _Review_ to _Open_

Thanks for the patch, Robert. Alas, while this does fix this ticket's problem,
it re-introduces the problem in #7357. I think we need to keep Eric's code of
saving a copy of the Document to work on in html_to_flowed_text(). I'm
unfamiliar with this code, however; maybe there's another way?

####

#4

Updated by Robert Schroll 2 months ago

  * **File** 0001-Always-save-drafts-in-HTML-format.patch added
  * **Status** changed from _Open_ to _Review_

Duh - the get_text part gets called each time, regardless. For Geary's
purpose, we don't need the text part for the drafts, so one way to do this
would be to only produce the html part for drafts. That's what the attached
patch does. (And I actually tested this one a bit more.) But this has the
potential to mess up other mail clients that expect drafts to have a text
part. (Not a big risk, probably, but one nonetheless.)

One thing we could do us use the copy while saving drafts and end up with
slightly malformed text parts in the draft. As long as we always save drafts
as HTML, that won't affect Geary. Another way would be to figure out exactly
where those missing line breaks are supposed to come from, and put them in by
hand. A third possibility is to build our own HTML -> flowed text converter
that doesn't rely on WebKit. This might be a good idea in the long run, but
it'll be rather complicated, since we can get arbitrary HTML from the composer
through stuff in replies. We can't just assume we're getting the HTML produced
by WebKit in edit mode.

I've set this back to review, but I definitely recognize that another approach
may be desired. Alternatively, we could use this patch in the short term and
open another issue for a better long-term solution.

####

#5

Updated by Oliver Giles 2 months ago

As I semi-religiously send only plain text email, your comment that in the
composer Geary treats all messages as HTML piqued my interest.

I subsequently noticed, if I reply to an HTML email in what I intended to be
plain text, the original message included in the reply is sent along in HTML.
Is this intended behaviour? Trying the same thing in Thunderbird, the original
HTML message is stripped to text (or they use the text/plain part) so that the
reply I send is pure text/plain.

With the usual caveat of my not being familiar with the code, it strikes me
that a potential longer-term solution could be to more aggressively split the
composer between an HTML composer and a plain-text composer based on a user
preference. For plain-text preferences, this would eliminate the convert-html-
to-text issue, and for html emails that would make it more logical to store
only the HTML version in drafts.

Sorry for the digression

####

#6

Updated by Robert Schroll about 1 month ago

Oliver Giles wrote:

> As I semi-religiously send only plain text email, your comment that in the
composer Geary treats all messages as HTML piqued my interest.

This is really an implementation detail, and shouldn't be visible to the user
at all. Basically, "plain text" emails are usually actually flowed text, which
resembles plain text but with characters at the beginning and ends of lines
giving clues about line wrapping and quote levels. Rather than decode this
each time we need to do something with the text (it's more complicated than
you might think) or come up with our own markup to describe such text, we
decided to just use (a small subset of) HTML to describe these messages. But
anyplace where this leaks through to the user is a bug.

> I subsequently noticed, if I reply to an HTML email in what I intended to be
plain text, the original message included in the reply is sent along in HTML.
Is this intended behaviour?

No! I just tried to replicate this but couldn't. If you see it happen again,
please open another bug so we can track this down. If you find an original
message that triggers this, it'd be a great help.

> With the usual caveat of my not being familiar with the code, it strikes me
that a potential longer-term solution could be to more aggressively split the
composer between an HTML composer and a plain-text composer based on a user
preference. For plain-text preferences, this would eliminate the convert-html-
to-text issue, and for html emails that would make it more logical to store
only the HTML version in drafts.

The nice thing about the current setup is that it allows you to switch from
HTML to plain text and back without losing anything in the message you're
composing. But after using this (or rather, not using this) feature for some
time, I'm not convinced that it's particularly useful. A dedicated flowed text
composer would also allow us to quote the plain text version of emails, rather
than the HTML version munged to plain text, when that's available. However,
carrying two composers in the code base could prove to be quite a burden.

####

#7

Updated by Oliver Giles about 1 month ago

Thanks for your detailed explanation, it's very informative.

After testing it again, I realised what I said isn't quite accurate: a
forwarded message in a plain text message is indeed included as text (munged
from the HTML) **when it's sent**. However in the composer before sending, the
original message appears at least partially in HTML: font colours and sizes
are preserved, some sections have background colours etc. This is disturbing
because, like the original "blank line" problem, the email you think you sent
differs from the one you actually sent. In terms of a forwarded message or a
reply, in this case it's not as big an issue, but it would be elegantly fixed
if you were to introduce a second composer, and (depending on your inheritance
models) most of its code would be shared.

Anyway I'll be interested to see what geary goes with in the end

####

#8

Updated by Robert Schroll about 1 month ago

Oliver Giles wrote:

> However in the composer before sending, the original message appears at
least partially in HTML: font colours and sizes are preserved, some sections
have background colours etc.

Ah, yes. I've seen this occasionally, and it looks like I never filed a bug.
So this problem is now #7530.

As for this bug: The problem we're having is ultimately due to the fact that
we're replacing blockquoted text with tokens before the conversion, so that we
can replace those tokens with quoted text later. I wonder if it's really
necessary to use these tokens. Could we instead just use the text of the
blockquotes as tokens? The immediate problem I see is if the same text appears
both as a blockquote and somewhere else -- we could end up replacing the wrong
thing with a quote. This would be a rather rare occurrence, but would it be
rare enough to ignore?

####

#9

Updated by Jim Nelson about 1 month ago

  * **Priority** changed from _Normal_ to _High_

Taking a look at some recent messages I've sent out on the mailing lists, it
appears this is simply broken now, regardless if Rich Text is on or off (i.e.
the plain text portion of the email is not flowed). It's not obvious to
readers who accept HTML mail, but things like mailing lists (which often strip
HTML mail, as ours do) it's a problem.

Maybe this was obvious to everyone else, but I thought this was only an issue
when Rich Text was turned off, so I'm elevating this.

####

#10

Updated by Robert Schroll about 1 month ago

  * **Status** changed from _Review_ to _Fixed_

Applied in changeset 8634bd81ba6f9c942f3ab6a1e1c29a6732040f9b.

####

#11

Updated by Jim Nelson about 1 month ago

I went ahead and pushed your second patch, Robert. We really need to get this
fixed now and it's obvious that a long-term solution is not something we're
going to have developed in the next two weeks.

We really need to come up with a better DOM solution. It's obvious that we're
fighting side-effects and limitations of the WebKit DOM, and worse, putting a
lot of key manipulations in the client when they realize should be in the
engine.

I've bumped up #6990 for consideration for the 0.5 cycle.



--- Bug imported by chaz@yorba.org 2013-11-21 20:24 UTC  ---

This bug was previously known as _bug_ 7488 at http://redmine.yorba.org/show_bug.cgi?id=7488
Imported an attachment (id=260829)
Imported an attachment (id=260830)

Unknown milestone "unknown in product geary. 
   Setting to default milestone for this product, "---".
Setting qa contact to the default for this product.
   This bug either had no qa contact or an invalid one.