After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 712904 - Header labels are affected by message body
Header labels are affected by message body
Status: RESOLVED FIXED
Product: geary
Classification: Other
Component: conversations
master
Other All
: High normal
: 0.8.0
Assigned To: Geary Maintainers
Geary Maintainers
review
: 726477 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2012-06-21 10:24 UTC by Geary Maintainers
Modified: 2014-09-10 02:21 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Selección_001.png (97.46 KB, image/png)
2012-06-21 22:24 UTC, Geary Maintainers
  Details
geary-message-SR9EGW.txt (18.62 KB, text/plain)
2012-06-21 22:24 UTC, Geary Maintainers
  Details
patch (4.25 KB, patch)
2014-03-20 10:22 UTC, Simon Lipp
none Details | Review
Scope style rules in emails (3.60 KB, patch)
2014-09-02 02:09 UTC, Robert Schroll
committed Details | Review
Email body (222 bytes, text/html)
2014-09-03 01:32 UTC, Robert Schroll
  Details

Description Charles Lindsay 2013-11-21 20:17:37 UTC


---- Reported by geary-maint@gnome.bugs 2012-06-21 15:24:00 -0700 ----

Original Redmine bug id: 5440
Original URL: http://redmine.yorba.org/issues/5440
Searchable id: yorba-bug-5440
Original author: Arturo Torres Sánchez
Original description:

I received a HTML message from Edublogs, and the “From:”, “Subject:”, “Date:“
labels are shown with a bigger font and misplaced.

Related issues:
related to geary - 6990: HTML/CSS processing (Open)
duplicated by geary - 6896: Elements of <head> in email end up in body of
document in... (Fixed)
duplicated by geary - 5191: Formatting errors in email header (Duplicate)
duplicated by geary - 6054: From/Subject/Date in wrong font in
Conversation Viewer (Duplicate)



---- Additional Comments From geary-maint@gnome.bugs 2013-09-04 13:20:00 -0700 ----

### History

####

#1

Updated by Adam Dingle over 1 year ago

  * **Priority** changed from _Normal_ to _High_
  * **Target version** set to _0.2_

####

#2

Updated by Adam Dingle about 1 year ago

  * **Target version** deleted (<strike>_0.2_</strike>)

####

#3

Updated by Jim Nelson 10 months ago

  * **Category** set to _client_
  * **Target version** set to _0.3.0_

####

#4

Updated by Jim Nelson 10 months ago

Incidentally, I have messages in my personal Inbox that trigger this bug.

####

#5

Updated by Jim Nelson 9 months ago

  * **Target version** changed from _0.3.0_ to _0.4.0_

####

#6

Updated by Jim Nelson 7 months ago

  * **Category** changed from _client_ to _conversations_

####

#7

Updated by Jim Nelson 7 months ago

See Robert Schroll's note in comment 3 of #6896 for how this ticket is
different than that ticket. Both are related to a similar problem.

####

#8

Updated by Robert Schroll 6 months ago

What we need to do here is take each CSS rule and prepend "#message_1234
.body". If "body" appears in the rule, it should be removed. There is an
additional complication of what to do with nested emails, since they don't
have a message id.

If you'll allow me to wax philosophic: It would be nice to be able to write
something like

    
    #message_1234 .body {
        ...existing CSS rules...
    }

and have it correctly interpreted. This could also be useful for our own
stylesheet: instead of writing these long conditional rules, we could write

    
    body:not(.nohide) {
        .email.hide {
            .body {...}
            .header_container {...}
        }
    }

Stylesheet preprocessors like [LESS](http://lesscss.org/) and [SASS](http
://sass-lang.com/) do exactly this. These two are are written in Javascript
and Ruby, respectively, so we may not want to depend on them. They also
include all sorts of features beyond nesting that we don't necessarily need.
But I wonder if we're going to do enough work to fix this bug, if we should
make a general nested-rule resolver that we can use elsewhere?

####

#9

Updated by Jim Nelson 6 months ago

Robert Schroll wrote:

> But I wonder if we're going to do enough work to fix this bug, if we should
make a general nested-rule resolver that we can use elsewhere?

I presume you mean a nested-rule resolver we would use at runtime and not at
compile-time.

Um ... hmm. I mean, I can see the real value in this, but I also know that
we're opening up a new frontier here when we start building our own document-
manipulation tools.

One thing I've been thinking about is all the DOM manipulation we do inside
the client (particularly the Conversation Viewer). A lot of the reason is that
we're piggybacking on WebKit's DOM tools. Before I was against using WebKit
inside of the engine to manipulate HTML. Now I'm wondering if we should offer
some manipulation tools, that way the HTML mail can be "prepped" for the
client.

It strikes me that some of what you're talking about here falls into that
rubric, and perhaps that's how we should be proceeding -- a new module inside
the engine for manipulating HTML mail in DOM-ish ways to prepare it for
examination and display by whatever client wants to use it. Right now so much
of the prepwork seems ad hoc. We do have dreams of other clients using the
Geary engine, and it would be great if all that code was reusable.

Does that make sense? Curious if you agree.

####

#10

Updated by Robert Schroll 6 months ago

Jim Nelson wrote:

> I presume you mean a nested-rule resolver we would use at runtime and not at
compile-time.

Yes, so it could be used for solving this particular bug.

That said, we should keep in mind that the CSS manipulation to solve this
would be significantly easier than a general nested-rule resolver, so it may
be better to just solve this bug at run-time and use a CSS preprocessor to do
the static CSS at compile time (if desired).

> Does that make sense? Curious if you agree.

It does strike me as rather silly that our HTML-to-plain-text converter lives
in the composer itself, just because that's where WebKit is. So in the
abstract, a separate set of DOM and CSS tools sounds good. I wonder about the
implementation, though. It doesn't seem that WebKit lets you make a DOM
document separately from a WebView. Would this library have its own WebView
for doing all these manipulations? Does that even work if it's never shown?

####

#11

Updated by Jim Nelson 6 months ago

Robert Schroll wrote:

> That said, we should keep in mind that the CSS manipulation to solve this
would be significantly easier than a general nested-rule resolver, so it may
be better to just solve this bug at run-time and use a CSS preprocessor to do
the static CSS at compile time (if desired).

I have concerns about using a preprocessor even at compile time, as that means
future developers have to learn the preprocessor's syntax. Neither tool's
syntax looks all that bad, I suppose it wouldn't be such a burden.

> Would this library have its own WebView for doing all these manipulations?
Does that even work if it's never shown?

The reason my thoughts returned to this problem is that Charles' recent work
on HTML normalization (#6837) uses libxml2 to parse the HTML and build a
simple DOM. libxml2 is not the prettiest thing, but we've used it in Shotwell
and it does work as advertised. I also recall that the library can be compiled
with XPath, which I think would ease some of our DOM traversal woes.

My thought is that, just as the RFC822 module abstracts out GMime, we could
have a DOM module that abstracts out libxml2. We definitely want to wrap it,
incidentally, as libxml2's memory management is error-prone.

How well this would work with our current approach I don't know. But it
something I'm interested in exploring. That plus some CSS manipulation
routines might alleviate both issues you mention here.

####

#12

Updated by Robert Schroll 6 months ago

Jim Nelson wrote:

> I have concerns about using a preprocessor even at compile time, as that
means future developers have to learn the preprocessor's syntax. Neither
tool's syntax looks all that bad, I suppose it wouldn't be such a burden.

A nice thing about both LESS and SASS (in the SCSS syntax) is that they are
supersets of CSS. We can just start with CSS and only use the additional
features when they help. We could also purposefully restrict ourselves to a
subset of their syntax.

I'd be more worried about simply adding another dependency, especially one
outside of the GObject ecosystem. (This could be mitigated by including the
resultant CSS files in the git repository.) This is why I find building our
own sort of tempting, especially if we have to build half of it for this bug.

####

#13

Updated by Jim Nelson 6 months ago

Robert Schroll wrote:

> I'd be more worried about simply adding another dependency, especially one
outside of the GObject ecosystem. (This could be mitigated by including the
resultant CSS files in the git repository.) This is why I find building our
own sort of tempting, especially if we have to build half of it for this bug.

You probably understand when I say that I prefer not to include auto-generated
files into source code control. There are always exceptions to these kinds of
rules, though.

The dependency thing is ok if these tools are widely distributed in the Linux
world. If they're packaged for Debian and RPM, then that's probably good
enough for us.

I would be more inclined to handroll our own CSS preprocessor if there was
some stable tool out there that could parse CSS. I'm sure it's not difficult
to write our own, but said parser will become something we have to maintain.
I've written HTML parsers in the past and the problem is not in the HTML but
in writing a _forgiving_ HTML parser. Since this parser has to deal with CSS
"in the wild", it faces a similar problem. (If it was only parsing our own
CSS, I would have less issue here.)

Still, I do see the benefits of an HTML/CSS unit in Geary. I think it's
worthwhile to start considering what shape it would take and what features it
would offer.

####

#14

Updated by Jim Nelson 6 months ago

I've opened a ticket for this task: #6990. Let's move further discussion
there.

####

#15

Updated by Robert Schroll 6 months ago

I've posted code for a CSS de-nester in #6990. This would have the capability
of solving this bug, among other things. If we don't want the full de-nester
for other purposes, we can rip out the fancier parts and use the remainder
here.

####

#16

Updated by Jim Nelson 3 months ago

  * **Target version** changed from _0.4.0_ to _0.5.0_



--- Bug imported by chaz@yorba.org 2013-11-21 20:17 UTC  ---

This bug was previously known as _bug_ 5440 at http://redmine.yorba.org/show_bug.cgi?id=5440
Imported an attachment (id=260515)
Imported an attachment (id=260516)

Unknown milestone "unknown in product geary. 
   Setting to default milestone for this product, "---".
Setting qa contact to the default for this product.
   This bug either had no qa contact or an invalid one.
Resolution set on an open status.
   Dropping resolution 

Comment 1 Charles Lindsay 2014-03-17 18:26:10 UTC
*** Bug 726477 has been marked as a duplicate of this bug. ***
Comment 2 Charles Lindsay 2014-03-17 18:28:07 UTC
Bug #726477 I believe shows off another aspect of this same bug: styles aren't properly limited in scope.
Comment 3 Simon Lipp 2014-03-17 20:43:52 UTC
You’re right, that’s the same issue. Just to know, why the scoping solution wasn’t adopted ? Seems doable with a simple regexp.
Comment 4 Robert Schroll 2014-03-18 02:22:29 UTC
Both the shadow DOM and <style scoped> would be rather easy solutions here, but I can't find any evidence that either is available through WebKitGTK.  The last comment on this bug (https://bugs.webkit.org/show_bug.cgi?id=97655) suggests that scoped styles aren't currently implemented.  My testing suggests that createShadowRoot() is unavailable by default; I didn't see any likely looking switches to enable it.

If and when WebkitGTK gives us these, this would be a rather easy fix.  Until then, our options are manipulating or disabling any <style> elements in emails.
Comment 5 Simon Lipp 2014-03-20 10:22:16 UTC
Created attachment 272471 [details] [review]
patch

I made a fix using a regex-based sanitize_css function to manually scope a CSS stylesheet. Seems pretty resilent, since it works with weird stylsheets like

     for[class="\{\\\""], bar[class=\\\{]:after{content:'\}\\\';}
Comment 6 Jim Nelson 2014-03-20 19:03:25 UTC
Thanks, we'll review soon.
Comment 7 Jim Nelson 2014-08-29 23:14:02 UTC
Robert, this patch fell off my radar screen.  Can you review?  If it passes muster, please commit.
Comment 8 Robert Schroll 2014-09-01 21:45:56 UTC
There are a number of issues here, none of them fatal.  I'll start with the largest:

I recently learned there's a W3C interface to style sheets, which lets you go through the rules programmatically.  (See https://developer.mozilla.org/en-US/docs/Web/API/CSSStyleSheet and subsequent pages.)  I don't know if WebKitGTK implements these or not, but if it does, it would allow us to manipulate style sheets without needing our own parser.  This would be a great help in avoiding various edge cases.  I'll look into this further and let you know what I find.

This code does not handle @media declarations.  That's not so bad in and of itself, but the parser gets confused enough that additional rules can sneak through un-prefixed.  For example, these rules

@media (screen) {
    * {
        color: blue;
    }
}
div {
    background: yellow;
}

get turned into

html>body>#message_container>div[id="message_[34490/32745]"]>.email_container>.body @media (screen){
    * {
        color: blue;
    }
html>body>#message_container>div[id="message_[34490/32745]"]>.email_container>.body }
div{
    background: yellow;
}

and everything is bright yellow.  I guess the regexes aren't matching brackets?

It'd be nice if body (and html) rules got turned into just "<prefix>" rules, instead of "<prefix> body", so they still apply to the message body.

A question: Why did the web_view.container.insert_before() call get moved?

There are a few minor code style issues: trailing white spaces and no spaces between keywords and open parens.

All that said, we'd be no worse off with this, and in most cases it'd work just fine.  We could commit it and try to tidy things up later.  My instinct, though, is to see if this new CSS interface will work for us first, so I haven't committed it.
Comment 9 Robert Schroll 2014-09-02 02:09:41 UTC
Created attachment 285106 [details] [review]
Scope style rules in emails

The WebKitGTK bindings contain enough to let us read the rules, but not 
alter them.  We can make do with that, by re-writing the whole <style> 
element.  This also means that we can't descend into @media rules and 
the like, so we skip them completely.  Some types of rules, like 
@font-face, don't make sense to scope, so we let them through 
un-molested.
Comment 10 Jim Nelson 2014-09-03 00:52:04 UTC
> A question: Why did the web_view.container.insert_before() call get moved?

I have no idea.  I assume you mean in the WebKit API?

> There are a few minor code style issues: trailing white spaces and no spaces
> between keywords and open parens.

Yorba's policy is to clean up minor code guideline problems and commit with the submitted credited as author (i.e. git commit --author "name <email@example.com>")

Regarding your patch, do you have a ready test case for this issue?  Is your patch to be added on top of Simon's?
Comment 11 Robert Schroll 2014-09-03 01:26:46 UTC
(In reply to comment #10)
> > A question: Why did the web_view.container.insert_before() call get moved?
> 
> I have no idea.  I assume you mean in the WebKit API?

Sorry, I meant in Simon's patch.  It was more curiosity on my part, but DOM operations have been rather sensitive to order in the past.

> Regarding your patch, do you have a ready test case for this issue?  Is your
> patch to be added on top of Simon's?

My patch would be in place of Simon's.  I think using the CSS object model is the way to go.  But the missing API means that I've taken it as far as it can go.  If we want to handle @media rules, for example, we're going to need to do our own parsing, in which case Simon's approach may be the way to go.  (I don't really see a need for this, though.)
Comment 12 Robert Schroll 2014-09-03 01:32:05 UTC
Created attachment 285213 [details]
Email body

You can use this as an email body as a test case.  Ideally, the email would have a red background with blue text in a serif font.  My patch gets all of this except the blue (which is set in a @media rule).  Simon's patch only gets the font-family and incorrectly gives everything a yellow background.
Comment 13 Jim Nelson 2014-09-09 20:12:11 UTC
I agree, using the CSS API looks to me a more contained approach.  I have one code request: please add a comment to scope_style_sheet() that explains what's going on there, so the next person isn't baffled.

Commit!
Comment 14 Robert Schroll 2014-09-10 02:05:38 UTC
Comment on attachment 285106 [details] [review]
Scope style rules in emails

Attachment 285106 [details] pushed as 6454860 - Scope style rules in emails
Comment 15 Robert Schroll 2014-09-10 02:20:45 UTC
Now that this is in place, we may want to revisit bug #714452.  If we allowed entire HTML document in, style sheets would no longer be a problem.  This is related to bug #713547 and bug #730680.