After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 743898 - HTML signatures break delimited links in text signatures
HTML signatures break delimited links in text signatures
Status: RESOLVED FIXED
Product: geary
Classification: Other
Component: composer
master
Other Linux
: Normal normal
: ---
Assigned To: Geary Maintainers
Geary Maintainers
review
Depends on:
Blocks:
 
 
Reported: 2015-02-02 23:35 UTC by Michael Gratton
Modified: 2015-02-04 05:24 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Stricter regex for finding HTML tags (965 bytes, patch)
2015-02-03 05:22 UTC, Robert Schroll
committed Details | Review

Description Michael Gratton 2015-02-02 23:35:50 UTC
The recent support for HTML that landed (Bug 738895) broke a common convention in text emails - putting links in angle brackets.

My signature looks something like this:

    -- 
    Michael Gratton <http://my.web.site>
    Blah blah.

When it is now inserted into new messages, it looks like:

    -- 
    Michael Gratton 
    Blah blah.

Use of angle brackets is noted as being common in RFC 3986, Appendix C, as well quotes and whitespace.
Comment 1 Robert Schroll 2015-02-03 00:52:16 UTC
When I try that signature, it works just fine.  If you could post your actual signature, or confirm that this signature causes a problem for you, that would be helpful.  Also, please let us know if you're entering the signature in the Accounts dialog or using a ~/.signature file.

In advance of that data, let me hazard a guess: Does your URL in the angled brackets end with a forward slash?
Comment 2 Michael Gratton 2015-02-03 02:26:28 UTC
This is set in the accounts dialog, but yep, it has a trailing slash. So more like:

    -- 
    Michael Gratton <http://my.web.site/>
Comment 3 Robert Schroll 2015-02-03 03:23:37 UTC
The problem is that <http://my.web.site/> looks like a self-closing HTML tag, at least to our regex for detecting such things.  Therefore, Geary switches into HTML mode and you get an HTML tag, not a URL.  The simple solution is to skip the trailing slash. :)

A few changes we could consider for the regex:
1) Require a space before the "/>" to count as a self-closing tag.  Put the space there is consider good practice, for reasons I've never understood, so most people will probably do it.  But it might confuse people that don't use it.
2) Look for "://" and don't count anything that has that as an HTML tag.  But this won't help if someone does "<my.web.site/>".
3) Check that the tag itself consists only of letters followed by a space (or "/>").
4) Only look for pairs of open/close tags when guessing whether a string is HTML or not.  But this means that a signature containing only an img element, for example, would be detected to be plain text.

I think (3) is the most correct, but (1) would be very simple and will probably work in 99% of the cases.  Jim, your thoughts?
Comment 4 Jim Nelson 2015-02-03 03:28:40 UTC
Let's go with (3).

Is this code particular to the HTML signature or in a utility method other code uses?  I ask to understand the scope affected.  It *seems* like this is ok as a general solution, but it definitely seems like a sig-only issue and would like to contain the change to just that.
Comment 5 Robert Schroll 2015-02-03 05:22:57 UTC
Created attachment 295994 [details] [review]
Stricter regex for finding HTML tags

I was worried that (3) would really complicate the regex, but it 
doesn't.  As I was looking at this, I noticed the regex for a tag pair 
would erroneously accept "<ab>c</a>", so I fixed that as well.

This code is in a utility file in the engine.  It's currently called 
from the accounts dialog and from the composer.  Perhaps it should move 
into the client utilies?
Comment 6 Jim Nelson 2015-02-03 21:22:28 UTC
Review of attachment 295994 [details] [review]:

Looks good.  Yes, please move to the client util directory and commit.
Comment 7 Robert Schroll 2015-02-03 22:40:21 UTC
Attachment 295994 [details] pushed as 3d14719 - Stricter regex for finding HTML tags
Comment 8 Michael Gratton 2015-02-04 05:24:52 UTC
That fixed the issue I was seeing, thanks.