Bug 391472 - Add ability to match headers by words
Status: RESOLVED FIXED
Product: evolution
Classification: Applications
Component: Mailer
Version: 2.10.x (obsolete)
Hardware: Other Linux
Importance: Low enhancement
Target Milestone: Future
Assigned To: evolution-mail-maintainers
QA Contact: Evolution QA team
evolution[filters]
Depends on:
Blocks:
 
 
Reported: 2007-01-01 04:49 UTC by Matthew Barnes
Modified: 2012-02-07 17:44 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
eds patch (6.42 KB, patch)
2012-02-07 17:37 UTC, Milan Crha
committed
evo patch (4.11 KB, patch)
2012-02-07 17:39 UTC, Milan Crha
committed

Description Matthew Barnes 2007-01-01 04:49:33 UTC
Forwarding this from a downstream bug report:
http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=135918

I was able to reproduce the filtering behavior using Evolution 2.9.4.


Description of problem: I have a filter that checks whether the subject
contains "pto" and then moves the message to a folder. I would expect
the filter to look for the word "pto". Instead, the filter matches on
the substring "pto", so while subject lines such as "PTO today",
"Taking pto" and "PTO" trigger the filter, so does any email with
"laptop" in the subject, which is not what I intended. The other option
would be to use a "subject is" filter, which would catch "pto" but not
"pto today".

The help documentation does not make clear what "contains" actually
means for the rule.

Version-Release number of selected component (if applicable):
evolution 2.0.2

How reproducible:
Always

Steps to Reproduce:
1. set up a subject filter with a contains line (pto)
2. send mail containing the word "laptop" in the subject.
3.
  
Actual results:
laptop is filtered.

Expected results:
laptop should not be filtered.

Additional info:

There are a number of ways to approach this. One possible way would be
to split contains into 2 selections: contains word, and contains
substring. Or you could make contains search on word and add a
substring selection for filter. It seems as if the code is making any
substring match trigger the filter. It should probably parse the field
 with a space token and then try to exact match the filter text to the
subject line pieces and not match on substring.
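
To make the difference concrete, here is a minimal sketch in C of the two behaviors described above. It is not Evolution's actual code: `subject_contains()` and `subject_has_token()` are hypothetical names, the comparison ignores ASCII case only (assumed because "PTO today" triggers a "pto" filter), and tokens are split on whitespace exactly as proposed.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compare the first n bytes of a and b, ignoring ASCII case.
 * Stops at the first mismatch, so it never reads past a terminator. */
static bool equal_nocase_n(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return false;
        if (a[i] == '\0')
            return true; /* both strings ended at the same position */
    }
    return true;
}

/* Hypothetical helper: substring match, roughly what "contains"
 * appears to do in the report. */
bool subject_contains(const char *subject, const char *needle)
{
    size_t n = strlen(needle);

    if (n == 0)
        return true;
    for (const char *p = subject; *p != '\0'; p++) {
        if (equal_nocase_n(p, needle, n))
            return true;
    }
    return false;
}

/* Hypothetical helper: split the subject on whitespace and compare
 * each token to the filter text as a whole. */
bool subject_has_token(const char *subject, const char *word)
{
    size_t n = strlen(word);
    const char *p = subject;

    if (n == 0)
        return false;
    while (*p != '\0') {
        /* Skip the whitespace before the next token. */
        while (*p != '\0' && isspace((unsigned char)*p))
            p++;
        const char *start = p;
        /* Advance to the end of the token. */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if ((size_t)(p - start) == n && equal_nocase_n(start, word, n))
            return true;
    }
    return false;
}
```

With these helpers, `subject_contains("My laptop broke", "pto")` is true while `subject_has_token("My laptop broke", "pto")` is false, and `subject_has_token("PTO today", "pto")` is still true. Because tokens are compared literally, any punctuation attached to a word stays part of the token.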

Workaround is to add leading and trailing whitespace around the filter
text (" pto "), which one could argue is not particularly good behavior
either.

Anyhow filter rules in general should be better documented.
Comment 1 Matthew Barnes 2007-04-30 17:50:14 UTC
(In reply to comment #0)
> There are a number of ways to approach this. One possible way would be
> to split contains into 2 selections: contains word, and contains
> substring. Or you could make contains search on word and add a
> substring selection for filter. It seems as if the code is making any
> substring match trigger the filter. It should probably parse the field
> with a space token and then try to exact match the filter text to the
> subject line pieces and not match on substring.

Splitting on whitespace and then trying to do an exact match won't work in cases where there are punctuation characters adjacent to the word (e.g. "laptop,").  But maybe replacing all punctuation characters with spaces (using ispunct()) and THEN splitting on whitespace would get us closer to a reasonable behavior.  Would that work for all locales though?
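
A rough sketch of that idea in C, under stated assumptions: `header_has_word()` is a hypothetical name, not an existing Camel function, and the code works byte by byte, so `ispunct()` and `tolower()` only behave sensibly for ASCII text in the "C" locale. That limitation is exactly the locale question raised above; multi-byte UTF-8 headers would need different handling.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Compare a lowercased token with a word, ignoring the word's ASCII case. */
static bool token_matches(const char *token, const char *word)
{
    while (*token != '\0' && *word != '\0') {
        if (*token != (char)tolower((unsigned char)*word))
            return false;
        token++;
        word++;
    }
    return *token == '\0' && *word == '\0';
}

/* Hypothetical helper: replace punctuation with spaces, split on
 * whitespace, then compare every piece to the requested word. */
bool header_has_word(const char *header, const char *word)
{
    static const char separators[] = " \t\r\n";
    size_t len = strlen(header);
    bool found = false;
    char *copy;

    if (*word == '\0')
        return false;

    copy = malloc(len + 1);
    if (copy == NULL)
        return false;

    /* Normalize: punctuation becomes a space, letters become lowercase.
     * The loop also copies the terminating NUL byte. */
    for (size_t i = 0; i <= len; i++) {
        unsigned char c = (unsigned char)header[i];
        copy[i] = ispunct(c) ? ' ' : (char)tolower(c);
    }

    /* strtok() modifies its input and is not thread-safe; that is
     * acceptable for a sketch operating on a private copy. */
    for (char *tok = strtok(copy, separators); tok != NULL;
         tok = strtok(NULL, separators)) {
        if (token_matches(tok, word)) {
            found = true;
            break;
        }
    }

    free(copy);
    return found;
}
```

With this, `header_has_word("New laptop, please", "laptop")` is true even though a comma follows the word, while `header_has_word("New laptop, please", "pto")` is false.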

I think I'd prefer to avoid presenting the user with the somewhat technical term "substring".  My suggestion would be

   has word
   does not have word
   contains
   does not contain

with "has word" listed first since that seems like the most common case.  Would that sufficiently disambiguate the word "contains"?

 
> Anyhow filter rules in general should be better documented.

Definitely agree.
Comment 2 Jeffrey Stedfast 2007-04-30 18:08:27 UTC
how would you implement this for IMAP? or any other backend where the messages are stored remotely?

you're talking about adding complex string matching which would probably have to work like pango's word boundary logic.
Comment 3 Jeffrey Stedfast 2007-04-30 18:34:40 UTC
ignore me as far as IMAP goes, I was thinking this was body word matching (which, for local mail would be "easy" since that's how ibex indexes, but a pita for IMAP where it is only able to do substring matching).

subject word matching should be doable as those strings are all locally cached, could just add a new method for "word match" which could be implemented as suggested above quite easily (bonus points for using pango word breaking logic, but probably not required)
Comment 4 Matthew Barnes 2008-03-11 00:33:00 UTC
Bumping version to a stable release.
Comment 5 Milan Crha 2012-02-07 17:37:37 UTC
Created attachment 207004
eds patch

for evolution-data-server;

The eds part, defining "header-has-words". I'm not sure whether using ispunct() goes too far: for example, "desktop-devel-list" yields 3 words, which might not always be expected, and "3:30 pm" likewise yields 3 words. But who knows, let's see what users think.
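
The word counts mentioned above can be reproduced with a small C sketch. This is not the code from the attached patch: `count_header_words()` is a hypothetical helper that simply treats every `isspace()` or `ispunct()` byte as a separator, assuming ASCII input in the "C" locale.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the words in a header value when both
 * whitespace and punctuation act as word separators. */
size_t count_header_words(const char *text)
{
    size_t words = 0;
    int in_word = 0;

    for (; *text != '\0'; text++) {
        unsigned char c = (unsigned char)*text;

        if (isspace(c) || ispunct(c)) {
            in_word = 0;       /* a separator ends the current word */
        } else if (!in_word) {
            in_word = 1;       /* first byte of a new word */
            words++;
        }
    }
    return words;
}
```

Under these assumptions, `count_header_words("desktop-devel-list")` returns 3 ("desktop", "devel", "list") and `count_header_words("3:30 pm")` also returns 3 ("3", "30", "pm").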
Comment 6 Milan Crha 2012-02-07 17:39:30 UTC
Created attachment 207005
evo patch

for evolution;

Using the new "header-has-words" for filtering.
Comment 7 Milan Crha 2012-02-07 17:44:05 UTC
Created commit e3da65f in eds master (3.3.90+)
Created commit 569cdde in evo master (3.3.90+)