After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 510779 - Improve Evolution Spam checking
Improve Evolution Spam checking
Status: RESOLVED FIXED
Product: evolution
Classification: Applications
Component: Mailer
2.22.x (obsolete)
Other Linux
: Normal normal
: ---
Assigned To: evolution-mail-maintainers
Evolution QA team
Depends on:
Blocks:
 
 
Reported: 2008-01-20 12:34 UTC by Srinivasa Ragavan
Modified: 2013-09-13 00:58 UTC
See Also:
GNOME target: ---
GNOME version: 2.21/2.22


Attachments
EDS part of the patch (8.46 KB, patch)
2008-01-20 12:35 UTC, Srinivasa Ragavan
none Details | Review
Evolution portion of patch. (31.67 KB, patch)
2008-01-20 12:35 UTC, Srinivasa Ragavan
none Details | Review
EDS portion of patch (8.94 KB, patch)
2008-01-21 11:15 UTC, Srinivasa Ragavan
committed Details | Review
Evolution portion of patch. (31.94 KB, patch)
2008-01-21 11:17 UTC, Srinivasa Ragavan
committed Details | Review

Description Srinivasa Ragavan 2008-01-20 12:34:46 UTC
I have made a patch for speeding up spam filtering.

1. Evolution now has a way to specify spam headers and values to look out for in the mail. If this is specified, Evolution won't download the entire message to check for spam. This should speed up spam checking a lot.

2. Now we have a whilelist implementation. Evolution can lookup sender in addressbook and if the sender is found, it is considered as HAM. This check is conducted after the custom junk header test. (since it is possible that I get spam mails from my address. The custom header can help to skip filter this better). But one thing is that, it can be slow, if remote addressbooks are marked for auto completion. (The same addressbooks are used).

3. Only if we can't decide by steps 1 and 2, we do the normal junk filtering.

It changes lots of code at this point, so I'm posting for a review here.
Comment 1 Srinivasa Ragavan 2008-01-20 12:35:21 UTC
Created attachment 103253 [details] [review]
EDS part of the patch
Comment 2 Srinivasa Ragavan 2008-01-20 12:35:51 UTC
Created attachment 103254 [details] [review]
Evolution portion of patch.
Comment 3 Srinivasa Ragavan 2008-01-20 12:38:51 UTC
Milan, sorry to bother you much. I have more confidence on your review. Can you do it for me?
Comment 4 Srinivasa Ragavan 2008-01-21 08:32:40 UTC
Hmm, I did a self review. I have to free the headers at the message_info_free. 
Comment 5 Srinivasa Ragavan 2008-01-21 09:03:42 UTC
Another thing, 
em-mailer-prefs.c: 
	toggle_button_init (prefs, prefs->junk_book_lookup, FALSE,
			    "/apps/evolution/mail/junk/lookup_addressbook",
-			    G_CALLBACK (toggle_button_init));
+			    G_CALLBACK (toggle_button_toggled));

Should be another thing,
Comment 6 Srinivasa Ragavan 2008-01-21 11:15:57 UTC
Created attachment 103322 [details] [review]
EDS portion of patch
Comment 7 Srinivasa Ragavan 2008-01-21 11:17:20 UTC
Created attachment 103323 [details] [review]
Evolution portion of patch.
Comment 8 Milan Crha 2008-01-21 12:39:39 UTC
a) in eds part, this chunk "@@ -2699,12 +2740,12 @@ imap_update_summary (CamelFolder *folder" can be removed. 

b) in camel_session_finalise, you should check for non-NULL session->priv->junk_headers before freeing it (as you do few chunks below)

c) you do not want to have camel_session_set_junk_headers prototype in camel-session.c, because it's already in the camel-session.h

d) I also noticed some minor coding style issues like missing space between function name and opening bracket, too many empty lines and one other thing, but nothing serious.

e) mail_session_set_junk_headers, why not have its params as 'const char **'? It would be better.

f) construct_junk_headers I guess the 'else bs = NULL;' should be placed to 'if (es)', am I right?

g) in CamelMessageInfo, you should call it 'junk_headers' or something similar, not 'headers' only, because one can think it contains all headers, which is not true.

There is one thing I am not sure if I understand correctly, is it right that this thing works for IMAP only? And how do you force to download those user specified junk headers from the server? I didn't notice the code for it, somehow.

Another question, if I got it right, user will be able to enter special mail headers to check the spam, like 'X-Spam-Status' I can see on one of my mails. Then  user can also specify for which value to look, or more precisely, for which subvalue (string contained in the header value, case insensitive). For IMAP, all the headers specified in the UI are placed to the new messageinfo's property.

Anyway, I just briefly read the code, I didn't test it. It would be also better if someone using junk plugin can review this too.
Comment 9 Srinivasa Ragavan 2008-01-21 12:58:40 UTC
Fejj, any thoughts on this approach/patch would be nice.

Comment 10 Jeffrey Stedfast 2008-01-21 16:11:03 UTC
I think this kind of replicates existing filters somewhat. In essence, all this code is doing is performing a [Specific Header][X-Spam-Status][contains][blah] filter and then [Mark as Junk] as the action, right?

(well, assuming we have a [Mark as Junk] action, which, if we don't, might be a better alternative to solve this problem?)

The other bit about checking the addresses I know currently doesn't have a filter equivalent, but I know it's been on the TODO list for a long time and it looks like you've basically done the needed work for that, so it could be added as a filter :)
Comment 11 Srinivasa Ragavan 2008-01-21 18:40:29 UTC
Fejj,

I was trying to speed up the spam checking. If you look at the specific-header check, it downloads the mail and then it checks for the header. What I have done is that I have preserved some headers that we download into the message info, which I use at the junk checking. I think I can extend this to speed up [specific header] also. In case if it isn't there, then may make sense to download the mail and see/perform filter it.

Nice, I think, I can add the lookup address as a filter also. 
Comment 12 Srinivasa Ragavan 2008-01-24 03:04:31 UTC
Committed in trunk with slight modifications.
Comment 13 Jeffrey Stedfast 2008-07-07 21:17:09 UTC
probably shouldn't be using struct _camel_header_param for this...

camel_header_raw is what is used for headers, camel_header_param is for header parameters (as in the content-type/disposition parameters)
Comment 14 Srinivasa Ragavan 2008-07-07 21:27:30 UTC
Ah I see, I will do the necessary changes, thanks Fejj.