After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 362638 - EThread is obsolete
EThread is obsolete
Status: RESOLVED FIXED
Product: evolution
Classification: Applications
Component: Mailer
2.10.x (obsolete)
Other Linux
: High normal
: ---
Assigned To: Veerapuram Varadhan
Evolution QA team
evolution[codecleanup]
Depends on:
Blocks:
 
 
Reported: 2006-10-16 16:07 UTC by Matthew Barnes
Modified: 2007-12-20 18:02 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Proposed patch for Evolution (56.10 KB, patch)
2006-10-16 17:14 UTC, Matthew Barnes
none Details | Review
Proposed patch for Evolution-Data-Server (16.61 KB, patch)
2006-10-16 17:17 UTC, Matthew Barnes
none Details | Review
Amended patch for Evolution (71.14 KB, patch)
2006-10-26 19:46 UTC, Matthew Barnes
none Details | Review
Revised patch for evolution-data-server (4.24 KB, patch)
2006-12-05 18:19 UTC, Matthew Barnes
committed Details | Review
Revised patch for Evolution (64.13 KB, patch)
2007-02-15 23:34 UTC, Matthew Barnes
none Details | Review
Corrected patch for Evolution (65.60 KB, patch)
2007-03-27 23:52 UTC, Matthew Barnes
none Details | Review
Revised patch for Evolution (79.41 KB, patch)
2007-03-30 19:35 UTC, Matthew Barnes
none Details | Review
Yet another revision for Evolution (82.47 KB, patch)
2007-04-11 03:39 UTC, Matthew Barnes
none Details | Review
Revised patch for Evolution (209.82 KB, patch)
2007-05-17 19:41 UTC, Matthew Barnes
none Details | Review
Revised patch for Evolution (209.84 KB, patch)
2007-06-06 16:51 UTC, Matthew Barnes
none Details | Review
Revised patch for Evolution - sync'ed with latest Subversion (210.96 KB, patch)
2007-08-15 19:41 UTC, Matthew Barnes
none Details | Review
Revised patch fixes bug #357175 (210.92 KB, patch)
2007-08-30 02:34 UTC, Matthew Barnes
none Details | Review
Revised patch for Evolution 2.21.1 (210.18 KB, patch)
2007-10-29 18:00 UTC, Matthew Barnes
none Details | Review
Revised patch for Evolution 2.21.4 (211.66 KB, patch)
2007-12-17 17:24 UTC, Matthew Barnes
none Details | Review
Same patch, sans the em-filter-i18n.h hunk (210.66 KB, patch)
2007-12-18 15:54 UTC, Matthew Barnes
committed Details | Review

Description Matthew Barnes 2006-10-16 16:07:19 UTC
EThread is a data structure in libedataserver that provides various means of dispatching messages asynchronously, using a single thread or a pool of threads.  It was written before GLib provided GThreadPool [1], and based on my analysis of EThread usage in Evolution and Evolution-Data-Server, I think it's time to remove or at least deprecate EThread in favor of GThreadPool.

EThread provides three modes of operation, but Evolution only uses two:

   E_THREAD_QUEUE   A single thread processes messages as they arrive on an
                    asynchronous queue.  This guarantees that messages are
                    processed in the order they arrive.  Evolution never
                    limits the size of the queue.

   E_THREAD_NEW     Multiple threads process messages as they arrive on an
                    asynchronous queue.  Messages spend less time waiting on
                    the queue, but processing is unordered.

Evolution uses a single EThread in E_THREAD_NEW mode in mail-mt.c, where it allocates 10 threads to process mail messages.  All other uses of EThread are in E_THREAD_QUEUE mode with unlimited queues.

GThreadPool can easily cover these use cases by setting its max_threads limit to 1 to simulate E_THREAD_QUEUE behavior, and some N > 1 to simulate E_THREAD_NEW behavior.  GThreadPool's callback function is equivalent to EThread's "received" callback, except the function will need to issue a reply or destroy the message itself as appropriate.

[1] GThreadPool has been available since (at least) GLib 2.0.
    http://developer.gnome.org/doc/API/2.0/glib/glib-Thread-Pools.html
Comment 1 Matthew Barnes 2006-10-16 17:14:31 UTC
Created attachment 74829 [details] [review]
Proposed patch for Evolution

In this patch, I use a simple pattern for creating thread pools:

Each thread pool has its own "push message" function.  In each of those functions I use a static GOnce to create a private GThreadPool on-demand, and then just let the thread(s) run forever.  They're all session-long threads so I didn't see much point in explicitly shutting them down.
Comment 2 Matthew Barnes 2006-10-16 17:17:16 UTC
Created attachment 74830 [details] [review]
Proposed patch for Evolution-Data-Server

This patch modifies CamelSession, and also removes EThread from libedataserver.  The latter change can be snipped from the patch if we decide to just deprecate EThread.
Comment 3 Matthew Barnes 2006-10-16 17:30:25 UTC
Oops, a typo slipped into attachment #74829 [details] (Evolution).

"mail_msg_unordered_t" should read "mail_msg_unordered_push" in the following patch hunk:

Index: mail/mail-ops.c
===================================================================
RCS file: /cvs/gnome/evolution/mail/mail-ops.c,v
retrieving revision 1.459
diff -u -p -r1.459 mail-ops.c
--- mail/mail-ops.c	16 Oct 2006 14:25:28 -0000	1.459
+++ mail/mail-ops.c	16 Oct 2006 16:37:40 -0000
@@ -426,7 +426,7 @@ mail_fetch_mail (const char *source, int
 	if (status)
 		camel_filter_driver_set_status_func (fm->driver, status, status_data);
 	
-	e_thread_put (mail_thread_new, (EMsg *)m);
+	mail_msg_unordered_t ((mail_msg_t *) m);
 }
 
 /* ********************************************************************** */
Comment 4 Veerapuram Varadhan 2006-10-18 13:08:49 UTC
Awesome work,  Mathew.  In my original tasklist, I had similar TODO wherein, I planned to replace EThread with an implementation that uses GThreadPool and implements a priority queue.

How about adding a priorty-based-queue handling for messages of type E_THREAD_QUEUE?  This ensures that operations are processed in expected order, even if the order is not same in the queue.  This closes couple of loop-holes that result in a race condition.

Comment 5 Matthew Barnes 2006-10-18 16:25:38 UTC
That's a great idea.  Would it be sufficient to just use a simple high-priority boolean flag or do we need something more sophisticated?

This should be easy to implement.  Just embed the priority directly in the base message structure (CamelSessionThreadMsg, mail_msg_t, etc.), and then provide a suitable sort function for each GThreadPool (by way of g_thread_pool_set_sort_function).

Also, I'm curious.  What are some examples of loop-holes that cause a race condition?
Comment 6 Matthew Barnes 2006-10-26 19:46:31 UTC
Created attachment 75469 [details] [review]
Amended patch for Evolution

This patch adds a few enhancements to the previous one:

1) Adds a 'priority' field to mail_msg_t (type gint).  Messages default to
   priority 0, and the mail queues will prioritize them numerically.  Just
   set the 'priority' field appropriately before pushing the message.

   Note that I'm not actually setting priorities anywhere.  I've just
   implemented the mechanism for Varadhan so he can fix the race conditions
   he mentioned.

2) I wanted to get away from polling a FIFO pipe for internal messages and
   trying to keep it in sync with an asynchronous queue, which seems overly
   complex and brittle.  So I've replaced that mechanism with a function
   that gets called periodically from the main loop.  The function checks
   for and processes any messages that are waiting on what used to be the
   "GUI" and "reply" EMsgPorts, but are now just asynchronous queues.

   The periodic function gets called at 10 Hz, which seems to yield pretty
   good reponsiveness in the GUI without much overhead.  You can tune this
   by changing PERIODIC_RATE_HZ in mail-mt.c.

   I've also added a new function for pushing messages onto the "GUI"
   (or main) thread queue:

       void mail_msg_main_thread_push (mail_msg_t *msg)

3) It looks like the only reason mail_gui_thread was exposed was so that
   it could be compared to the current thread by calling pthread_equal().
   I've provided a more convenient way of doing that:

       gboolean mail_in_main_thread (void)
Comment 7 Veerapuram Varadhan 2006-10-31 13:38:52 UTC
Thanks for the patch, Mathew.  Will update the bug with my observations by this week.
Comment 8 Matthew Barnes 2006-12-05 18:19:32 UTC
Created attachment 77744 [details] [review]
Revised patch for evolution-data-server

In this revision, instead of removing the EThread code I tagged it as
deprecated by wrapping the relevant declarations in a preprocessor conditional
like so:

   #ifndef EDS_DISABLE_DEPRECATED

   /* ... EMutex declarations ... */

   #endif

This is similar to what other GNOME libraries do for deprecated symbols.
Comment 9 Matthew Barnes 2006-12-05 18:20:58 UTC
Varadhan: Ping... Ever get a chance to look at this again?
Comment 10 Veerapuram Varadhan 2007-01-08 16:14:01 UTC
(In reply to comment #9)
> Varadhan: Ping... Ever get a chance to look at this again?
> 
I have committed the EDS part of the fix to HEAD.  Will look into Evo part soon.  Thanks for the patches. :)
Comment 11 Matthew Barnes 2007-02-15 23:34:52 UTC
Created attachment 82646 [details] [review]
Revised patch for Evolution

A downstream bug [1] revealed a flaw in my Evolution patch.  The bug was that choosing either response on the dialog that asks the user whether to accept a certificate caused Evolution to crash.

Here's what the problem was and how I fixed it:

The periodic_processing() function introduced by the patch was intended to replace both GUI EMsgPorts (mail_gui_port and mail_gui_port2), but I overlooked the fact that they have different semantics.  Sending a message to either GUI port triggered the message's receive_msg() callback (if it exists).  But then mail_gui_port tried to trigger a reply via either the message's reply port or its reply callback, whereas mail_gui_port2 left it to the receive_msg() callback to issue the reply.  The periodic_process() function mimicked the mail_gui_port behavior, which caused functions that relied on mail_gui_port2's behavior to malfunction.

alert_user() in mail-session.c is the only function that used mail_gui_port2.  This function, along with its callback function do_user_message(), displays a dialog window to the user and registers another callback function -- user_message_response() -- to handle the user's response.

user_message_response() issues a reply over the message's reply port, which is picked up by the waiting alert_user() function.  This chain of events requires mail_gui_port2's behavior of not issuing a reply message itself.

The only other function that used either mail_gui_port or mail_gui_port2, AND submitted messages with a reply port defined is mail_call_main().  It, and its callback function do_call(), expected the mail_gui_port to automatically issue reply messages.  But one can easily make this compatible with mail_gui_port2's behavior, but having do_call() issue its own reply messages.

That, along with rewriting the periodic_processing() logic to not issue reply messages (similar to mail_gui_port2's behavior), was the solution to the bug.

So only do_call() and periodic_processing() were changed in this revision.

[1] http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=220714
Comment 12 Matthew Barnes 2007-02-23 19:11:57 UTC
I verified the patch in comment #11 applies cleanly to Subversion trunk and that Evolution successfully compiles with EDS_DISABLE_DEPRECATED defined.

Still waiting for a review of the Evolution work.
Comment 13 Matthew Barnes 2007-03-27 23:52:51 UTC
Created attachment 85410 [details] [review]
Corrected patch for Evolution

It seems some changes to mail/em-camel-stream.c somehow got lost in my previous patch.  The patch replaced an "if" expression in a few places:

   -	if (pthread_self() == mail_gui_thread)
   +	if (mail_in_main_thread ())

This patch restores the mail/em-camel-stream.c changes.
Comment 14 Matthew Barnes 2007-03-28 00:22:18 UTC
Okay, so apparently em-camel-stream.c is not distributed.


    2003-09-03  Not Zed  <NotZed@Ximian.com>

            * em-camel-stream.[ch]: removed.


Which means that any subsequent maintenance on these files was wasted time and effort.  These and any other source files that we don't distribute should be deleted.  The code we _do_ distribute is big and complex enough as it is without having these dead weeds littering the repository.
Comment 15 Matthew Barnes 2007-03-30 19:35:44 UTC
Created attachment 85582 [details] [review]
Revised patch for Evolution

At the risk of letting the scope of this bug creep a little, this revision makes the following additional changes:

  - Use an EFlag [1] instead of an EMsgPort when synchronizing mail
    messages between threads.  This alters the definition of mail_msg_t
    slightly in that it no longer "inherits" from EMsg.

  - Refactor and simplify EMSyncStream.  Use an EFlag [1] and an idle
    callback to synchronize message passing, instead of two EMsgPorts
    and a GIOChannel.  This makes the threading logic a lot easier to
    understand.


Varadhan: Ping, can you review this please?


[1] EFlag was proposed in bug #415891.  It's more light-weight than an
    EMsgPort and makes thread synchronization more intuitive.
Comment 16 Matthew Barnes 2007-04-11 03:39:22 UTC
Created attachment 86149 [details] [review]
Yet another revision for Evolution

Ray Strode flushed out another bug (hopefully the last) when his IMAP account approached quota and the server started issuing warning messages [1].  The bug was hiding in a rarely used code path in that pesky do_user_message() function.  This revision corrects the bug.

[1] http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=235096
Comment 17 Matthew Barnes 2007-04-11 04:01:41 UTC
Changing the product since only the Evolution patch remains.
Comment 18 Philip Van Hoof 2007-05-13 11:46:14 UTC
FYI: Attachment #77744 [details] is in Tinymail's camel-lite (verified)
Comment 19 Matthew Barnes 2007-05-17 19:41:11 UTC
Created attachment 88358 [details] [review]
Revised patch for Evolution

This will hopefully be the final revision of the Evolution patch for this bug.

There's been a lot of revisions here so let me quickly recap what all I've done until now, then I'll review what this revision adds.

  - The primary goal is to eliminate the use of EThread, which is now
    deprecated in libedataserver.  GLib provides GThreadPool, which is a
    suitable replacement for EThread.

  - The various message queues are now priority-queues (as requested in
    comment #4), though the patch does not utilize this capability.

  - As the patch evolved, a secondary goal emerged which was to substitute
    all remaining uses of EMsgPort in Evolution with the new EFlag structure
    in libedataserver (bug #415891).  This secondary goal led to some
    refactoring work in EMSyncStream to eliminate the last EMsgPort there.
    See comment #15.

  - Fixed a couple bugs reported by Fedora users along the way.  See
    comment #11 and comment #16.

This revision adds the following:

  - All of the reported bugs in this patch have been related to message
    ownership and trying to figure out who should free the message and when.
    It occurred to me that the solution to these kind of ownership problems
    is to use reference counting, so I added a reference count to each
    message and modified the API accordingly.  Now, instead of calling
    mail_msg_free() we call mail_msg_unref().  This simplifies the parts of
    the logic that were causing all the bugs.

  - Add an EFlag pointer only to the message types that actually need one.
    Previously I had added an EFlag pointer to the base mail message struct.
    This eliminates a parameter in mail_msg_new().

  - Store the message struct size in the same structure as the message
    callback function pointers.  This eliminates another parameter in
    mail_msg_new().

  - Define type names for each of the mail message callback functions
    (e.g. MailMsgExecFunc, MailMsgFreeFunc, etc).  This eliminates a lot of
    needless type casting in the code and makes things a bit more readable.

  - Use idle callbacks instead of the 10 Hz timers I added in comment #6.
    This should satisfy bug #438817, though I'm not using Linux PowerTOP's
    Evolution patch directly.

  - Refactor the mail_cancel_hook_*() functions to use a GHookList instead
    of an EDList.  To be honest, this change has nothing to do with EThread
    but it was adjacent to code I was already hacking on and I gave in to
    the temptation to tack it on.  We can eliminate this part or break it
    into a separate bug if desired, but it's pretty straight-forward.

I'd really like to get this into 2.11.3 if possible so that it gets plenty of testing before 2.12 ships.
Comment 20 André Klapper 2007-05-17 22:07:36 UTC
2.11.3 sounds good, the sooner, the better.
Comment 21 Matthew Barnes 2007-06-06 16:51:08 UTC
Created attachment 89488 [details] [review]
Revised patch for Evolution

A small reference counting error in mail-mt.c:mail_call_main() was causing a bug reported downstream [1] where Evolution would hang on exit if the Send/Receive button was clicked during the session (probably by other triggers too).

This revision fixes the problem.

[1] http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=240507
Comment 22 Matthew Barnes 2007-08-15 19:41:20 UTC
Created attachment 93752 [details] [review]
Revised patch for Evolution - sync'ed with latest Subversion
Comment 23 Matthew Barnes 2007-08-30 02:34:59 UTC
Created attachment 94588 [details] [review]
Revised patch fixes bug #357175

This patch modifies the alert_user() and user_message_exec() functions in mail-session.c to fix bug #357175.

I think I finally have the reference counting correct for these two very tricky functions.

I also discovered that the non-cancellable dialogs were not calling user_message_response(), and so many of the queued dialogs were never getting a chance to run.  That was why Evolution was hanging on exit for many users.
Comment 24 Matthew Barnes 2007-09-18 14:29:32 UTC
varadhan: ping

Do you think you can find time to review this for Evo 2.13.1 (or 2.21.1) ?
I realize it's a large patch but we shipped it in Fedora 7 and I'm pretty confident that all the major bugs have been shaken out.  Let me know if
you have any questions or reservations.
Comment 25 Matthew Barnes 2007-09-18 14:33:32 UTC
See comment #19 for a summary of the changes.
Subsequent comments were just about reference counting bugs.
Comment 26 Matthew Barnes 2007-10-29 18:00:17 UTC
Created attachment 98128 [details] [review]
Revised patch for Evolution 2.21.1

Would be nice to get this in for 2.21.2...
Comment 27 Matthew Barnes 2007-12-17 17:24:38 UTC
Created attachment 101131 [details] [review]
Revised patch for Evolution 2.21.4

Merged with Srini's non-intrusive error feature.
Comment 28 Srinivasa Ragavan 2007-12-18 05:54:46 UTC
Should get in not later than 2.21.5. I'm favor of getting this getting in
Comment 29 André Klapper 2007-12-18 12:24:41 UTC
...then please commit it NOW so we have two weeks to beta-test and fix any regressions.
also please announce the commit to gnome-i18n@ - the patch adds new strings and we are in string change announcement period.
Comment 30 Matthew Barnes 2007-12-18 13:03:43 UTC
The em-filter-i18n.h hunk should not be there and I'll remove it before I commit the patch.  I don't know how that booger keeps getting in there.  Apart from that I don't think I changed any strings.  In fact I recall making it a point not to.  I'll double-check though, of course.

Srini asked me to have Milan smoketest this and I'll do so, but really he and I and the rest of the Fedora community have already been using it for months since it shipped in Fedora 7 and 8.  I had to modify the patch a bit yesterday to accommodate Srini's error reporting changes, so I'll give that a couple days to bake in Rawhide and have Milan scrutinize the code in the meantime.  Then if all goes well, commit later this week.

Sound good?
Comment 31 Srinivasa Ragavan 2007-12-18 13:10:21 UTC
Matt, perfect!
Comment 32 André Klapper 2007-12-18 14:18:54 UTC
(In reply to comment #30)
> Apart from that I don't think I changed any strings.
ah, okay, just compared again against an updated pot file, you're right.
Comment 33 Matthew Barnes 2007-12-18 15:54:46 UTC
Created attachment 101200 [details] [review]
Same patch, sans the em-filter-i18n.h hunk

Just so I don't forget...
Comment 34 Milan Crha 2007-12-20 14:27:40 UTC
With this patch, I got this crash every time I start Evolution (aka I cannot start it). Without the patch it works just fine. 

I found that you changed mail/mail-mt.c:end_event_callback prototype too much, it should be same as before. (the "void *data" part should not be there, the function is called with only 3 parameters). Regarding to this, mail-mt.c: line 261: Is it possible that there is activity_id=0 and the error is set? In that case, the error is not initialized properly and it will crash/leak a bit later probably, so the cleaner solution can be:
...
	} else if (mail_msg->priv->activity_id) {
		activity_id = mail_msg->priv->activity_id;
...

mail/mail-ops.c: em_filter_folder_element_filter: From my point of view, the check for non-NULL is not necessary and can be removed here:
+		camel_folder_sync (folder, FALSE, camel_exception_is_set (&m->base.ex) ? NULL : &m->base.ex);

mail/mail-ops.c: fetch_mail_fetch: do the check with camel_exception_is_set, not with ex.id comparision, (Fejj told me it's not the evo style, when I did almost the same in other bug/patch.)

prepare_offline_desc (struct _set_offline_msg *m, int done) - remove that "int done" parameter, as you did everywhere else.

I also noticed your g_assert calls. You know, srag needs that as something non-crashing.

I would personally prefer g_slice_new0, instead of g_slice_new (just because of future possible enhancement of the structure and in case of forgotten initialization on one place in the code by anyone), so if you like (aka not mandatory), you can change it to g_slice_new0.

All other things seems to be fine for me, so after you fix things above, let it be in trunk and see what will users report.
Comment 35 Matthew Barnes 2007-12-20 16:15:08 UTC
(In reply to comment #34)
> With this patch, I got this crash every time I start Evolution (aka I cannot
> start it). Without the patch it works just fine.

I think that was the extra parameter in end_event_callback().  I fixed in Fedora but I guess I forgot to fix it in the upstream patch.


> I found that you changed mail/mail-mt.c:end_event_callback prototype too much,
> it should be same as before. (the "void *data" part should not be there, the
> function is called with only 3 parameters).

Fixed.


> Regarding to this, mail-mt.c: line 261: Is it possible that there is 
> activity_id=0 and the error is set? In that case, the error is not
> initialized properly and it will crash/leak a bit later probably, so the
> cleaner solution can be:
> ...
>         } else if (mail_msg->priv->activity_id) {
>                 activity_id = mail_msg->priv->activity_id;
> ...

Not sure, that's Srini's logic and I didn't mess with it.  Let's deal with that in the context of his non-intrusive errors work rather than here.

 
> mail/mail-ops.c: em_filter_folder_element_filter: From my point of view, the
> check for non-NULL is not necessary and can be removed here:
> +               camel_folder_sync (folder, FALSE, camel_exception_is_set
> (&m->base.ex) ? NULL : &m->base.ex);

The check is to make sure an existing error doesn't get overwritten.


> mail/mail-ops.c: fetch_mail_fetch: do the check with camel_exception_is_set,
> not with ex.id comparision, (Fejj told me it's not the evo style, when I did
> almost the same in other bug/patch.)

Fixed.

 
> prepare_offline_desc (struct _set_offline_msg *m, int done) - remove that
> "int done" parameter, as you did everywhere else.

Fixed.

 
> I also noticed your g_assert calls. You know, srag needs that as something
> non-crashing.

Fixed.

 
> I would personally prefer g_slice_new0, instead of g_slice_new (just because
> of future possible enhancement of the structure and in case of forgotten
> initialization on one place in the code by anyone), so if you like (aka not
> mandatory), you can change it to g_slice_new0.

Fixed.


Thanks for the review!  It'll be nice to finally get this off my hands.
A quick smoketest shows the modified patch is working okay for me.  I'll
write up the ChangeLogs and commit this.
Comment 36 Milan Crha 2007-12-20 16:23:06 UTC
(In reply to comment #35)
> 
> > mail/mail-ops.c: em_filter_folder_element_filter: From my point of view, the
> > check for non-NULL is not necessary and can be removed here:
> > +               camel_folder_sync (folder, FALSE, camel_exception_is_set
> > (&m->base.ex) ? NULL : &m->base.ex);
> 
> The check is to make sure an existing error doesn't get overwritten.
> 

Oh, my fault, I didn't realize it as opposite check and I misread brackets. Oops.
Comment 37 Matthew Barnes 2007-12-20 18:02:48 UTC
Committed to trunk (revision 34730).