GNOME Bugzilla – Bug 652637
Don't hold locks across async calls
Last modified: 2012-07-23 17:27:01 UTC
This pattern is pervasive in libfolks: lock (something) { ... yield foo (); ... } We are holding a lock across calls, we shouldn't do this.
(mass changing milestones)
Additionally, we have this pattern in several places (BackendStore.prepare()/load_backends(), each of the Backend.prepare() and PersonaStore.prepare() implementations): lock (something) { if (!something) { ... // usually, but not always, a yield here something = true; } } In almost every case, a central class has already run through this code on its own, so any client calls to this code will just fail the "!something" conditional and return. Eg, BackendStore.load_backends() ends up calling Backend.prepare() for all of the enabled backends. So any client calls to Backend.prepare() will return early. I experimented with commenting out the BackendStore's calls to Backend.prepare() and having a test run: yield backend_store.load_backends (); var backend = backend_store.enabled_backends.get ("telepathy"); backend.prepare.begin(() => {...}); backend.prepare.begin(() => {...}); Both handlers were called interleaved in the same thread -- ie, they both hit the code block containing "something = true"; So the locking doesn't truly help. In a separate test case: yield backend_store.load_backends (); var backend = backend_store.enabled_backends.get ("telepathy"); yield backend.prepare(); yield backend.prepare(); As expected, the code was called sequentially in the same thread -- only the first prepare() call hit the code block containing "something = true"; So the locking doesn't matter here either way.
Related to comment #2, an important note about async functions and locking in Vala: * async functions (including the callback handler) are run entirely in the same thread unless they explicitly run their code in a new thread. So, most GIO and D-Bus async functions do run in a separate thread, but most of our custom-written async functions don't. Raul's point in comment #0 still stands, but I think the situation we're solving in this bug may be a little different than we originally thought.
Created attachment 191430 [details] [review] Remove needless locking and fix remaining locking Patches from branch: http://cgit.collabora.com/git/user/treitter/folks.git/log/?h=bgo652637-fix-async-locking-3
Review of attachment 191430 [details] [review]: ::: backends/tracker/lib/trf-persona-store.vala @@ +687,2 @@ Trf.Persona ret = null; + string? contact_urn = yield this._insert_persona (builder.result, Same here and in the rest of the patch; I don't think you can access this._is_prepared without locking. Unless you're now using the assumption that since this is a custom async function it won't be run in a thread? ::: backends/telepathy/tp-backend.vala @@ +35,3 @@ private AccountManager _account_manager; private bool _is_prepared = false; + private bool _prepare_pending = false; Are you sure that access to member variables is atomic? I don't think it is, and thus the access to this._is_prepared needs to be inside a lock, otherwise the code isn't protected from being simultaneously run by separate threads.
(punting non-critical bugs beyond 0.6.0)
(Setting target bugs for Folks 0.6.1. Search for this phrase to deal with bug mail spam.)
Didn't make this release; punting to 0.6.2.
(Punting bugs to 0.6.4)
(mass bump to gnome-3.4 milestone in our new milestone scheme; search this string mass move bug mail)
Philip, For this bug what is needed, adding locks around the _prepare_pending and _is_perpared access and setting code? If that's all that is needed I can take a look at this and updating the patch soon.
(In reply to comment #11) > Philip, > > For this bug what is needed, adding locks around the _prepare_pending and > _is_perpared access and setting code? If that's all that is needed I can take > a look at this and updating the patch soon. I think we might want to consider removing locks, actually. I’m fairly sure libfolks is not thread-safe at all, since the locking we do is fairly haphazard and limited to prepare() methods. Further, I don’t think any users of libfolks use it from multiple threads, so there seems little point in spending time on making things thread-safe. I wonder if, assuming that no users of libfolks are actually using multiple threads (this will need checking), we could just remove all the dodgy locking and update the documentation to state that folks can only be used from a single thread. It would tidy up the code and fix this bug.
Philip, If what you mean by removing locks is getting rid of the _prepare_pending that wouldn't work since that was added to fix a bug in empathy. But if you mean removing the lock (this._is_prepared) that may work indeed. From looking at the patch, it seems some of that has already made it into master (the addition of _prepare_pending) I'll try applying the rest here, removing the locks, and preparing a new patch if that's what is likely going to solve this bug.
(In reply to comment #13) > If what you mean by removing locks is getting rid of the _prepare_pending that > wouldn't work since that was added to fix a bug in empathy. But if you mean > removing the lock (this._is_prepared) that may work indeed. Yes, I meant removing the “lock (…)” but not the _is_prepared stuff.
Didn't make gnome-3.4 release, punting to gnome-3.6 milestone
Created attachment 219270 [details] [review] Remove lock calls from prepare and unprepare methods.
Review of attachment 219270 [details] [review]: ::: backends/eds/lib/edsf-persona-store.vala @@ +757,3 @@ + finally + { + this._prepare_pending = false; I think the original code was wrong to do this. We should only set _prepare_pending to false at the very end of the function, or we risk two async calls to prepare() (in the same thread) happening concurrently; one at the top of the function, and one down below this line. This function should be rearranged so that there’s a single top-level “try…finally” block which handles _prepare_pending, and several “try” blocks below it which actually handle exceptions. ::: backends/libsocialweb/lib/swf-persona-store.vala @@ +322,1 @@ { Would be good to make this return early instead of conditionalising the whole function body. The code below should then be stuck in a big “try…finally” block like in other backends, so that we gracefully unset _prepare_pending right before exiting the function if an exception appears. ::: backends/libsocialweb/sw-backend.vala @@ +101,1 @@ { Same points as in swf-persona-store.vala. @@ +139,1 @@ + this._prepare_pending = true; This function also needs a “try…finally” block. ::: backends/telepathy/tp-backend.vala @@ +138,2 @@ + try + { There seems to be a “this._prepare_pending = true” missing here.
Created attachment 219335 [details] [review] Updated travis branch and removed locks from my previous patch This time I rebased travis branch on master, tweaked the result by removing some more locks on _is_prepared and combined the finaly prepare_pending = false clauses into one big try finally in the eds backend as per Philip's review.
So this has the async locking unit test travis created and it passes also.
Review of attachment 219335 [details] [review]: ::: backends/eds/lib/edsf-persona-store.vala @@ +955,2 @@ this._is_prepared = true; this._prepare_pending = false; This _prepare_pending is now redundant. ::: backends/libsocialweb/lib/swf-persona-store.vala @@ +322,1 @@ { As I said before, it would be cleaner if this returned early on the negation of this condition, rather than putting the whole function in an ‘if’ block. Similarly, the whole function is still missing a big ‘try…finally’ block. ::: backends/libsocialweb/sw-backend.vala @@ +126,1 @@ + this.unref (); Ack. Looking at this again, the bulk of prepare() happens in this callback, so the ‘try…finally’ approach won’t work here. I guess the ‘try…finally’ block should move inside the callback. @@ +163,3 @@ + this._is_prepared = false; + this._prepare_pending = false; + this.notify_property ("is-prepared"); Missing a ‘try…finally’ block around this lot. ::: backends/telepathy/lib/tpf-persona-store.vala @@ +426,3 @@ public override async void prepare () throws GLib.Error { + Extraneous new line. @@ +543,3 @@ + + this._is_prepared = true; + this._prepare_pending = false; This _prepare_pending is now redundant. ::: backends/telepathy/tp-backend.vala @@ +138,3 @@ + try + { + this._account_manager.account_enabled.disconnect ( There’s still a “this._prepare_pending = true” missing here. ::: backends/tracker/lib/trf-persona-store.vala @@ +1146,3 @@ { this._prepare_pending = false; } I _think_ this first ‘finally’ block is redundant now that you’ve got the second one below/outside it, but it’s probably best to check the generated C code when removing it. @@ +1393,1 @@ { What’s the rationale for locking this._personas here, but not above? Please add a comment explaining it. :-) ::: tests/folks/async-locking.vala @@ +10,3 @@ + MainLoop _loop; + + public AsyncLockingTests () Might be an idea to stick a comment somewhere explaining the purpose of this set of tests, or linking to this bug report, or something.
Created attachment 219349 [details] [review] Removed trf-persona-store locks, etc. Ok, removed the trf-persona-store locks, thanks Philip for investigating those. Also fixed all the other issues from the previous review.
Review of attachment 219349 [details] [review]: A few final little problems, then I think it’s good to go. Please commit as soon as you’ve fixed these (no need to slow things down with another review). ::: backends/eds/lib/edsf-persona-store.vala @@ +955,2 @@ this._is_prepared = true; this._prepare_pending = false; This _prepare_pending is still redundant. ::: backends/key-file/kf-persona-store.vala @@ +234,3 @@ + filename, e1.message); + this.removed (); + this._prepare_pending = false; This _prepare_pending (and the ones below, excluding the one in the ‘finally’ block) shouldn’t be necessary, since the ‘finally’ block should always be executed on leaving this block of code (even via a ‘return’). ::: backends/libsocialweb/lib/swf-persona-store.vala @@ +350,3 @@ + /* Remove the persona store on error */ + this.removed (); + this._prepare_pending = false; As with Kf.PersonaStore, this _prepare_pending shouldn’t be necessary. Neither should be the ones below. ::: backends/libsocialweb/sw-backend.vala @@ +114,2 @@ { this._prepare_pending = true; This _prepare_pending has to go _outside_ the get_services() call, otherwise we could easily end up with two parallel get_services() calls. The second one would be allowed if the callback for the first one hadn’t yet been executed.