After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 733500 - Vala's handling of GMutex/GCond structs causes deadlocks
Vala's handling of GMutex/GCond structs causes deadlocks
Status: RESOLVED FIXED
Product: vala
Classification: Core
Component: Code Generator
unspecified
Other Linux
: Normal critical
: ---
Assigned To: Vala maintainers
Vala maintainers
: 734262 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2014-07-21 12:57 UTC by Iain Lane
Modified: 2014-09-07 02:28 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
strace -f -tt -s1024 (509.30 KB, application/x-xz)
2014-07-21 13:00 UTC, Iain Lane
Details
trace (7.30 KB, text/plain)
2014-07-23 13:40 UTC, Iain Lane
Details
thread apply all bt full (63.95 KB, text/plain)
2014-07-23 14:54 UTC, Iain Lane
Details

Description Iain Lane 2014-07-21 12:57:02 UTC
After uploading 2.41.2 to Utopic an automatically triggered package test started failing with a timeout.

  https://jenkins.qa.ubuntu.com/view/Utopic/view/AutoPkgTest/job/utopic-adt-shotwell/65/ARCH=amd64,label=adt/console

This is far from a minimal testcase - this testcases uses the autopilot functional testing framework to poke at Shotwell, and autopilot uses D-Bus to communicate with the application it's testing. GLib's testsuite itself passes. Can walk you through reproducing this failure if you want. strace log attached.

I bisected 2.41.1 to 2.41.2 and the offending commit is 49b59e5ac4428a6a99a85d699c3662f96efc4e9d: GLib: implement GMutex natively on Linux.
Comment 1 Iain Lane 2014-07-21 13:00:32 UTC
Created attachment 281307 [details]
strace -f -tt -s1024
Comment 2 Iain Lane 2014-07-23 13:40:01 UTC
Created attachment 281478 [details]
trace
Comment 3 Iain Lane 2014-07-23 13:40:56 UTC
Without autopilot, you can reproduce this using Shotwell:

  - Plug in a camera, or use something like umockdev to emulate one
  - Open Shotwell
  - Click the camera
  - Click a picture
  - "Import selected"

Shotwell has its own semaphore implementation, and it is hanging inside here.
Comment 4 Allison Karlitskaya (desrt) 2014-07-23 14:20:33 UTC
Can you attach a backtrace with all threads?  Thanks.
Comment 5 Iain Lane 2014-07-23 14:54:51 UTC
Created attachment 281483 [details]
thread apply all bt full

Sure, here you go.
Comment 6 Allison Karlitskaya (desrt) 2014-07-23 15:32:36 UTC
Thanks for that.  Confirms my suspicion that this might have had something to do with the condition variables rather than the locks...
Comment 7 Allison Karlitskaya (desrt) 2014-07-24 14:52:57 UTC
Behold, generated vala code:

void abstract_semaphore_wait (AbstractSemaphore* self) {
        g_return_if_fail (self != NULL);
        g_mutex_lock (&self->priv->mutex);
        while (TRUE) {
                AbstractSemaphoreWaitAction _tmp0_ = 0;
                GMutex _tmp1_ = {0};
                _tmp0_ = abstract_semaphore_do_wait (self);
                if (!(_tmp0_ == ABSTRACT_SEMAPHORE_WAIT_ACTION_SLEEP)) {
                        break;
                }
                _tmp1_ = self->priv->mutex;
                g_cond_wait (&self->priv->monitor, &_tmp1_);
        }
        g_mutex_unlock (&self->priv->mutex);
}
Comment 8 Allison Karlitskaya (desrt) 2014-07-24 14:57:44 UTC
more verbosely: vala is doing a by-value copy of the mutex and unlocking the copy rather than the original one.  When GMutex was implemented via pthreads it contained only a pointer, so this was OK (since it was still a pointer to the same thing).  Now that we put state directly into the struct, this sort of behaviour is causing problems...
Comment 9 Maciej (Matthew) Piechotka 2014-07-26 08:08:58 UTC
It looks like a duplicate of 690686.
Comment 10 Luca Bruno 2014-07-26 10:21:53 UTC
No it's the same as the weakref bug that we did fix. The fix is to simply add lvalue_access = false to Mutex and Cond structs.
Comment 11 Luca Bruno 2014-08-03 18:43:35 UTC
commit 7616e0339c2099243acaba7dd3cc47210db97bdd
Author: Luca Bruno <lucabru@src.gnome.org>
Date:   Sun Aug 3 20:42:09 2014 +0200

    Add lvalue_access = false to Mutex and Cond
    
    Fixes bug 733500

Please reopen if needed.
Comment 12 Michael Catanzaro 2014-09-07 02:28:14 UTC
*** Bug 734262 has been marked as a duplicate of this bug. ***