After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 736295 - multiqueue: posts buffering message holding lock
multiqueue: posts buffering message holding lock
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
git master
Other All
: Normal blocker
: 1.4.2
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2014-09-08 23:53 UTC by Thiago Sousa Santos
Modified: 2014-09-16 21:28 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
multiqueue: unlock when posting message (1.20 KB, patch)
2014-09-14 02:22 UTC, Thiago Sousa Santos
reviewed Details | Review
multiqueue: do not post messages holding the lock (8.38 KB, patch)
2014-09-15 17:10 UTC, Thiago Sousa Santos
none Details | Review
multiqueue: do not post messages holding the lock (8.70 KB, patch)
2014-09-16 03:51 UTC, Thiago Sousa Santos
needs-work Details | Review

Description Thiago Sousa Santos 2014-09-08 23:53:51 UTC
multiqueue is currently posting its buffering messages while holding the GST_MULTI_QUEUE_MUTEX_LOCK. This is not a good practice as other operations on it might block while the message is handled.

One of these situations appeared after fixing: https://bugzilla.gnome.org/show_bug.cgi?id=727255

There is a deadlock with playbin and multiqueue:

Thread A (playbin exposing a group)
A.1) Gets the source-group lock
A.2) Links the new pad (causing a reconfigure event to be pushed)
A.3) Multiqueue receives the reconfigure and gets the multiqueue lock
A.4) Multiqueue handles the reconfigure

Thread B (multiqueue posting a buffering message)
B.1) Multiqueue gets its lock
B.2) Multiqueue posts the buffering message
B.3) Playbin intercepts the buffering message and needs to check if it should pass it to the application (behavior introduced by #727255)
B.4) Playbin gets the source-group lock
B.5) Playbin decides if it should post the message or not

As you can see, threads A and B both make use of the source-group and multiqueue lock, but as they get them in different orders it might lead to a deadlock.

The solution to #727255 seems good enough. I believe the problem is that multiqueue shouldn't post the message while still holding the lock. This should be no problem as long as the messages are posted in the same order as they are created.
Comment 1 Sebastian Dröge (slomo) 2014-09-12 12:16:06 UTC
Yes, it should not hold any locks while posting messages... same as for pushing events or buffers.
Comment 2 Thiago Sousa Santos 2014-09-14 02:22:45 UTC
Created attachment 286152 [details] [review]
multiqueue: unlock when posting message

Is this a too naive solution? Should we have a properly locked variable to hold the buffering message to post to make sure we don't have races posting 99% 100% where 100% would be posted first?

I have been using this in my DASH tests and it seems to work but I still think that theoretically the worse can happen
Comment 3 Sebastian Dröge (slomo) 2014-09-15 08:06:22 UTC
It seems racy indeed, yes. multiqueue can post the buffering messages from different threads as it accumulates the percentages of every stream, so you could get into the situation you mention above... maybe it would be sufficient if we introduce a new lock just for posting the buffering messages and protecting the internal percentage variable?
Comment 4 Thiago Sousa Santos 2014-09-15 17:10:11 UTC
Created attachment 286225 [details] [review]
multiqueue: do not post messages holding the lock

It might cause deadlocks to post messages while holding the multiqueue
lock. To avoid this a new boolean flag is set whenever a new buffering percent
is found. The message is posted after the lock can be released.

To make sure the buffering messages are posted in the right order, messages
are posted holding another lock. This prevents 2 threads trying to post
messages at the same time.
Comment 5 Thiago Sousa Santos 2014-09-16 03:51:01 UTC
Created attachment 286254 [details] [review]
multiqueue: do not post messages holding the lock

It might cause deadlocks to post messages while holding the multiqueue
lock. To avoid this a new boolean flag is set whenever a new buffering percent
is found. The message is posted after the lock can be released.

To make sure the buffering messages are posted in the right order, messages
are posted holding another lock. This prevents 2 threads trying to post
messages at the same time.
Comment 6 Thiago Sousa Santos 2014-09-16 21:27:49 UTC
commit 3f83ca9c43cb9577cb5f47af8b103d97160a78ed
Author: Thiago Santos <thiagoss@osg.samsung.com>
Date:   Thu Sep 11 18:01:58 2014 -0300

    multiqueue: do not post messages holding the lock
    
    It might cause deadlocks to post messages while holding the multiqueue
    lock. To avoid this a new boolean flag is set whenever a new buffering percent
    is found. The message is posted after the lock can be released.
    
    To make sure the buffering messages are posted in the right order, messages
    are posted holding another lock. This prevents 2 threads trying to post
    messages at the same time.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=736295


1.4:
3bbecbb857b28b52de08d0b8987903c1a5c1a02e
Comment 7 Thiago Sousa Santos 2014-09-16 21:28:24 UTC
Comment on attachment 286254 [details] [review]
multiqueue: do not post messages holding the lock

Marking as obsolete as I did one change to the patch to actually disable the perc_changed when posting the buffering message.