After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 758285 - glib: Don't store CF run loop to avoid racy cleanup
glib: Don't store CF run loop to avoid racy cleanup
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: cerbero
git master
Other Mac OS
: Normal normal
: 1.6.2
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-11-18 14:17 UTC by Heinrich Fink
Modified: 2015-12-01 13:07 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
glib: Don't store CF run loop to avoid racy cleanup (6.74 KB, patch)
2015-11-18 14:18 UTC, Heinrich Fink
committed Details | Review

Description Heinrich Fink 2015-11-18 14:17:57 UTC
On OSX, the pipeline/simple_launch_lines unit test (i.e. test_2_elements)
crashes.

This seems to be caused by subsequent calls to gst_bus_poll (the test checks
for message types), each causing a GMainContext to be acquired/released on the
main thread, which exposes a race condition in the Cocoa-specific handling.

The race is between the Cocoa-specific portion of g_main_context_release and
the internal cocoa_select thread:
  
(1) A GMainContext is created, the CF run loop is acquired and the cocoa
select thread is polling some descriptors
(2) The instance from (1) is released, the CF runloop variable is set to NULL.
(3) The select thread is done with polling from (1), and tries to wake up the
CF runloop via the static variable that is now set to NULL.

(3) will crash with CFRunLoopWakeUp being called on the NULL pointer of the
static runloop variable.

After reading through the related code, I think there are three ways to fix
this: 

A: shut down the cocoa_select thread when releasing the context
B: Check CF run loop var for NULL in a thread-safe manner before calling
CFRunLoopWakeUp
C: Avoid storing the CoreFoundation main run loop in a static variable

ad A: not sure about repercussion when using recursive main contexts. The
teardown and restart of the select thread might be a bit more involved to do
this correctly, and introduce further race conditions

ad B: Guarding CFRunLookWakeUp in the cocoa_select thread with 

if (!g_atomic_pointer_get (&cocoa_main_context))
  CFRunLoopWakeUp (cocoa_main_thread_run_loop);

... so basically, when we re-set the cocoa_main_context to NULL, we also
wouldn't wake up the run loop anymore.

ad C: Instead of storing the CF main thread run loop, we can just use
CFRunLoopGetMain where needed. In the above race condition, the run loop would
then be woken up to do nothing, which is fine and not racy anymore. There is
always a single main run loop per process anyway, so there is no need to store
this. Implementation of CFRunLoopGetMain is seen here:
https://github.com/opensource-apple/CF/blob/master/CFRunLoop.c#L1482
A plus is that this call checks for the process being forked, in which case
all of our other code would probably not work, anyway.

For [C], I have attached a patch (for the glib patch in cerbero). I like this
solution most because it removes a static global variable, which is always
prone to race conditions, and because it requires only minimal changes.

Besides pipeline/simple_launch_lines, other tests failed with the same crash
as described above:

- gst/gstbin test_stream_start
- elements/multiqeue test_simple_shutdown_while_running

After applying the submitted patch those tests and simple_launch_lines will
pass again (on OSX 10.11).
Comment 1 Heinrich Fink 2015-11-18 14:18:02 UTC
Created attachment 315827 [details] [review]
glib: Don't store CF run loop to avoid racy cleanup

After polling for file descriptors, the Cocoa select thread did wake up
the main run loop instance that has been stored in a static variable.
This variable might have already been set to NULL by
g_main_context_release, resulting in a segfault of CFRunLoopWakeUp. This
race is fixed by this commit by simply not storing the main run loop in
a variable, but instead calling CFRunLoopGetMain locally where needed.
Within a single process, the main run loop is always the same, and
always accessible. It is therefore not necessary anyway to remember the
run loop instance when acquiring the context.
Comment 2 Sebastian Dröge (slomo) 2015-11-18 14:23:05 UTC
Attachment 315827 [details] pushed as e75ddb0 - glib: Don't store CF run loop to avoid racy cleanup