After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 737752 - rtsp-client: crash when cleaning up session
rtsp-client: crash when cleaning up session
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
git master
Other Linux
: Normal normal
: 1.4.4
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2014-10-01 21:42 UTC by Aleix Conchillo Flaqué
Modified: 2014-10-24 19:43 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
rtspconnection: call notify before releasing watch resources (1.24 KB, patch)
2014-10-01 22:20 UTC, Aleix Conchillo Flaqué
committed Details | Review

Description Aleix Conchillo Flaqué 2014-10-01 21:42:00 UTC
If the client watch is being destroyed and client_watch_notify is called while client_session_removed (via gst_rtsp_session_pool_cleanup) is also being called we might end with a crash.

This is because client_watch_notify is called just after everything in the watch has been freed (mutex, etc.) in gst_rtsp_source_finalize (gstrtspconnection.c). If client_session_removed is executed just at that time client->watch might still have a valid pointer but its internal data might be invalid.

This is the error:

GLib (gthread-posix.c): Unexpected error from C library during 'pthread_mutex_lock': Invalid argument.  Aborting.

And this is the backtrace:

Thread 1 (Thread 0x7fcbd24e5700 (LWP 36192))

  • #0 __GI_raise
    at ../nptl/sysdeps/unix/sysv/linux/raise.c line 64
  • #1 __GI_abort
    at abort.c line 91
  • #2 g_thread_abort
    at gthread-posix.c line 80
  • #3 g_mutex_lock
    at gthread-posix.c line 217
  • #4 gst_rtsp_watch_set_send_backlog
    at gstrtspconnection.c line 3589
  • #5 client_session_removed
    at rtsp-client.c line 2210
  • #6 ffi_call_unix64
    from /usr/lib/x86_64-linux-gnu/libffi.so.6
  • #7 ffi_call
    from /usr/lib/x86_64-linux-gnu/libffi.so.6
  • #8 g_cclosure_marshal_generic
    at gclosure.c line 1448
  • #9 g_closure_invoke
    at gclosure.c line 768
  • #10 signal_emit_unlocked_R
    at gsignal.c line 3553
  • #11 g_signal_emit_valist
    at gsignal.c line 3309
  • #12 g_signal_emit
    at gsignal.c line 3365
  • #13 gst_rtsp_session_pool_cleanup
    at rtsp-session-pool.c line 528
  • #14 session_cleanup_timeout
    at XXXXXXX.C line 125
  • #15 g_timeout_dispatch
    at gmain.c line 4473
  • #16 g_main_dispatch
    at gmain.c line 3064
  • #17 g_main_context_dispatch
    at gmain.c line 3663
  • #18 g_main_context_iterate
    at gmain.c line 3734
  • #19 g_main_context_iterate
    at gmain.c line 3671
  • #20 g_main_loop_run
    at gmain.c line 3928
  • #21 thread_func
    at XXXXXXXXX.C line 361
  • #22 start_thread
    at pthread_create.c line 308
  • #23 clone
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 112
  • #24 ??

Comment 1 Aleix Conchillo Flaqué 2014-10-01 22:20:03 UTC
Created attachment 287546 [details] [review]
rtspconnection: call notify before releasing watch resources

This is a first attempt to fix this. It is a patch for gstrtspconnection.c (gst-plugins-base).

If we call the watch notify function before releasing the watch resources in gst_rtsp_source_finalize, the notify callback can lock and set priv->watch to NULL so other functions can finish properly.

In the particular case of this bug, this is what would happen (with this patch): 

client_session_removed is called with or without priv->watch==NULL. If not NULL, watch backlog will be set with watch being valid. It is valid because client_watch_notify might have been executing at the same time but is blocked in the client lock (notify hasn't finished, so watch resources are still valid).

I have to say that I'm not fully sure about this one. It seems to me, someone should keep a reference to the watch, but I don't see who or how.
Comment 2 Aleix Conchillo Flaqué 2014-10-01 22:23:55 UTC
With patch from bug 737690 I could run my test for more than 12 hours. The test consists of 6 clients connecting to different streams each one and disconnecting after a few random seconds and reconnecting. After the 12 hours more than 70000 connections/disconnections succeeded but I ended up with this crash. Let's see if I can go a bit further with patch in comment 1.
Comment 3 Aleix Conchillo Flaqué 2014-10-01 23:04:14 UTC
(In reply to comment #2)
> With patch from bug 737690 I could run my test for more than 12 hours. The test
> consists of 6 clients connecting to different streams each one and
> disconnecting after a few random seconds and reconnecting. After the 12 hours
> more than 70000 connections/disconnections succeeded but I ended up with this
> crash. Let's see if I can go a bit further with patch in comment 1.

Sorry, that is 6 streams and 3 clients per stream. So 18 clients in total connecting and disconnecting randomly. Streams have media sharing enabled.
Comment 4 Aleix Conchillo Flaqué 2014-10-03 16:37:36 UTC
With patch in comment 1 and using queue instead of multiqueue (see bug 737794). The server has been running flawlessly after 20 hours and ~100k connections/disconnections.
Comment 5 Wim Taymans 2014-10-21 08:07:02 UTC
commit 66abee92b081a1f74c0e3ca5600b99e78fb3b569
Author: Aleix Conchillo Flaqué <aleix@oblong.com>
Date:   Wed Oct 1 15:04:09 2014 -0700

    rtspconnection: call watch notify before freeing any watch resources
    
    This gives control to the notify function allowing it to finish other
    watch related functionality.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=737752