GNOME Bugzilla – Bug 732226
Races when sessions time out
Last modified: 2014-07-09 14:18:44 UTC
I am having trouble because of a race in gst-rtsp-server when expired sessions are being removed. In my case the GSource responsible for cleaning up the timed-out sessions is attached to the application default context. The GSource is set up the following way: source = gst_rtsp_session_pool_create_watch () g_source_set_callback (source, (GSourceFunc) session_timeout_cb, NULL, NULL); And the GSource callback looks like this: session_timeout_cb(pool) { gst_rtsp_session_pool_filter (pool, remove_expired_session, NULL); } remove_expired_session() { if (gst_rtsp_session_is_expired (session)) { return GST_RTSP_FILTER_REMOVE; } return GST_RTSP_FILTER_KEEP; } And this code introduces a race in rtsp-client.c. The race is between the week reference callback client_session_finalized () run in the main context and the rest of the functions in RTSPClient manipulating the internal sessions list (the code does not even keep hard references to the sessions) from clients own context. The result I am getting is segmentation faults of different kind when the server is under high load and sessions start to expire.
Created attachment 279215 [details] [review] Keep hard refs to the sessions from the client object Here I have implemented a solution where client keeps hard references to the session objects, session are removed from the internal list when they expire (session::expired signal emitted) or the client side sends teardown.
Created attachment 279239 [details] [review] Keep hard refs to the sessions from the client object Slightly more correct version of the patch. I should also mention that due to the extra client ref owned by the session expired handler, the RTSPClient object will only be freed after either the client sends TEARDOWN or the session expires. In both cases the expired handler will be removed and the extra client ref will be dropped. But if there is no session cleanup mechanism enabled for the server and if a client never sends TEARDOWN then the RTSPClient object will be leaked. Which is not very good...
I reworked it slightly. The signal is now on the session, because that's the one emitting the signal. The signal is emitted when the session is removed from the pool (not only when it expired) so you get the same effect when filtering the sessions on the client. If the session was shared between multiple client objects, they would override the "gst-rtsp-expired-handler" data on the session. I changed it so that there is only one handler on the configured session pool. Please check if that also works for you. commit 5e2afcefdda5066641c01d243444c4f670054de0 Author: Wim Taymans <wtaymans@redhat.com> Date: Tue Jul 1 14:41:14 2014 +0200 client: protect sessions with lock Protect the list of sessions with the lock. Fixes https://bugzilla.gnome.org/show_bug.cgi?id=732226 commit fe081e73015dc269d585f4be3efbc0d93ead128b Author: Wim Taymans <wtaymans@redhat.com> Date: Tue Jul 1 12:13:47 2014 +0200 Client: keep a ref to the session Don't just keep a weak ref to the session objects but use a hard ref. We will be notified when a session is removed from the pool (expired) with the new session-removed signal. Don't automatically close the RTSP connection when all the sessions of a client are removed, a client can continue to operate and it can create a new session if it wants. If you want to remove the client from the server, you have to use gst_rtsp_server_client_filter() now. Based on patch from Ognyan Tonchev <ognyan.tonchev at axis.com> See https://bugzilla.gnome.org/show_bug.cgi?id=732226 commit 964ca3c98842c5dd7d640b81bc018c0a6687c4ef Author: Wim Taymans <wtaymans@redhat.com> Date: Mon Jun 30 15:14:34 2014 +0200 session-pool: add session-removed signal Add a signal to be notified when a session is removed from the pool.
There is still a race which i actually manage to trigger when stressing the server a bit. Session remove handler can be called after the client object has been freed. We should probably use g_signal_connect_data() instead and introduce a circular reference which we can break in teardown/timeout/disconnect.
Created attachment 280261 [details] [review] client: keep ref to client for the session removed handler How about this patch which. We basically keep a ref to the client object for the session removed handler. This extra ref will be dropped when all client sessions have been removed. A session is removed when a client sends teardown, closes its endpoint of the TCP connection or the sessions expires.
Created attachment 280262 [details] [review] server tests: send teardown to cleanup session
Created attachment 280263 [details] [review] client tests: send teardown to cleanup session
Created attachment 280264 [details] [review] client: check if watch is set in handle_teardown() This is an already existing bug which i am hitting now when running the unit tests.
commit 6543082d2b6eb75da4bddc5b43aa24e759123fae Author: Ognyan Tonchev <ognyan@axis.com> Date: Wed Jul 9 15:16:08 2014 +0200 client: check if watch is set in handle_teardown() The unit tests run without a watch commit bfd498585a7fba9c9da91b18da8058204ce84afc Author: Ognyan Tonchev <ognyan@axis.com> Date: Wed Jul 9 14:19:10 2014 +0200 client tests: send teardown to cleanup session commit f78886e7cb6992a5d25098978e6edc62baeee32e Author: Ognyan Tonchev <ognyan@axis.com> Date: Wed Jul 9 14:17:46 2014 +0200 server tests: send teardown to cleanup session commit e0bc97e40cdf49de2911038c92275b6181113ba6 Author: Ognyan Tonchev <ognyan@axis.com> Date: Wed Jul 9 15:01:31 2014 +0200 client: keep ref to client for the session removed handler This extra ref will be dropped when all client sessions have been removed. A session is removed when a client sends teardown, closes its endpoint of the TCP connection or the sessions expires. Fixes https://bugzilla.gnome.org/show_bug.cgi?id=732226