Bug 797289 – dispatch write never ends

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 797289 - dispatch write never ends


Summary:	dispatch write never ends


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-rtsp-server
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	1.15.1
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2018-10-16 08:26 UTC by Göran Jönsson
Modified:	2018-10-23 07:20 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
gst-rtsp-server patch (2.27 KB, patch) 2018-10-16 08:26 UTC, Göran Jönsson	none	Details \| Review
gst-rtsp-server patch (4.85 KB, patch) 2018-10-18 07:37 UTC, Göran Jönsson	none	Details \| Review
gst-rtsp-server patch (5.11 KB, patch) 2018-10-19 07:15 UTC, Göran Jönsson	none	Details \| Review
gst-rtsp-server patch (5.30 KB, patch) 2018-10-19 10:42 UTC, Göran Jönsson	none	Details \| Review
gst-rtsp-server patch (6.98 KB, patch) 2018-10-22 04:56 UTC, Göran Jönsson	none	Details \| Review
gst-rtsp-server patch (5.36 KB, patch) 2018-10-23 07:03 UTC, Göran Jönsson	committed	Details \| Review

Description Göran Jönsson 2018-10-16 08:26:32 UTC

Created attachment 373939 [details] [review]
gst-rtsp-server patch

When sending JPEG (much data) tunneled it can sometimes happen that the dispatch write never ends. This will cause the dispatch read no never be called.

The reason for this is to be find in on_message_sent in file rtsp-stream.

When sending data with gst_rtsp_watch_write_data(gstrtspconnection.c) sometimes the data is queued. When data is later sent by gst_rtsp_source_dispatch_write
it also call the callback on_message_sent  .

..
watch->funcs.message_sent (watch, watch->write_id, watch->user_data);
..

In on_message_sent if there are more data that can be sent it will be sent (in
dispath write context) and this is queued then dispatc write finds more data in queue and call on_message_sent again. This can go on for a long period of time.
Since dispatch write is never ending dispatch read can't run. This way the server 
don't able to detect that remote is sending teardown.

I attach a patch that introduce an idle source to break this chain.

Comment 1 Sebastian Dröge (slomo) 2018-10-16 08:39:45 UTC

Review of attachment 373939 [details] [review]:

::: gst/rtsp-server/rtsp-stream.c
@@ +4350,3 @@
+  g_mutex_unlock (&priv->lock);
+  g_object_unref (stream);
+  return FALSE;

G_SOURCE_REMOVE

@@ +4388,3 @@
+    current_src = g_main_current_source ();
+    if (!current_src)
+      send_tcp_message (stream, idx);

I don't understand this part of the code. Why do you care if there's a current GSource running on this thread and calling into us? There will be one also in the case of the idle source above and the message directly being sent without queue, there will be one if the message was sent from the queue. There might only be none if we're called directly from the appsink callback or not?

@@ +4390,3 @@
+      send_tcp_message (stream, idx);
+    /* underlaying layer is running this callback */
+    else {

{} for the if above please

@@ +4394,3 @@
+      g_source_set_callback (idle_src, (GSourceFunc) cb_send_tcp_message,
+          g_object_ref (stream), NULL);
+      g_source_attach (idle_src, g_source_get_context (current_src));

g_source_get_context() why? Shouldn't we attach to the main context of the rtsp-server (from where the connection is used) here?

Comment 2 Göran Jönsson 2018-10-16 10:35:53 UTC

You wrote:
"I don't understand this part of the code. Why do you care if there's a current GSource running on this thread and calling into us? There will be one also in the case of the idle source above and the message directly being sent without queue, there will be one if the message was sent from the queue. There might only be none if we're called directly from the appsink callback or not? "

Answer:
I want a difference between when calling from appsink (no source) then there are
no problems. But when called from dispatch_write or
the new idle source we just wanted short executions so dispatch read also can be called. The case when the idle source is sending directly without queuing haven't I think of at all. But when doing that I can't see that there are any special thing to consider.

You wrote:
"g_source_get_context() why? Shouldn't we attach to the main context of the rtsp-server (from where the connection is used) here? "

Answer:
The two context I have consider when doing this patch is the
* main context , but since there are a mutex involved I don't want to run this on main thread.
* The context that handle dispatch write and dispatch read.
Is main context of the rtsp-server a third ? How can I get access to it ?

General:
I just want to say that this is just a suggestion to solve the problem.A proposal to start a discussion.

I think a better solution is to detect that on_message_sent is executed by
dispatch write and there are more data to send. And then handover the execution
to the threads that normally send rtp/rtcp. But I can't find a solution like that.

Comment 3 Sebastian Dröge (slomo) 2018-10-16 10:55:17 UTC

(In reply to Göran Jönsson from comment #2)
> You wrote:
> "I don't understand this part of the code. Why do you care if there's a
> current GSource running on this thread and calling into us? There will be
> one also in the case of the idle source above and the message directly being
> sent without queue, there will be one if the message was sent from the
> queue. There might only be none if we're called directly from the appsink
> callback or not? "
> 
> Answer:
> I want a difference between when calling from appsink (no source) then there
> are no problems. But when called from dispatch_write or 
> the new idle source we just wanted short executions so dispatch read also
> can be called. The case when the idle source is sending directly without
> queuing haven't I think of at all. But when doing that I can't see that
> there are any special thing to consider.  

Ok so that needs a comment and reasoning in the code.

In summary that means that a) when called from appsink we always send as much as we can, and b) when called from the idle callback or the watch callback we would first queue another idle probe.

> You wrote:
> "g_source_get_context() why? Shouldn't we attach to the main context of the
> rtsp-server (from where the connection is used) here? "
> 
> Answer:
> The two context I have consider when doing this patch is the 
> * main context , but since there are a mutex involved I don't want to run
> this on main thread.
> * The context that handle dispatch write and dispatch read.
> Is main context of the rtsp-server a third ? How can I get access to it ?

See bottom of do_send_data(). You want to use priv->watch_context here unconditionally.

> General:
> I just want to say that this is just a suggestion to solve the problem.A
> proposal to start a discussion.
>
> I think a better solution is to detect that on_message_sent is executed by
> dispatch write and there are more data to send. And then handover the
> execution to the threads that normally send rtp/rtcp. But I can't find
> a solution like that.

Which threads do you mean? There's the appsink thread, which we don't have access to. And then there's the thread to which priv->watch_context belongs and that's the one we can make use of here.

That's the thread that also calls the watch callback so probably exactly the one you want?

You can check if you're called from that thread by g_main_context_is_owner()

Comment 4 Göran Jönsson 2018-10-16 12:21:06 UTC

>Ok so that needs a comment and reasoning in the code.
>
>In summary that means that a) when called from appsink we always send as much >as we can, and b) when called from the idle callback or the watch callback we >would first queue another idle probe.
OK I will do better comments about this. 

>> You wrote:
>> "g_source_get_context() why? Shouldn't we attach to the main context of the
>> rtsp-server (from where the connection is used) here? "
>> 
>> Answer:
>> The two context I have consider when doing this patch is the 
>> * main context , but since there are a mutex involved I don't want to run
>> this on main thread.
>> * The context that handle dispatch write and dispatch read.
>> Is main context of the rtsp-server a third ? How can I get access to it ?

>See bottom of do_send_data(). You want to use priv->watch_context here >unconditionally.

I believe I do.
When reaching code line
g_source_attach (idle_src, g_source_get_context (current_src));

it can only be dispatch write or the new idle source that are calling. 
They both run in same context. So
g_source_get_context (current_src) will give the same result as 
priv->watch_context.

Please notice that in rtsp-stream there are no access to rtsp-client's 
priv->watch_context .

Comment 5 Sebastian Dröge (slomo) 2018-10-16 12:50:15 UTC

I see... yes that's unfortunate. Getting the context via any currently active source seems rather fragile with regards to future code changes.

Comment 6 Göran Jönsson 2018-10-17 07:51:12 UTC

If I create a new func 
gst_rtsp_stream_set_watch_context (GstRTSPStream * stream, GMainContext context)
and call that from rtsp-client.c func handle_setup. 
,,
gst_rtsp_stream_set_watch_context (stream, priv->watch_context )

What is your opinon on that ?

Comment 7 Sebastian Dröge (slomo) 2018-10-17 15:34:39 UTC

Seems fine to me

Comment 8 Göran Jönsson 2018-10-18 07:37:54 UTC

Created attachment 373956 [details] [review]
gst-rtsp-server patch

New patch attached

Comment 9 Sebastian Dröge (slomo) 2018-10-18 08:01:10 UTC

Review of attachment 373956 [details] [review]:

::: gst/rtsp-server/rtsp-client.c
@@ +2569,3 @@
   }
 
+  gst_rtsp_stream_set_watch_context(stream, g_main_context_ref (priv->watch_context));

Let the function itself take a new reference

::: gst/rtsp-server/rtsp-stream.c
@@ +4355,3 @@
+  g_mutex_unlock (&priv->lock);
+  g_object_unref (stream);
+  return FALSE;

G_SOURCE_REMOVE

@@ +4400,3 @@
+      g_source_attach (idle_src, priv->watch_context);
+    }
+    else

{} please

Comment 10 Göran Jönsson 2018-10-19 07:15:14 UTC

Created attachment 373969 [details] [review]
gst-rtsp-server patch

New patch

Comment 11 Sebastian Dröge (slomo) 2018-10-19 07:20:58 UTC

Review of attachment 373969 [details] [review]:

Almost there :)

::: gst/rtsp-server/rtsp-stream.c
@@ +4354,3 @@
+    send_tcp_message (stream, idx);
+  g_mutex_unlock (&priv->lock);
+  g_object_unref (stream);

This should be removed and ...

@@ +4392,3 @@
+    /* When appsink running this callback we want to send as much as we can
+     * But when idle callback or watch callback is running we will first
+     * queue an idle probe */

Maybe add some explanation here why we do that, what it prevents.

@@ +4397,3 @@
+      idle_src = g_idle_source_new ();
+      g_source_set_callback (idle_src, (GSourceFunc) cb_send_tcp_message,
+          g_object_ref (stream), NULL);

... be added here as destroy notify instead.

Otherwise you'll leak the stream if the main context is never polled again (e.g. because it is shut down).

Comment 12 Göran Jönsson 2018-10-19 10:42:24 UTC

Created attachment 373970 [details] [review]
gst-rtsp-server patch

New patch

Comment 13 Sebastian Dröge (slomo) 2018-10-19 10:49:49 UTC

Comment on attachment 373970 [details] [review]
gst-rtsp-server patch

Breaks the rtsp-client test:

gstcheck.c:286:F:general:test_setup_tcp:0: Unexpected critical/warning: g_main_context_ref: assertion 'context != NULL' failed
gstcheck.c:286:F:general:test_setup_tcp_two_streams_same_channels:0: Unexpected critical/warning: g_main_context_ref: assertion 'context != NULL' failed

Comment 14 Göran Jönsson 2018-10-22 04:56:10 UTC

Created attachment 373991 [details] [review]
gst-rtsp-server patch

New patch.

Comment 15 Sebastian Dröge (slomo) 2018-10-22 07:22:43 UTC

Review of attachment 373991 [details] [review]:

Looks good just one more question

::: tests/check/gst/client.c
@@ +682,3 @@
   fail_unless (gst_rtsp_client_set_connection (client, conn));
 
+  maincontext = g_main_context_new ();

There is no main loop running on this main context. Is this a problem? We could have GSources pending that are never called but are supposed to be called. Like the one you just added

Comment 16 Göran Jönsson 2018-10-23 07:03:57 UTC

Created attachment 374008 [details] [review]
gst-rtsp-server patch

New patch

Comment 17 Sebastian Dröge (slomo) 2018-10-23 07:19:58 UTC

commit 7cfd59820af4bbd69f682dd91cd6c37f7c38b501 (HEAD -> master, origin/master, origin/HEAD)
Author: Göran Jönsson <goranjn@axis.com>
Date:   Thu Oct 18 07:25:05 2018 +0200

    rtsp-stream: use idle source in on_message_sent
    
    When the underlying layers are running on_message_sent, this sometimes
    causes the underlying layer to send more data, which will cause the
    underlying layer to run callback on_message_sent again. This can go on
    and on.
    
    To break this chain, we introduce an idle source that takes care of
    sending data if there are more to send when running callback
    
    https://bugzilla.gnome.org/show_bug.cgi?id=797289