GNOME Bugzilla – Bug 796842
srt: Refactor plugin to support non-blocking SRT APIs
Last modified: 2018-11-03 14:28:35 UTC
Add support "srt://localhost:port" style uri, and change the default host to "localhost"
Created attachment 373098 [details] [review] srt: Remove white space
Created attachment 373099 [details] [review] [2/4] srt: Reduce useless call depth Providing simplified _client_connect() is not required for now.
Created attachment 373100 [details] [review] [3/4] srt: Remove duplicated code for setting server socket
Created attachment 373101 [details] [review] [4/4] srt: Allow the host name "localhost"
Review of attachment 373098 [details] [review]: this one looks good
Review of attachment 373099 [details] [review]: looks good too
Review of attachment 373101 [details] [review]: The problem is that name resolution is a blocking operation here.. and gst_*src_start() shouldn't be blocking. That would mean moving the thing to the async api .. with gst_base_src_set_async() and then create a thread if the name resolution is blocking and then call gst_base_src_start_complete() when it's done. I guess for the sink it's already in the streaming thread, so it doesn't matter as much. Also, we awnt to merge clientsink/serversink into a srtsink and the same at the source, there is a branch on github that does vaguely that.
Review of attachment 373100 [details] [review]: ::: ext/srt/gstsrt.c @@ +215,3 @@ + /* This is a sink, we're always a sender */ + srt_setsockopt (sock, 0, SRTO_SENDER, &(int) { + 1}, sizeof (int)); But you call this form the source also.
(In reply to Olivier Crête from comment #7) > Review of attachment 373101 [details] [review] [review]: > > The problem is that name resolution is a blocking operation here.. and > gst_*src_start() shouldn't be blocking. That would mean moving the thing to > the async api .. with gst_base_src_set_async() and then create a thread if > the name resolution is blocking and then call gst_base_src_start_complete() > when it's done. > > I guess for the sink it's already in the streaming thread, so it doesn't > matter as much. I guess the point which you are worrying about is that source's start()/stop() is most likely app thread and (i.e., triggering set_state()) blocking the app thread might cause problem, right? If my understanding is correct, I think "blocking" could be ugly and agree with your point, but I'm wondering whether it's really critical or not, because async source element also looks tricky. The problem is, - I couldn't find any async source element for now, and even some elements which are using name resolving (such as udp/tcp source) are not async. So, there seems no good example for me. And when I saw basesrc's impl., finding proper room for calling gst_base_src_start_complete() seems to be a little tricky, because we should meet following two conditions (if it's wrong, please correct me) 1) gst_base_src_start_complete() shouldn't be called in gst_base_src_get_range() thread (loop thread) since both are taking stream_lock. 2) gst_base_src_start_complete() should be called before gst_base_src_get_range() in order for create()/fill() methods not to return null buffer.
(In reply to Seungha Yang from comment #10) > - I couldn't find any async source element for now, and even some elements > which are using name resolving (such as udp/tcp source) are not async. So, > there seems no good example for me. And when I saw basesrc's impl., finding > proper room for calling gst_base_src_start_complete() seems to be a little > tricky, because we should meet following two conditions (if it's wrong, > please correct me) > 1) gst_base_src_start_complete() shouldn't be called in > gst_base_src_get_range() thread (loop thread) since both are taking > stream_lock. > 2) gst_base_src_start_complete() should be called before > gst_base_src_get_range() in order for create()/fill() methods not to return > null buffer. In theory yes, but I can't find any other element that uses it, so maybe it's not useable as-is.. Lets ignore this for now.
(In reply to Olivier Crête from comment #11) > In theory yes, but I can't find any other element that uses it, so maybe > it's not useable as-is.. Lets ignore this for now. Ok :). I'm doing refacto srtbasesrc and it's subclasses to make them actually cancellable with unit test code. I'll update them at once in this ticket soon. (possibly reworked srtbasesink also updated)
Created attachment 373197 [details] [review] srtclient: Fix bind-port and rendez-vous property getter
Created attachment 373198 [details] [review] [3/17] srtclient: Fix bind-port and rendez-vous property getter
Created attachment 373199 [details] [review] [4/17] srtclientsink: Do not leak string
Created attachment 373200 [details] [review] [5/17] srtserversrc: Emit client-closed signal in _stop() ... instead of client-added
Created attachment 373201 [details] [review] [6/17] srt: Fix signal args mismatch client-added and client-{removed,closed} signals were defined to be emitted with two arguments (socket fd and address)
Created attachment 373202 [details] [review] [7/17] srt: Remove duplicated code for setting server socket
Created attachment 373203 [details] [review] [8/17] srt: Add "const" keyword to a function argument Given "passphrase" shouldn't be modified
Created attachment 373204 [details] [review] [9/17] srtbasesrc: Use GstBaseSrc's timestamp impl. ... instead of doing it ourselves. Otherwise, we should add more logic here (such as checking GstClock and etc) which was already provided by GstBaseSrc.
Created attachment 373205 [details] [review] [10/17] srt: Do not ignore SRT socket error event ... and set SRT_EPOLL_IN flag if the client is not sender (i.e., source element) since waiting readable event for srt_recvmsg() makes more sense.
Created attachment 373206 [details] [review] [11/17] srtclientsink: Fix SRT socket option setting SRTClientSink is sender
Created attachment 373207 [details] [review] [12/17] srtbasesrc: Refacto class structure * Move common code to baseclass from subclass * Protect properties of baseclass with lock * Access private structure using pointer * Make source elements cancellable
Created attachment 373208 [details] [review] [13/17] srt: Enable proxy libsrt logging For debugging libsrt issues
Created attachment 373209 [details] [review] [14/17] srtbasesink: Refacto class structure * Move common code to baseclass from subclass * Protect properties of baseclass with lock * Access private structure using pointer * Make srtclientsink non-blocking * Change return type of send_buffer vfunc for representing various status (e.g., flushing, error, etc), and pass GstBuffer instead of GstMapInfo. For non-blocking SRTServerSink impl., buffers should be queued.
Created attachment 373210 [details] [review] [15/17] srtserversink: Make serversink non-blocking The use of srt_sendmsg2() with blocking client socket might have negative effect on performance. To make client socket non-blocking, * Make buffer queue per client * Add client socket to srt_epoll and wait writable event (if there are queued buffer) * After draining all queued buffer of a client, update the client socket's epoll option in order to prevent burst wakeup event.
Created attachment 373211 [details] [review] [16/17] srt: Allow the host name "localhost" Add support "srt://localhost:port" style uri, and change the default host to "localhost"
Created attachment 373212 [details] [review] [17/17] tests: srt: Add unit test
*** Bug 796843 has been marked as a duplicate of this bug. ***
Created attachment 373213 [details] [review] [12/17] srtbasesrc: Refactor class structure
Created attachment 373214 [details] [review] [14/17] srtbasesink: Refactor class structure
Review of attachment 373198 [details] [review]: Ooops.
Review of attachment 373199 [details] [review]: Looks good
Review of attachment 373200 [details] [review]: ++
Review of attachment 373201 [details] [review]: ++
Review of attachment 373203 [details] [review]: ++
Review of attachment 373204 [details] [review]: ++
Review of attachment 373206 [details] [review]: ++
Review of attachment 373208 [details] [review]: ::: ext/srt/gstsrt.c @@ +353,3 @@ +#ifndef GST_DISABLE_GST_DEBUG + _set_srt_debug_level (); We may want to run that on element creation, not just plugin init.
Review of attachment 373202 [details] [review]: ++
Review of attachment 373204 [details] [review]: ++ for real
Review of attachment 373205 [details] [review]: Was this tested to make sure it behaves as expected?
Review of attachment 373210 [details] [review]: As a general note on the concept, I don't understand how that would block if you have a pipeline like "videotestsrc ! x264enc ! mpegtsmux ! srtserversink sync=0".. won't it end up using all of the available RAM ? I think the easiest way to prevent that is to block until at least one of the client queues is empty, to make sure that we never send faster than the fastest receiver. But also set a max queue length so we limit the memory usage in any case (and a property to decide to drop or block). ::: ext/srt/gstsrtserversink.c @@ +52,3 @@ + GstTask *event_task; + GRecMutex event_lock; + GRecMutex client_lock; Do you really need a recursive mutex? They're hard to get wrong and error prone! @@ +163,3 @@ + priv->clients = g_list_append (priv->clients, client); + g_hash_table_insert (priv->client_hash, + &client->sock, srt_client_ref (client)); May not need the linked list if we also have the hash table and iterators @@ +164,3 @@ + g_hash_table_insert (priv->client_hash, + &client->sock, srt_client_ref (client)); + priv->num_clients++; That counter is also duplicative with g_hash_table_size()
Review of attachment 373211 [details] [review]: ++
Review of attachment 373212 [details] [review]: ++
Review of attachment 373213 [details] [review]: ::: ext/srt/gstsrtbasesrc.c @@ +385,1 @@ + /** Any reason to switch from the property array to the other API? @@ +393,3 @@ + G_MAXINT32, DEFAULT_POLL_TIMEOUT, + G_PARAM_READWRITE | GST_PARAM_MUTABLE_READY | + G_PARAM_STATIC_STRINGS)); Why have a timeout on the polling at all? What would that gain us? ::: ext/srt/gstsrtclientsrc.c @@ +248,1 @@ + gst_element_class_set_metadata (element_class, Same question here on why change the API away from the array one? ::: ext/srt/gstsrtclientsrc.h @@ +44,2 @@ /*< private >*/ + GstSRTClientSrcPrivate *priv; No point in having a separate private, the whole plugin's implementation is private. @@ +47,1 @@ gpointer _gst_reserved[GST_PADDING]; We actually shouldn't have a reserved either ::: ext/srt/gstsrtserversrc.h @@ +45,2 @@ /*< private >*/ + GstSRTServerSrcPrivate *priv; Same here
Review of attachment 373214 [details] [review]: ::: ext/srt/gstsrtclientsink.h @@ +44,2 @@ /*< private >*/ + GstSRTClientSinkPrivate *priv; Same as for the src, everything is private, so no need for a priv struct.
Comment on attachment 373098 [details] [review] srt: Remove white space Attachment 373098 [details] pushed as aafdfdb - srt: Remove white space
Review of attachment 373213 [details] [review]: ::: ext/srt/gstsrtbasesrc.c @@ +212,3 @@ + /* HACK: Since srt_epoll_wait() is not cancellable, install our event fd to + * epoll */ + srt_epoll_add_ssock (self->poll_id, self->event_fd, NULL); With this, we don't need the poll-timeout anymore. I think we can just poll forever and use the cancellation with our own socket to unblock it.
(In reply to Olivier Crête from comment #50) > Review of attachment 373213 [details] [review] [review]: > > ::: ext/srt/gstsrtbasesrc.c > @@ +212,3 @@ > + /* HACK: Since srt_epoll_wait() is not cancellable, install our event fd > to > + * epoll */ > + srt_epoll_add_ssock (self->poll_id, self->event_fd, NULL); > > With this, we don't need the poll-timeout anymore. I think we can just poll > forever and use the cancellation with our own socket to unblock it. I agree. Now "poll-timeout" is useless anymore. I'll remove "poll-timeout" from source/sink
(In reply to Olivier Crête from comment #47) > Review of attachment 373213 [details] [review] [review]: > > ::: ext/srt/gstsrtbasesrc.c > @@ +385,1 @@ > + /** > > Any reason to switch from the property array to the other API? No reason :( They were changed during copied and pasted from somewhere. I'll rollback not to make diff there.
(In reply to Seungha Yang from comment #51) > (In reply to Olivier Crête from comment #50) > > Review of attachment 373213 [details] [review] [review] [review]: > > > > ::: ext/srt/gstsrtbasesrc.c > > @@ +212,3 @@ > > + /* HACK: Since srt_epoll_wait() is not cancellable, install our event fd > > to > > + * epoll */ > > + srt_epoll_add_ssock (self->poll_id, self->event_fd, NULL); > > > > With this, we don't need the poll-timeout anymore. I think we can just poll > > forever and use the cancellation with our own socket to unblock it. > > I agree. Now "poll-timeout" is useless anymore. I'll remove "poll-timeout" > from source/sink Oh, the "poll-timeout" property was exposed in 1.14. I think we shouldn't remove it but it needs to be specified as deprecated instead. Is it correct?
(In reply to Seungha Yang from comment #53) > Oh, the "poll-timeout" property was exposed in 1.14. I think we shouldn't > remove it but it needs to be specified as deprecated instead. Is it correct? No need, the API in -bad is not guaranteed to be stable!
Current status I'm observing some unexpected behavior of libsrt and I'm doubtful about srt_sendmsg() API's blocking behavior. Regardless of socket blocking option, and also regardless of how fast gstreamer is pushing buffer, libsrt returns srt_sendmsg() immediately. When a pipeline is configured as "sync disabled" as follows "videotestsrc ! x264enc key-int-max=30 ! mpegtsmux alignmeng=7 ! srtserversink uri=srt://localhost:9999 sync=false" I can observe some libsrt log message which indicates a lot of buffers are dropped by libsrt. Note that the remote pipeline (srtclientsrc) was configured without "sync disabled". I guess, libsrt doesn't check sender buffer queue size, but just keep allowing calling srt_sendmsg(). Instead, the library seems to be dropping timeout'ed buffers. Also, when above case, sender side pipeline unloading is delayed until receiver pipeline is unloaded and/or receiver (srtclientsrc) received all buffers. I could be a few seconds. Weird :(
I asked on SRT's slack what the expected behaviour is.
(In reply to Olivier Crête from comment #56) > I asked on SRT's slack what the expected behaviour is. Hum... I think libsrt is unstable for now, especially congestion control seems to unreliable. <Observation 1> The libsrt's default sender queue size is 8192 * (1500 - 28) bytes and I guess my environment cannot hit the max bound of the queue size with following command. "videotestsrc ! x264enc key-int-max=30 ! mpegtsmux alignmeng=7 ! srtserversink uri=srt://localhost:9999 sync=false" Note that, from Gstreamer point of view, the default queue size is needlessly too large in my opinion anyway. So, I tested using movie file but it did not show difference (cannot hit the bound). <Observation 2> To hit the bound, I tested reduced queue size using SRTO_SNDBUF srt socket option, then I could hit the bound. The problem was, [Case 1. non-blocking clientsink - non-blocking serversrc] clientsink will stuck in srt_epoll_wait() after hit the bound several times. That means there is no SRT_EPOLL_OUT event. [Case 2. blocking clientsink - non-blocking serversrc] locked by srt_sendmsg2() and no return ... Both cases show "No room to store incoming packet" message at receiver side. I guess https://github.com/Haivision/srt/issues/409 issue may be related this behavior, but not sure. <Observation 3> To avoid stuck, I tried to control queued buffer manually. For example, waiting call srt_sendmsg2() until half of queue size are available. But nothing changed. libsrt was stuck in srt_sendmsg2() or srt_epoll_wait().
Comment on attachment 373098 [details] [review] srt: Remove white space Remove
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/755.