GNOME Bugzilla – Bug 740803
waylandsink: Double-check the frame callback before dropping frames
Last modified: 2018-05-07 15:44:44 UTC
Created attachment 291625 [details] [review] waylandsink: Double-check the frame callback before dropping frames I wrote the comment that is the same as the attached patch against waylandsink. The wl_display events dispatching runs asynchronously in a seperate thread from the streaming thread. This means that processing a frame callback event could be slightly late, causing unnecessary frame drops when the video framerate is the almost same as a display vsync frequency. This change alleviates this problem by dispatching any pending events with wl_display_roundtrip() in the streaming thread before double-checking whether the frame should actually be dropped.
Any comments on this bug and patch? Does this sound reasonable from a wayland API point of view?
(In reply to Damian Hobson-Garcia from comment #1) > Any comments on this bug and patch? Does this sound reasonable from a > wayland API point of view? You really need to use wl_display_roundtrip_queue, passing sink->display->queue as the event queue to dispatch on. Short of that, I don't really know whether blocking in the streaming thread is sensible, so would have to defer to other GStreamer people.
(In reply to Daniel Stone from comment #2) > Short of that, I don't really know whether blocking in the streaming thread > is sensible, so would have to defer to other GStreamer people. Blocking in the streaming thread can be done, as long as the blocking operation can be cancelled at anytime (see unlock() virtual / FLUSH_START event). At the moment, using roundtrip would require creating a specific queue for this operation.
Review of attachment 291625 [details] [review]: ::: ext/wayland/gstwaylandsink.c @@ +604,3 @@ + redraw_flag = g_atomic_int_get (&sink->redraw_pending); + while (redraw_flag == TRUE && retry > 0) { + wl_display_roundtrip (sink->display->display); wl_display_roundtrip_queue() @@ +605,3 @@ + while (redraw_flag == TRUE && retry > 0) { + wl_display_roundtrip (sink->display->display); + redraw_flag = g_atomic_int_get (&sink->redraw_pending); This is no longer an atomic op, it's protected by the render_lock now. That basically means, you'll get a deadlock. You need to drop the render lock before calling wl_display_roundtrip_queue(). @@ +606,3 @@ + wl_display_roundtrip (sink->display->display); + redraw_flag = g_atomic_int_get (&sink->redraw_pending); + retry--; If retry is 1, why do you keep this code ?
> This is no longer an atomic op, it's protected by the render_lock now. That > basically means, you'll get a deadlock. You need to drop the render lock > before calling wl_display_roundtrip_queue(). It is better to check using g_cond_wait_until()/g_cond_signal() APIs and completely avoid wl_display_roundtrip_queue() IMO. wl_display_roundtrip() is a blocking call which is not returned until frame_redraw_cb() and buffer_release() callbacks are not completed. Whereas g_cond may be signalled as soon as frame_redraw_cb() is received. Conditional timeout of one frame duration (fps_d/fps_n) is reasonable which can be finalized after testing further.
Created attachment 353731 [details] [review] waylandsink: Conditional wait for FrameRedrawCallback before dropping the frame While rendering, if frame_redraw_pending() is not called from wayland server yet, then render()/expose() APIs should wait for atleast 1 frame duration before dropping the frame completely. This conditional wait operation will handle race conditions when render requests are made from multiple paths.
Review of attachment 353731 [details] [review]: Why do need to wait for the duration of a frame, additionally to what basesink already wait for ?
Something I really don't like with this patch, is that it makes the code much more complicated, specially that you would need to hook this up to unlock(). Its also counter productive, as this should be solved the right way with the presentation timestamp patch-set.
This patch will create issues in the streaming thread when the wayland server stops rendering the video window (because it may not be visible or maybe because the lock screen is on). It is perfectly valid for the server to do this and may take a long time (in case of the lock screen for example) for the frame callback to be called. In the meantime, the streaming thread *should* continue working at the same rate and drop those frames that are not going to be displayed anyway. If you start waiting an additional frame duration for every frame sync, then GstBaseSink will find that every one in two frames is delivered too late and will start firing warnings and QoS events. (In reply to Praveen R Jadhav from comment #6) > This conditional wait operation will handle race conditions when render > requests are made from multiple paths. I'm not sure how you mean that multiple requests can be made from multiple paths. If you mean expose() and show_frame() being called at the same time, I think the solution is to just remove the implementation of expose() completely. I implemented it a couple of years ago just to follow the pattern for all video sinks, but I haven't found any practical use for it in wayland.
(In reply to George Kiagiadakis from comment #9) > This patch will create issues in the streaming thread when the wayland > server stops rendering the video window (because it may not be visible or > maybe because the lock screen is on). It is perfectly valid for the server > to do this and may take a long time (in case of the lock screen for example) > for the frame callback to be called. In the meantime, the streaming thread > *should* continue working at the same rate and drop those frames that are > not going to be displayed anyway. If you start waiting an additional frame > duration for every frame sync, then GstBaseSink will find that every one in > two frames is delivered too late and will start firing warnings and QoS > events. I too accept that 1 frame duration is a bit longer and is a worst case scenario. I have tested various scenarios with this patch, but din't find any case where frames are dropped because of timeout.(Regarding wayland server delay during screen lock, profiles I work on don't render video in this scenario). About the streaming scenario, if the wayland server delays intentionally, then QoS events will be triggered for sure. Reducing wait time to a lesser duration will minimize render thread faltering. > > (In reply to Praveen R Jadhav from comment #6) > > This conditional wait operation will handle race conditions when render > > requests are made from multiple paths. > > I'm not sure how you mean that multiple requests can be made from multiple > paths. If you mean expose() and show_frame() being called at the same time, > I think the solution is to just remove the implementation of expose() > completely. I implemented it a couple of years ago just to follow the > pattern for all video sinks, but I haven't found any practical use for it in > wayland. While rendering 2 videos(local and remote) during video call, If user swaps video screens, then set_render_rect() and expose() are called to update new display regions. In this scenario, we can clearly observe background if expose() request to render last frame is dropped. This patch ensures last frame is updated forcefully and flicker issues are not observed. @ Nicolas Dufresne: As mentioned above, 1 frame duration is the worst case scenario. I tested using wl_display_roundtrip_queue()/wl_display_roundtrip(). They are blocking calls and render thread has to wait even though frame_redraw_callback() is already returned from server. This delays even further. Using g_cond is one approach reduces overall delay because of blocking call in render thread. Using presentation time will not help for expose() cases.
I'm sorry, but I don't think we will merge this patch. We'll need a better solution. Ideally if you could provide a small application to reproduce the issue, that would be ideal. I think your description indicate that this is relatively easy to simulate.
Closing this bug report as no further information has been provided. Please feel free to reopen this bug report if you can provide the information that was asked for in a previous comment. Thanks!