GNOME Bugzilla – Bug 740689
decodebin/multiqueue/max-size-buffers is not set in playing state
Last modified: 2015-08-31 15:38:45 UTC
We see strange buffering messages at startup from a playbin pipeline on version 1.4.4. We can easily reproduce the problem with AAC and Ogg content played from a local HTTP source with this gst-launch command:

gst-launch-1.0 -m playbin uri=http://192.168.0.9/~brano/Music/Convert-Ogg-112.ogg

The buffer level increases at the start, drops down, and then the buffer-level messages become erratic. The 100% buffer level arrives in the middle of buffering, gst-launch switches to the PLAYING state, and audio problems follow. This is a regression between the 1.2 and 1.4 branches.
Created attachment 291465 [details] gst1.2.4.log
Created attachment 291466 [details] gst1.4.4.log
Can you test with master or the latest version on the 1.4 branch?

8d835ec400819140e5fc8532a90010c694e1212b
5a759011a013b3a3e9249f11f6f3e476947451e1

Those are the two patches on master that I hope have fixed this. In 1.4 they are:

cafe1f2fc90acccd7aceb6d3c09bee6496396672
81f13a5a2c83ed296f497a7999a7f584aa31a2db
I cherry-picked the commit on top of 1.4.4, but the problem is still there. Enabling debug logs for multiqueue makes the problem much harder to reproduce:

GST_DEBUG=1,multiqueue:5 gst-launch-1.0 -m playbin uri=http://192.168.0.9/~brano/Music/Convert-Ogg-112.ogg

I run the pipeline on an ARMv7 CPU at 1 GHz.
Created attachment 291468 [details] gst1.4.4patched.log - debug for multiqueue makes the problem nearly disappear
I am sorry, I cherry-picked only one commit. I will post again soon.
The problem is still there with both commits applied. Increasing the log level for queue2 does not make the problem go away; increasing the log level for multiqueue does.
It is hard to see what exactly is happening from the buffering log alone. Can you provide a full log, or one that contains both the buffering messages and the queue debug messages?
Created attachment 291988 [details] aac-multiqueue-info-1.4-86ca8c41f.log - problem is visible without debug
Created attachment 291989 [details] aac-multiqueue-debug-1.4-86ca8c41f.log.gz - problem disappears with debug
That is likely because it is a racy issue, does it happen if you try it multiple times with debug?
I tried it about 100 times. I never reproduced the problem when multiqueue is at debug level.
The problem disappears even when logging for the queue element is set to DEBUG:

GST_DEBUG=3,queue:5 gst-launch-1.0 -m playbin uri=http://192.168.0.9/~brano/Music/Convert-aac-128.m4a

My current theory is that the buffering messages get messed up when the aqueue (GstQueue) element switches to the READY state.
Created attachment 292449 [details] queue_at_log_level.txt

gstreamer 1.4.4 with all commits on the 1.4 branch in gstreamer.git; GstQueue2 modified to log "Going to post buffering" at INFO level. The problem disappears when queue is set to debug level, even though queue introduces very little logging. How is it possible that a log-level change of the GstQueue at playbin0/playsink/abin/aqueue changes the behavior of playbin0/uridecodebin0/queue2-0?
I can reproduce the problem on my desktop as well, streaming from another device under high CPU load (40 instances of):

# nice --20 yes > /dev/null &

$ LD_LIBRARY_PATH=/usr/local/lib nice -20 gst-launch-1.0 playbin uri=http://p1/Convert-aac-128.m4a
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Prerolled, waiting for buffering to finish...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstPulseSinkClock
Buffering, setting pipeline to PAUSED ...
Done buffering, setting pipeline to PLAYING ...
Buffering, setting pipeline to PAUSED ...
Done buffering, setting pipeline to PLAYING ...
Buffering, setting pipeline to PAUSED ...
Done buffering, setting pipeline to PLAYING ...
Buffering, setting pipeline to PAUSED ...
Done buffering, setting pipeline to PLAYING ...
Buffering, setting pipeline to PAUSED ...
Done buffering, setting pipeline to PLAYING ...
Buffering, setting pipeline to PAUSED ...
Done buffering, setting pipeline to PLAYING ...
Buffering, setting pipeline to PAUSED ...
Done buffering, setting pipeline to PLAYING ...
Buffering, setting pipeline to PAUSED ...
Done buffering, setting pipeline to PLAYING ...
^Chandling interrupt.
Interrupt: Stopping pipeline ...
Execution ended after 0:00:00.946035246
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...

Versions: gstreamer1.0: branch 1.4, 86ca8c41fa3a9f4d1be44d4970c13bc5dc6cd83a; gst-plugins-base, gst-plugins-good, gst-libav: tag 1.4.4
Created attachment 292607 [details] [review] 0001-decodebin-Set-queue-size-in-no_more_pads.patch

The multiqueue0 element does not limit buffering to a couple of buffers; it is limited only by the 2 MB size. For a 128 kbit input stream, queue2-0 buffers only 32 KB and can be drained down to 0%. This seems to have been introduced in gst-plugins-base commit db771185ed750627a6a1824c42b651d739e1b4a4.
What exactly is the problem here? When the pads are exposed by decodebin it will set the multiqueue limits to exactly those values, which should happen very soonish after oggdemux emitted no-more-pads. Or is the problem that the pads are exposed much later after no-more-pads (why?)?
decodebin_set_queue_size() is called only from gst_decode_group_new(). The preroll argument is TRUE there, so the number of buffers in the multiqueue is unlimited. My patch restores the call to decodebin_set_queue_size() in no_more_pads_cb with preroll=FALSE. An alternative way to fix this problem is to limit the number of buffers in the preroll state too. Is there a good reason to have unlimited buffers in the preroll state?
It's also called in gst_decode_group_reset_buffering, which is called indirectly from gst_decode_bin_expose. The assumption was that this sets the proper limits on everything. Why doesn't it? :)
0:00:00.483482384 10507 0x4180a4f0 DEBUG decodebin gstdecodebin2.c:2660:no_more_pads_cb:<qtdemux0> Setting group 0x418b48c8 to complete
0:00:00.483567154 10507 0x4180a4f0 DEBUG decodebin gstdecodebin2.c:2666:no_more_pads_cb:<decodebin0> Setting group 0x418b48c8 multiqueue to 'playing' buffering mode
0:00:00.483650384 10507 0x4180a4f0 DEBUG decodebin gstdecodebin2.c:3197:decodebin_set_queue_size:<multiqueue0> use buffering 0
0:00:00.483707923 10507 0x4180a4f0 DEBUG decodebin gstdecodebin2.c:3241:decodebin_set_queue_size:<multiqueue0> setting limits 2097152 bytes, 5 buffers, 0
....
0:00:00.833996769 10507 0x41837ac0 DEBUG decodebin gstdecodebin2.c:3961:gst_decode_bin_expose:<decodebin0> Exposing currently active chains/groups
0:00:00.834138615 10507 0x41837ac0 DEBUG decodebin gstdecodebin2.c:4037:gst_decode_bin_expose:<decodebin0> About to expose dpad decodepad1 as src_0
0:00:00.834259000 10507 0x41837ac0 DEBUG decodebin gstdecodebin2.c:3947:debug_sticky_event:<'':src_0> sticky event stream-start
0:00:00.834320077 10507 0x41837ac0 DEBUG decodebin gstdecodebin2.c:3947:debug_sticky_event:<'':src_0> sticky event caps
0:00:00.838859307 10507 0x41837ac0 INFO decodebin gstdecodebin2.c:4056:gst_decode_bin_expose:<decodebin0:src_0> added new decoded pad
0:00:00.855702615 10507 0x41837ac0 DEBUG decodebin gstdecodebin2.c:4073:gst_decode_bin_expose:<decodebin0:src_0> unblocking
0:00:00.855840384 10507 0x41837ac0 DEBUG decodebin gstdecodebin2.c:4270:gst_decode_pad_set_blocked:<decodebin0:src_0> blocking pad: 0
0:00:00.856129769 10507 0x41837ac0 DEBUG decodebin gstdecodebin2.c:4075:gst_decode_bin_expose:<decodebin0:src_0> unblocked
0:00:00.857123769 10507 0x41837ac0 DEBUG decodebin gstdecodebin2.c:4081:gst_decode_bin_expose:<decodebin0> Exposed everything
Pipeline is PREROLLED ...

gst_decode_bin_expose is called, and so is gst_decode_bin_reset_buffering, but use_buffering is FALSE.
gst_decode_bin_reset_buffering (GstDecodeBin * dbin)
{
  if (!dbin->use_buffering)
    return;

  GST_DEBUG_OBJECT (dbin, "Reseting multiqueues buffering");
  CHAIN_MUTEX_LOCK (dbin->decode_chain);
  gst_decode_chain_reset_buffering (dbin->decode_chain);
  CHAIN_MUTEX_UNLOCK (dbin->decode_chain);
}
So maybe it should go in there even if use_buffering is false :) Does that make sense and work?
I am not familiar with the code, so I have a hard time saying what makes sense and what does not. But you are doing a good job pushing me into the code :) The first thing that does not make sense to me is that the number of buffers for multiqueue is unlimited in the prerolling state; the 2 MB limit is like infinity for compressed audio. I propose removing the special buffering limits for the prerolling state. That fixes our regression and makes the code simpler. Simply removing the `if (!dbin->use_buffering)` check in gst_decode_bin_reset_buffering blocks my pipeline (gst-launch playbin uri=...) at the 100% buffer level. I would like to understand the motivation for the special buffering in the preroll state before I go this way.
I also don't understand why the use-buffering property influences the buffer sizes. use-buffering is documented to only emit buffering messages.
> So maybe it should go in there even if use_buffering is false :) Does that make sense and work?

If we update the queue size in gst_decode_bin_reset_buffering, it will be updated much later than in no_more_pads_cb. Another reason for me to dislike this approach.
Why is that a problem though? But I think the use_buffering check there is wrong independent of it, and if we call it again in no_more_pads_cb it should probably also be called in multiqueue_overrun_cb for consistency.
> Why is that a problem though?

Uridecodebin has queue2 at its input. It posts buffering messages to the application; these are used to transition to the playing state and are also shown to the user. Unlimited buffering in decodebin can non-deterministically drive uridecodebin's queue2 anywhere from 100% to 0%.
That's a good point, but at no time it should be really unlimited. I think the proposed change makes sense, but it also seems like it would hide another problem.
> but at no time it should be really unlimited.

Consider a 128 kbit audio stream. As soon as uridecodebin detects the bitrate, it configures its queue2 max-size to 32000 bytes. The 2 MB buffer in multiqueue is nearly two orders of magnitude bigger. I don't like the patch I posted; I would rather see decodebin limit its multiqueue in buffer units all the time. I don't want to hide the problem. What is the correct solution?
I don't know the correct solution, sorry :) Maybe the buffer limits in decodebin should also consider the type of stream or its bitrate (but for that you would need to know the timestamps, and then you could as well use the time limits).
Created attachment 294346 [details] [review] decodebin: do call set_queue_size in no_more_pads_cb

I changed the patch to call decodebin_set_queue_size also in multi_queue_overrun_cb. I am not sure whether it is OK to call decodebin_set_queue_size without a lock in multi_queue_overrun_cb, but it was called without an explicit lock before commit db771185ed750627a6a1824c42b651d739e1b4a4. Do you want to call decodebin_set_queue_size also in gst_decode_chain_expose? It was that way before db771185ed750627a6a1824c42b651d739e1b4a4.
commit d16df7f70df2e703ccb24ea0822b0c804235127d
Author: Branislav Katreniak <bkatreniak@nuvotechnologies.com>
Date:   Mon Jan 12 14:38:09 2015 +0100

    decodebin: do call set_queue_size in no_more_pads_cb

    Consider pipeline:
    gst-launch-1.0 playbin uri=http://example.com/a.ogg

    Consider 128kbit audio stream. As soon as uridecodebin detects the
    bitrate, it configures its input queue2 max-size to 32000 bytes.
    The 2MB buffer in multiqueue is nearly 2 orders of magnitude bigger.
    This non-deterministically drives queue2 buffer anywhere from 100%
    to 0% until multiqueue is filled.

    This patch sets multiqueue size to 5 buffers early in
    no_more_pads_cb. Partly reverts commit
    db771185ed750627a6a1824c42b651d739e1b4a4.

    https://bugzilla.gnome.org/show_bug.cgi?id=740689
*** Bug 751497 has been marked as a duplicate of this bug. ***
See https://bugzilla.gnome.org/show_bug.cgi?id=733235#c56