GNOME Bugzilla – Bug 697672
VP8 passed through rtpbin decodes a single frame and then fails to decode until a key frame is passed through
Last modified: 2013-04-15 10:58:26 UTC
When decoding VP8 through rtpbin, a single frame or a short section of video is decoded, and then vp8dec fails to decode until the next keyframe. The issue is caused by caps renegotiation happening after the first frame, which causes problems in both rtpvp8depay and vp8dec: rtpvp8depay detects caps being set and marks the current buffer as having a discontinuity, and vp8dec destroys the codec in gst_vp8_dec_set_format.

I have been able to get streams working nearly correctly by commenting out the destruction of the VP8 codec in gst_vp8_dec_set_format, though I am not sure that is really correct.

This has been reproduced on Mac and Ubuntu. I can reproduce the issue using the following commands:

gst-launch-1.0 -v rtpbin name=rtpvp8 videotestsrc pattern=ball ! textoverlay text="VP8" ! 'video/x-raw,width=320,height=240,framerate=15/1' ! vp8enc cpu-used=15 keyframe-max-dist=10000 ! rtpvp8pay ! rtpvp8.send_rtp_sink_0 rtpvp8.send_rtp_src_0 ! udpsink async=false port=5000 host=::FFFF:7F00:0001 name=vrtpsink rtpvp8.send_rtcp_src_0 ! udpsink port=5001 host=::FFFF:7F00:0001 sync=false async=false name=vrtcpsink udpsrc name=receiver_rtcp_in port=5003 caps = "application/x-rtcp" ! rtpvp8.recv_rtcp_sink_0 udpsrc name=vp8recv port=5002 caps="application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video" ! rtpvp8.recv_rtp_sink_0 rtpvp8. ! rtpvp8depay ! vp8dec ! videoconvert ! osxvideosink

gst-launch-1.0 -v rtpbin name=rtpvp8 videotestsrc pattern=ball ! textoverlay text="VP8" ! 'video/x-raw,width=320,height=240,framerate=15/1' ! vp8enc cpu-used=15 keyframe-max-dist=10000 ! rtpvp8pay ! rtpvp8.send_rtp_sink_0 rtpvp8.send_rtp_src_0 ! udpsink async=false port=5002 host=::FFFF:7F00:0001 name=vrtpsink rtpvp8.send_rtcp_src_0 ! udpsink port=5003 host=::FFFF:7F00:0001 sync=false async=false name=vrtcpsink udpsrc name=receiver_rtcp_in port=5001 caps = "application/x-rtcp" ! rtpvp8.recv_rtcp_sink_0 udpsrc name=vp8recv port=5000 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96" ! rtpvp8.recv_rtp_sink_0 rtpvp8. ! rtpvp8depay ! vp8dec ! videoconvert ! osxvideosink

The pipeline you run first will get a single static image of the ball.

Running this pipeline with

export GST_DEBUG=*:2,vp8dec:2,libav:4,avdec_h264:4,rtpvp8depay:6,rtpbasedepayload:6,GST_EVENT:4

shows relevant information. Excerpt from the output:

...
0:00:00.505060000 98869 0x7f9afa037190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:191:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> Set caps
0:00:00.505071000 98869 0x7f9afa037190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:201:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> NPT start 0
0:00:00.505080000 98869 0x7f9afa037190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:209:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> NPT stop 18446744073709551615
0:00:00.505097000 98869 0x7f9afa037190 INFO GST_EVENT gstevent.c:628:gst_event_new_caps: creating caps event 0x7f9af9911990
/GstPipeline:pipeline0/GstRtpVP8Depay:rtpvp8depay0.GstPad:src: caps = video/x-vp8, framerate=(fraction)0/1
/GstPipeline:pipeline0/GstVP8Dec:vp8dec0.GstPad:sink: caps = video/x-vp8, framerate=(fraction)0/1
/GstPipeline:pipeline0/GstRtpVP8Depay:rtpvp8depay0.GstPad:sink: caps = application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video, ssrc=(uint)1283409451
/GstPipeline:pipeline0/GstRtpBin:rtpvp8.GstGhostPad:recv_rtp_src_0_1283409451_96.GstProxyPad:proxypad5: caps = application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video, ssrc=(uint)1283409451
0:00:00.505290000 98869 0x7f9afa037190 LOG rtpbasedepayload gstrtpbasedepayload.c:284:gst_rtp_base_depayload_chain:<rtpvp8depay0> discont 1, seqnum 1322, rtptime 2231324518, pts 0:00:00.050053000, dts 0:00:00.050053000
0:00:00.505304000 98869 0x7f9afa037190 WARN rtpvp8depay gstrtpvp8depay.c:118:gst_rtp_vp8_depay_process:<rtpvp8depay0> Discontinuity, flushing adapter
0:00:00.505337000 98869 0x7f9afa037190 LOG rtpvp8depay gstrtpvp8depay.c:176:gst_rtp_vp8_depay_process:<rtpvp8depay0> Pushing buffer end of frame - seq 1322
0:00:00.505349000 98869 0x7f9afa037190 LOG rtpbasedepayload gstrtpbasedepayload.c:520:set_headers:<rtpvp8depay0> Marking DISCONT on output buffer
0:00:00.505357000 98869 0x7f9afa037190 INFO GST_EVENT gstevent.c:709:gst_event_new_segment: creating segment event 0x110fd6ee8
0:00:00.505374000 98869 0x7f9afa037190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:559:gst_rtp_base_depayload_prepare_push:<rtpvp8depay0> Pushed newsegment event on this first buffer
0:00:00.505400000 98869 0x7f9afa037190 WARN vp8dec gstvp8dec.c:418:open_codec:<vp8dec0> VPX preprocessing error: unsupported bitstream
0:00:00.505433000 98869 0x7f9afa037190 ERROR videodecoder gstvideodecoder.c:2257:gst_video_decoder_prepare_finish_frame:<vp8dec0> No buffer to output !
0:00:00.505456000 98869 0x7f9afa037190 LOG rtpbasedepayload gstrtpbasedepayload.c:284:gst_rtp_base_depayload_chain:<rtpvp8depay0> discont 0, seqnum 1323, rtptime 2231330517, pts 0:00:00.116708511, dts 0:00:00.116708511
0:00:00.505470000 98869 0x7f9afa037190 LOG rtpvp8depay gstrtpvp8depay.c:176:gst_rtp_vp8_depay_process:<rtpvp8depay0> Pushing buffer end of frame - seq 1323
0:00:00.505483000 98869 0x7f9afa037190 WARN vp8dec gstvp8dec.c:418:open_codec:<vp8dec0> VPX preprocessing error: unsupported bitstream
0:00:00.505492000 98869 0x7f9afa037190 ERROR videodecoder gstvideodecoder.c:2257:gst_video_decoder_prepare_finish_frame:<vp8dec0> No buffer to output !
...

The errors repeat until a keyframe is received.
Please note that since the clients are launched one after the other, the second is expected not to play video until the first keyframe. On Mac the osxvideosink stays as a green screen.

I've done similar pipelines with H264 that show it working as expected:

gst-launch-1.0 -v rtpbin name=rtph264 videotestsrc pattern=ball ! textoverlay text="H264" ! 'video/x-raw,width=640,height=480,framerate=15/1' ! x264enc tune=zerolatency speed-preset=ultrafast ! rtph264pay ! rtph264.send_rtp_sink_0 rtph264.send_rtp_src_0 ! udpsink async=false port=5102 host=::FFFF:7F00:0001 ts-offset=0 name=hrtpsink rtph264.send_rtcp_src_0 ! udpsink port=5103 host=::FFFF:7F00:0001 sync=false async=false name=hrtcpsink udpsrc name=receiver_rtcp_in port=5101 caps = "application/x-rtcp" ! rtph264.recv_rtcp_sink_0 udpsrc name=h264recv port=5100 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264.recv_rtp_sink_0 rtph264. ! rtph264depay ! avdec_h264 ! videoconvert ! osxvideosink

gst-launch-1.0 -v rtpbin name=rtph264 videotestsrc pattern=ball ! textoverlay text="H264" ! 'video/x-raw,width=640,height=480,framerate=15/1' ! x264enc tune=zerolatency speed-preset=ultrafast ! rtph264pay ! rtph264.send_rtp_sink_0 rtph264.send_rtp_src_0 ! udpsink async=false port=5100 host=::FFFF:7F00:0001 ts-offset=0 name=hrtpsink rtph264.send_rtcp_src_0 ! udpsink port=5101 host=::FFFF:7F00:0001 sync=false async=false name=hrtcpsink udpsrc name=receiver_rtcp_in port=5103 caps = "application/x-rtcp" ! rtph264.recv_rtcp_sink_0 udpsrc name=h264recv port=5102 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264.recv_rtp_sink_0 rtph264. ! rtph264depay ! avdec_h264 ! videoconvert ! osxvideosink
Maybe GstVideoDecoder should ignore setcaps if the actual caps haven't changed?
I decided to see if I could simplify the means to reproduce. It turns out you only need RTP in one direction.

Run this first:

gst-launch-1.0 -v rtpbin name=rtpvp8 udpsrc name=vp8recv port=5002 caps="application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video" ! rtpvp8.recv_rtp_sink_0 rtpvp8. ! rtpvp8depay ! vp8dec ! videoconvert ! osxvideosink

Run this second:

gst-launch-1.0 -v rtpbin name=rtpvp8 videotestsrc pattern=ball ! textoverlay text="VP8" ! 'video/x-raw,width=320,height=240,framerate=15/1' ! vp8enc cpu-used=15 keyframe-max-dist=10000 ! rtpvp8pay ! rtpvp8.send_rtp_sink_0 rtpvp8.send_rtp_src_0 ! udpsink async=false port=5002 host=::FFFF:7F00:0001 name=vrtpsink

You should see a very short section of video before it pauses (sometimes just a static image).

Interestingly, if the receiver cuts out rtpbin then it works OK. Instead of the first pipeline run:

gst-launch-1.0 -v udpsrc name=vp8recv port=5002 caps="application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video" ! rtpvp8depay ! vp8dec ! videoconvert ! osxvideosink

Unfortunately this is not a solution for me as I need rtpbin for RTCP etc.

The difference in logging appears to be related to the setting of caps twice:

...
0:00:03.199149000 8345 0x7fe44310a190 INFO GST_EVENT gstevent.c:1313:gst_event_new_reconfigure: creating reconfigure event
/GstPipeline:pipeline0/GstRtpBin:rtpvp8.GstGhostPad:recv_rtp_src_0_1996705949_96: caps = application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video, ssrc=(uint)1996705949
0:00:03.199235000 8345 0x7fe44310a190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:191:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> Set caps
0:00:03.199243000 8345 0x7fe44310a190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:201:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> NPT start 0
...
0:00:03.265670000 8345 0x7fe44310a190 INFO GST_EVENT gstevent.c:628:gst_event_new_caps: creating caps event 0x7fe448011540
/GstPipeline:pipeline0/GstRtpBin:rtpvp8/GstRtpSession:rtpsession0.GstPad:recv_rtp_sink: caps = application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video
/GstPipeline:pipeline0/GstRtpBin:rtpvp8/GstRtpSession:rtpsession0.GstPad:recv_rtp_sink: caps = application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video
0:00:03.265779000 8345 0x7fe44310a190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:191:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> Set caps
/GstPipeline:pipeline0/GstRtpBin:rtpvp8/GstRtpSession:rtpsession0.GstPad:recv_rtp_sink: caps = application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video
0:00:03.265791000 8345 0x7fe44310a190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:201:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> NPT start 0
0:00:03.265804000 8345 0x7fe44310a190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:209:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> NPT stop 18446744073709551615
/GstPipeline:pipeline0/GstRtpBin:rtpvp8.GstGhostPad:recv_rtp_sink_0: caps = application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video
...

In the good one, rtpvp8depay only gets "Set caps" once:

...
0:00:00.237910000 8221 0x7fb5eb0efa30 INFO GST_EVENT gstevent.c:628:gst_event_new_caps: creating caps event 0x7fb5ec82b0f0
/GstPipeline:pipeline0/GstUDPSrc:vp8recv.GstPad:src: caps = application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video
0:00:00.237999000 8221 0x7fb5eb0efa30 DEBUG rtpbasedepayload gstrtpbasedepayload.c:191:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> Set caps
0:00:00.238007000 8221 0x7fb5eb0efa30 DEBUG rtpbasedepayload gstrtpbasedepayload.c:201:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> NPT start 0
...

I will attach full debug logs.
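For anyone comparing the failing and working logs, counting the "Set caps" messages per element makes the difference easy to spot. The following Python sketch assumes the GST_DEBUG line layout seen in the excerpts above (timestamp, pid, thread, level, category, file:line:function:<element>, message). The function name count_set_caps and the regular expression are illustrative and not part of GStreamer.

```python
import re
from collections import Counter

# Layout assumed from the excerpts in this report:
# <timestamp> <pid> <thread> <LEVEL> <category> <file>:<line>:<function>:<element> <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\d+:\d{2}:\d{2}\.\d+)\s+(?P<pid>\d+)\s+(?P<thread>0x[0-9a-fA-F]+)\s+"
    r"(?P<level>[A-Z]+)\s+(?P<category>\S+)\s+"
    r"(?P<file>[^:\s]+):(?P<line>\d+):(?P<func>\w+):"
    r"(?:<(?P<element>[^>]+)>)?\s*(?P<msg>.*)$"
)


def count_set_caps(lines):
    """Count 'Set caps' debug messages per element name."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        # Lines printed by -v (starting with /GstPipeline:...) do not match and are skipped.
        if match and match.group("element") and match.group("msg") == "Set caps":
            counts[match.group("element")] += 1
    return counts


# Shortened copies of the two excerpts quoted in this report.
FAILING = """\
0:00:03.199235000 8345 0x7fe44310a190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:191:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> Set caps
0:00:03.199243000 8345 0x7fe44310a190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:201:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> NPT start 0
0:00:03.265670000 8345 0x7fe44310a190 INFO GST_EVENT gstevent.c:628:gst_event_new_caps: creating caps event 0x7fe448011540
0:00:03.265779000 8345 0x7fe44310a190 DEBUG rtpbasedepayload gstrtpbasedepayload.c:191:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> Set caps
"""

GOOD = """\
0:00:00.237910000 8221 0x7fb5eb0efa30 INFO GST_EVENT gstevent.c:628:gst_event_new_caps: creating caps event 0x7fb5ec82b0f0
/GstPipeline:pipeline0/GstUDPSrc:vp8recv.GstPad:src: caps = application/x-rtp, clock-rate=(int)90000
0:00:00.237999000 8221 0x7fb5eb0efa30 DEBUG rtpbasedepayload gstrtpbasedepayload.c:191:gst_rtp_base_depayload_setcaps:<rtpvp8depay0> Set caps
"""

failing_counts = count_set_caps(FAILING.splitlines())
good_counts = count_set_caps(GOOD.splitlines())
print("failing:", dict(failing_counts))
print("good:", dict(good_counts))
```

To run it against a full log, pass an open file instead of the sample strings, for example count_set_caps(open("gst-launch-output.txt")).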
Created attachment 241120 [details]
Output from failing gst-launch-1.0 command

Attached output from:

gst-launch-1.0 -v rtpbin name=rtpvp8 udpsrc name=vp8recv port=5002 caps="application/x-rtp, clock-rate=(int)90000, encoding-name=(string)VP8-DRAFT-IETF-01, payload=(int)96, media=(string)video" ! rtpvp8.recv_rtp_sink_0 rtpvp8. ! rtpvp8depay ! vp8dec ! videoconvert ! osxvideosink 2>&1 | tee gst-launch-output.txt

With GST_DEBUG=*:2,vp8dec:2,libav:4,vp8dec:4,rtpvp8depay:6,rtpbasedepayload:6,GST_EVENT:4
Simpler version of the H264 pipelines that work:

gst-launch-1.0 -v rtpbin name=rtph264 udpsrc name=h264recv port=5100 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264.recv_rtp_sink_0 rtph264. ! rtph264depay ! avdec_h264 ! videoconvert ! osxvideosink

gst-launch-1.0 -v rtpbin name=rtph264 videotestsrc pattern=ball ! textoverlay text="H264" ! 'video/x-raw,width=640,height=480,framerate=15/1' ! x264enc tune=zerolatency speed-preset=ultrafast ! rtph264pay ! rtph264.send_rtp_sink_0 rtph264.send_rtp_src_0 ! udpsink async=false port=5100 host=::FFFF:7F00:0001 ts-offset=0 name=hrtpsink
I think what is happening is that caps are set correctly on startup of the pipeline, and again when rtpbin gets its recv_rtp_src_0_XXXX_XXX pad added. This gets to the point where rtpvp8depay logs "Set caps" and marks discont on the stream. Since this is before the first packets, it is harmless: nothing is thrown away because nothing has gone through yet. A little later another caps event passes, which clears the payload type map and hence causes caps events on rtpvp8depay and vp8dec, killing both.

For H.264 this works, I believe, partly through some sensible code and partly through a mechanism I don't understand (or perhaps sheer luck). gstavviddec.c:415 checks whether caps events actually change the caps, so the video codec doesn't restart; this is sensible. It seems that the rtph264depay element receives the caps event just after it has pushed the buffer it was working on, so the discontinuity doesn't lead it to clear part of the packet.

Possible fixes:

1. Stop the second caps event - not sure how.
2. Stop the RTP depayloaders and video codecs from resetting when the caps don't actually change.
3. Stop gstrtpptdemux.c from sending downstream caps events or clearing its payload type map when it gets upstream caps. Presumably if the application actually wants to change the payload type to caps mapping it will use the clear-pt-map signal anyway.

I propose that 2 and 3 could be done regardless, at least from the point of view that restarting codecs etc. is a bad thing and should be avoided if possible.

I really have no clue why the second caps event happens - perhaps someone could tell me (I'd be glad to learn).
After some discussions on IRC I have a plan: check whether the caps have changed, and leave things alone if they are the same.

I also think I've worked out what is going wrong. The caps change as information about the actual video being passed through is worked out, which causes a caps event to pass upstream, clearing the rtpptdemux payload type map. The next packet to go through then causes rtpptdemux to get the caps and send them back downstream. They are the same caps, but this breaks the video codec.

By checking whether the caps have changed, and not destroying the decoder's state or the RTP depayloader's state when they are the same, I can ensure the stream starts reliably.
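The effect of that check on the decoder side can be illustrated with a toy model. This is a minimal Python sketch, not GStreamer code: the class ToyDecoder, its ignore_unchanged_caps flag and the run helper are hypothetical, and the assumption that a reset decoder produces nothing until the next keyframe is taken from the behaviour described in this report.

```python
class ToyDecoder:
    """Toy stand-in for a video decoder that resets its codec on set_caps."""

    def __init__(self, ignore_unchanged_caps):
        self.ignore_unchanged_caps = ignore_unchanged_caps
        self.caps = None
        self.have_keyframe = False
        self.resets = 0

    def set_caps(self, caps):
        # The proposed guard: identical caps leave the codec state untouched.
        if self.ignore_unchanged_caps and caps == self.caps:
            return
        self.caps = caps
        self.have_keyframe = False  # codec destroyed, must wait for a keyframe
        self.resets += 1

    def decode(self, frame_type):
        """Return True if the frame could be decoded."""
        if frame_type == "key":
            self.have_keyframe = True
        return self.have_keyframe


def run(decoder):
    # Same caps arrive twice, the second time after the first (key) frame.
    events = [
        ("caps", "video/x-vp8"),
        ("frame", "key"),
        ("caps", "video/x-vp8"),
        ("frame", "delta"),
        ("frame", "delta"),
    ]
    results = []
    for kind, value in events:
        if kind == "caps":
            decoder.set_caps(value)
        else:
            results.append(decoder.decode(value))
    return results


naive_results = run(ToyDecoder(ignore_unchanged_caps=False))
guarded_results = run(ToyDecoder(ignore_unchanged_caps=True))
print("without check:", naive_results)
print("with check:   ", guarded_results)
```

Without the check only the first frame decodes, which matches the single static image seen in the failing pipeline. With the check the redundant caps event is a no-op and the delta frames keep decoding.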
Created attachment 241207 [details] [review]
Patch for video decoder base class

This patch uses the existing state to check if caps have changed.
Created attachment 241208 [details] [review]
Patch for rtp depayloader base class

This patch adds state into the base class to hold a reference to the last successfully negotiated caps. If the caps have not changed then nothing more needs to be done.
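The idea of remembering the last negotiated caps can be sketched as a toy model of the depayloader side. This is Python for illustration only and does not reproduce the attached patch: ToyDepayloader, remember_last_caps, push_packet and depayload are hypothetical names, and the flush-on-discontinuity behaviour is modelled on the "Discontinuity, flushing adapter" log line earlier in this report.

```python
class ToyDepayloader:
    """Toy stand-in for an RTP depayloader assembling frames from packets."""

    def __init__(self, remember_last_caps):
        self.remember_last_caps = remember_last_caps
        self.last_caps = None   # last successfully negotiated caps
        self.discont = False
        self.adapter = []       # fragments of the frame being assembled
        self.frames = []        # completed frames pushed downstream

    def set_caps(self, caps):
        # With the extra state, identical caps need no further work.
        if self.remember_last_caps and caps == self.last_caps:
            return
        self.last_caps = caps
        self.discont = True     # next packet is treated as a discontinuity

    def push_packet(self, payload, marker):
        if self.discont:
            self.adapter.clear()  # discontinuity: drop any partial frame
            self.discont = False
        self.adapter.append(payload)
        if marker:                # marker bit ends the frame
            self.frames.append("".join(self.adapter))
            self.adapter.clear()


def depayload(remember_last_caps):
    caps = "application/x-rtp, encoding-name=VP8-DRAFT-IETF-01"
    depay = ToyDepayloader(remember_last_caps)
    depay.set_caps(caps)
    depay.push_packet("A1", marker=False)  # first half of frame A
    depay.set_caps(caps)                   # redundant caps event mid-frame
    depay.push_packet("A2", marker=True)   # second half of frame A
    return depay.frames


frames_without_state = depayload(remember_last_caps=False)
frames_with_state = depayload(remember_last_caps=True)
print("without cached caps:", frames_without_state)
print("with cached caps:   ", frames_with_state)
```

In the first run the redundant caps event discards the first half of the frame, so a truncated frame reaches the decoder. In the second run the frame is assembled whole.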
Having spent an hour looking into the gstrtpptdemux I'm not 100% that fixing it is required (and TBH right now I don't have the time). I was also thinking that where do I stop as perhaps I should patch other elements - at which point it might be worth considering doing this in GstPad once and for all.

The patches above fix the issue reported and also seem to have improved the stability of the video once established. (I'm doing RTP, so there is a certain amount of loss which can cause codec issues; I think a better start-up helps the codec to cope.)
Comment on attachment 241207 [details] [review]
Patch for video decoder base class

commit 0b83d13231ed25d624d75a2f4df6d43f0f64de7c
Author: Sebastian Dröge <sebastian.droege@collabora.co.uk>
Date:   Mon Apr 15 09:42:22 2013 +0200

    videoencoder: Ignore caps events if the caps did not change

commit 3023521366ca6aa0b9bb9daa3939f344d08cbf37
Author: Tom Greenwood <tcdgreenwood@hotmail.com>
Date:   Wed Apr 10 19:07:00 2013 +0100

    videodecoder: Ignore caps events if the caps did not change

    https://bugzilla.gnome.org/show_bug.cgi?id=697672
Comment on attachment 241208 [details] [review]
Patch for rtp depayloader base class

commit 789ddf42a90b2185bc0cc6cf5e0f1319ce72d296
Author: Tom Greenwood <tcdgreenwood@hotmail.com>
Date:   Wed Apr 10 20:45:37 2013 +0100

    rtpbasedepayload: Ignore caps events if the caps did not change

    https://bugzilla.gnome.org/show_bug.cgi?id=697672
(In reply to comment #10)
> Having spent an hour looking into the gstrtpptdemux I'm not 100% that fixing it
> is required (and TBH right now I don't have the time). I was also thinking
> that where do I stop as perhaps I should patch other elements - at which point
> it might be worth considering doing this in GstPad once and for all.

Doing that in GstPad would IMHO be wrong as it's element specific behaviour and not valid for all pads, e.g. when other information is piggy backed with the caps event.

So everything is working now after the patches I pushed?
I haven't re-tested on master, but it should be all working with those patches. It's just a question of whether more changes would be a good thing. If you think that not changing other elements is the right way forward then I'm happy to bow to your greater knowledge - I was thinking of changing GstPad because it had been suggested on IRC as something that shouldn't break anything, but seemed dangerous. Thanks for your help.
Ok, let's just close this then until new problems show up :)