GNOME Bugzilla – Bug 777984
isomp4: initial ftyp/moov streamheader missing
Last modified: 2018-11-03 15:16:15 UTC
i've tried to randomly enter a stream generated with mp4dashmux from https://bugzilla.gnome.org/show_bug.cgi?id=668091 and play it in the browser using MSE. unfortunately this failed, using a tcpserversink element and trying the various sync-methods. looking at the captured streams i discovered that the global headers are missing. the stream should start with:

0000 0000: 00 00 00 1C 66 74 79 70 69 73 6F 35 00 00 00 01  ....ftyp iso5....
0000 0010: 69 73 6F 35 69 73 6F 32 64 61 73 68 00 00 03 45  iso5iso2 dash...E
0000 0020: 6D 6F 6F 76 00 00 00 6C 6D 76 68 64 00 00 00 00  moov...l mvhd....

but the randomly entered streams only start with:

0000 0000: 00 00 00 14 73 74 79 70 6D 73 64 68 00 00 00 00  ....styp msdh....
0000 0010: 6D 73 64 68 00 00 00 60 6D 6F 6F 66 00 00 00 10  msdh...` moof....
0000 0020: 6D 66 68 64 00 00 00 00 00 00 00 6B 00 00 00 48  mfhd.... ...k...H

qtmux does implement some streamheader handling, but it seems to be missing something for my use case. i'll investigate.
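For reference, the difference between the two captures can be checked mechanically: every top-level ISO-BMFF box starts with a 4-byte big-endian size followed by a 4-byte type. A minimal sketch in plain Python (my own helper names, 32-bit box sizes only, no handling of the size==0/size==1 special cases):

```python
import struct

def parse_box_headers(data):
    """Walk top-level ISO-BMFF boxes and return (type, size) pairs.
    Sketch only: 32-bit sizes, no size==0 or size==1 handling."""
    boxes = []
    off = 0
    while off + 8 <= len(data):
        size, = struct.unpack(">I", data[off:off + 4])
        boxtype = data[off + 4:off + 8].decode("ascii")
        if size < 8:
            break
        boxes.append((boxtype, size))
        off += size
    return boxes

# First bytes of the expected stream: 00 00 00 1C 66 74 79 70 ... -> a 28-byte ftyp
good_start = bytes.fromhex("0000001c66747970") + b"\x00" * 20
print(parse_box_headers(good_start))  # [('ftyp', 28)]
```

Run against the two dumps above, this reports `ftyp` first for the good stream and `styp` first for the randomly entered one.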
with a GST_DEBUG=*:5 of my test pipeline

gst-launch-1.0 videotestsrc pattern=ball is-live=1 ! timeoverlay ! video/x-raw,framerate=30/1,width=1280,height=720 ! videoconvert ! x264enc speed-preset=superfast tune=zerolatency key-int-max=1 b-adapt=0 option-string=scenecut=0 ! video/x-h264,profile=main ! mp4dashmux streamable=true faststart=true fragment-duration=500 ! tcpserversink host=127.0.0.1 port=9001 sync-method=1

the only occurrences of "streamheader" in the debug output are:

0:00:04.225773980 27975 0x55b2d08d6f20 DEBUG multihandlesink gstmultihandlesink.c:1042:gst_multi_handle_sink_client_queue_buffer:<tcpserversink0> [socket 0x55b2d08f9bb0] no previous caps for this client, send streamheader
0:00:04.225785091 27975 0x55b2d08d6ca0 DEBUG basesink gstbasesink.c:2005:gst_base_sink_get_sync_times:<tcpserversink0> got times start: 99:99:99.999999999, stop: 99:99:99.999999999, do_sync 0
0:00:04.225799713 27975 0x55b2d08d6f20 DEBUG multihandlesink gstmultihandlesink.c:1103:gst_multi_handle_sink_client_queue_buffer:<tcpserversink0> [socket 0x55b2d08f9bb0] no new streamheader, so nothing to send

so apparently qtmux doesn't set them at all.
Created attachment 345297 [details] [review]
WIP isomp4: experiment on committing streamheaders correctly

this is a first-shot attempt at writing all necessary stream headers. now the ftyp atom will also be prepended for newly connected tcp clients, and the hexdump already looks a lot better this way. the resulting files are playable with vlc, but the browser doesn't start rendering yet (with MSE), so apparently something else might be missing. it may have to do with the codec data of the h264 stream, since that may only get written once in the first fragment (just a quick guess from looking at the hexdump)
I think that should be handled in this patch https://bugzilla.gnome.org/show_bug.cgi?id=706509
unfortunately not, i've already applied that patch (and it did actually fix playback from the very beginning of the stream, but not from random entry points)
You can't stream the output of mp4dashmux to a client through tcpserversink; that won't be DASH, just a fragmented MP4 stream. For DASH you need dashsink, which takes several streams and creates DASH content (the MPD and files/segments for each stream) that a web client can consume. So the correct way would be to use dashsink to produce the DASH content, write it to the filesystem, and use a regular HTTP server to serve the content exposed in the MPD to the client.
i want to achieve something similar to what dashsink does, but i don't want separate fragment files or the manifest, because it's not supposed to serve as VOD. instead i want to use this as a direct transport of a low latency live stream to one or a few web browser mse clients via websocket. what else is dashsink doing to allow randomly entering at any fragment? does it prepend an initialization fragment with the necessary codec data and such, or is every fragment playable by itself?
You can stream streamable formats like matroska, flv or mpeg-ts with tcpserversink because: 1) the format provides inband information about frame boundaries, 2) tcpserversink starts new connections by prepending the streamheaders, and 3) tcpserversink syncs on a keyframe for seeking. And that's all you need on the client side to start playback with these formats. With mp4 there is no inband information about frame boundaries; it's only available in the container headers, in the stream index. With fragmented mp4 the whole header is split into parts: the streams description at the beginning, then for each segment a moof atom with the stream index for that range. So for an MP4 segment to be self-contained you need the streamheaders containing the moov atom (streams info and codec data) to initialize both the demuxer and the decoder, and for each segment, the moof atom with the index for that particular segment. In your client implementation you will probably have to do the demuxing yourself, parsing the moof for each segment to split the input stream into frames to feed into the decoder using the MSE API.
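The split described here (streams description at the beginning, then moof-led media segments) can be sketched as a small splitter over a byte stream of top-level boxes. This is an illustrative sketch in Python, not GStreamer code; it assumes 32-bit box sizes and my own helper names:

```python
import struct

def box(btype, payload=b""):
    """Serialize one box: 32-bit big-endian size + 4-byte type + payload."""
    return struct.pack(">I4s", 8 + len(payload), btype) + payload

def split_fmp4(data):
    """Split a concatenation of top-level boxes into the initialization
    part (everything before the first moof, i.e. ftyp/styp + moov) and a
    list of media segments, each starting at a moof."""
    init = b""
    segments = []
    off = 0
    while off + 8 <= len(data):
        size, = struct.unpack(">I", data[off:off + 4])
        btype = data[off + 4:off + 8]
        raw = data[off:off + size]
        if btype == b"moof":
            segments.append(raw)          # a new segment starts here
        elif segments:
            segments[-1] += raw           # mdat etc. belongs to the last moof
        else:
            init += raw                   # ftyp/styp/moov before any moof
        off += size
    return init, segments

# A client joining mid-stream must get `init` first, then whole segments.
stream = box(b"ftyp", b"iso5") + box(b"moov", b"xx") + box(b"moof") + box(b"mdat", b"data")
init, segments = split_fmp4(stream)
```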
With fragmented MP4 you can make it work with tcpserversink if: a) only the fragment start is marked as a keyframe (start of moof), b) the streamheaders contain the general headers (moov), and c) tcpserversink always sends the streamheaders to a new client and then starts from the last (or next) keyframe (i.e. fragment, start of moof). AFAIU that's what Andreas is trying to do here. That's not too different from how DASH works; there you also have the moov separately (or at the very beginning of the first fragment) and then each segment is moof+mdat.
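The a/b/c behaviour above can be illustrated with a toy model of the sink's per-client handling (a hypothetical function, not the actual tcpserversink code):

```python
def client_stream(buffers, join_index, streamheaders):
    """Toy model of the proposed sync behaviour: a client joining at
    buffers[join_index] first gets the streamheaders (ftyp + moov),
    then data starting from the next keyframe, which with fragmented
    MP4 means the next fragment boundary (start of a moof)."""
    out = [streamheaders]
    started = False
    for is_keyframe, data in buffers[join_index:]:
        if not started and not is_keyframe:
            continue                      # wait for the next fragment start
        started = True
        out.append(data)
    return b"".join(out)

# Buffers as (is_keyframe, payload); keyframes mark fragment starts.
buffers = [(True, b"f0"), (False, b"d0"), (True, b"f1"), (False, b"d1")]
print(client_stream(buffers, 1, b"HDR"))  # b'HDRf1d1'
```

The client that joins mid-fragment never sees a partial fragment: it waits for the next moof, which is exactly why condition a) matters.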
right, that's what i want to do :) if tcpserversink knows the correct streamheaders to prepend for each client's new stream entry, no additional parsing/demuxing on the client side is necessary, because it won't look any different from a dash stream that starts from the beginning (and those already play correctly). so now i just need to get these streamheaders right, i guess
If I am correct, DASH implementations in the client like [1] implement the demuxing themselves; you can't feed the MSE API with raw streams, just with encoded frames. So fragmented MP4 playback as-is in a browser is probably not implemented, which is why I was suggesting demuxing and pushing frames to the MSE API. From what Andreas showed in the first message, tcpserversink is already doing the correct thing; his message was:

the randomly entered streams only start with:
0000 0000: 00 00 00 14 73 74 79 70 6D 73 64 68 00 00 00 00  ....styp msdh....

This means that for seeks, it's starting to push segments from the header. If you combine that with my patch that inserts the missing atoms in the muxer streamheaders, you should have everything you need to do playback on the client side.

[1] https://github.com/Dash-Industry-Forum/dash.js/wiki
Andoni, on the client side i'm actually using a really simple javascript that does nothing but request the payload from a websocket, then calls

addSourceBuffer('video/mp4; codecs="avc3.4d401e, mp4a.40.2"');

and then for each chunk of data received does sourceBuffer.appendBuffer without touching the payload at all. however, this requires the stream to begin with proper headers, like a pipeline of

gst-launch videotestsrc pattern=ball is-live=1 ! timeoverlay ! video/x-raw,framerate=30/1,width=1280,height=720 ! videoconvert ! x264enc speed-preset=superfast tune=zerolatency key-int-max=1 b-adapt=0 option-string=scenecut=0 ! video/x-h264,profile=main ! mp4dashmux streamable=true faststart=true fragment-duration=500 ! filesink

(on current master) generates. so the browser eats an mp4 stream and demuxes it itself, but it's very picky about the format. chrome seems to be a bit pickier than firefox
Understood now; from your first comment I thought that mp4dashmux was already syncing correctly on moof atoms. Like Sebastian already commented, you probably need to mark the beginning of fragments as key units, or add a new sync method in tcpserversink that syncs on GstForceKeyUnit, since these in-band events are forwarded downstream from the muxer. thiago ported most of the commits in that branch: https://cgit.freedesktop.org/~thiagoss/gst-plugins-good/log/?h=dashsink Looking at it I see there is another patch needed to insert tfdt in moof atoms; make sure you have it too. Most of the work might have already been merged upstream... who knows, all my work is more than 5 years old. at that time nobody cared, and now I don't care anymore :)
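For completeness, the tfdt atom mentioned here is a small FullBox carrying the baseMediaDecodeTime of a fragment (defined in ISO/IEC 14496-12). A sketch of building one (illustrative helper, not the qtmux code):

```python
import struct

def make_tfdt(base_media_decode_time, version=1):
    """Build a Track Fragment Decode Time ('tfdt') box: a FullBox whose
    payload is version (1 byte) + flags (3 zero bytes) followed by
    baseMediaDecodeTime (64-bit for version 1, 32-bit for version 0)."""
    if version == 1:
        payload = struct.pack(">B3xQ", 1, base_media_decode_time)
    else:
        payload = struct.pack(">B3xI", 0, base_media_decode_time)
    return struct.pack(">I4s", 8 + len(payload), b"tfdt") + payload

# A version-1 tfdt is always 20 bytes: 8 (header) + 4 (version/flags) + 8 (time)
print(make_tfdt(90000).hex())
```

Without this box a client entering mid-stream has no decode-time anchor for the fragment, which is why the patch matters for random entry.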
i've rebased the whole branch of thiago onto current git master, which includes e.g. that tfdt-into-moof patch and a few other things. i've observed that with my test pipeline of

gst-launch-1.0 videotestsrc pattern=ball is-live=1 ! timeoverlay ! video/x-raw,framerate=30/1,width=1280,height=720 ! videoconvert ! x264enc speed-preset=superfast tune=zerolatency key-int-max=1 b-adapt=0 option-string=scenecut=0 ! video/x-h264,profile=main ! mp4dashmux streamable=true faststart=true fragment-method=event ! fakesink silent=false -v

there are no buffers going into the sink. with fragment-method=time, dataflow is constant. shouldn't the keyframes issued by the encoder every second trigger new fragments?
You are using fragment-method=event without a downstream element properly issuing GstForceKeyUnit events. This fragmentation method is used when you have a sink like dashsink driving the pipeline, generating new fragments for all upstream elements at the same timestamp. In your case you will have to switch to time-based fragmentation, each second for example. You will also need your encoder to create keyframes at a regular interval (which is what the GstForceKeyUnit event would otherwise request); make sure it forces a keyframe at regular intervals, ideally matching your fragment duration.
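The interaction of time-based fragmentation with encoder keyframes can be modelled like this (illustrative sketch only, not the mp4dashmux implementation): a fragment can only open on a keyframe, so the effective fragment boundary is the first keyframe at or after the configured duration.

```python
def fragment_starts(frames, fragment_duration):
    """frames: list of (pts, is_keyframe). Returns the pts values at
    which a new fragment would be opened: the first keyframe seen at
    or after `fragment_duration` since the previous fragment start."""
    starts = []
    last_start = None
    for pts, is_keyframe in frames:
        if is_keyframe and (last_start is None or pts - last_start >= fragment_duration):
            starts.append(pts)
            last_start = pts
    return starts

# pts in ms; keyframes roughly every second, 1000 ms target fragments:
frames = [(0, True), (500, False), (1000, True), (1400, False), (1900, True), (3100, True)]
print(fragment_starts(frames, 1000))  # [0, 1000, 3100]
```

Note how the keyframe at 1900 is skipped (only 900 ms since the last start) and the fragment then runs long until 3100, which is why aligning the encoder's keyframe interval with the fragment duration matters.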
A good improvement to mp4dashmux would be to add a new fragment-method=keyunit that issues fragments for each input keyframe.
yes andoni, that sounds like a good idea. i've been wondering how to properly sync keyframe generation by the encoder and fragment splitting. but currently the point is that the streams issued by tcpserversink apparently aren't coherently self-contained up to the point where my mse browser could independently play them, because some headers are still missing.
Time-based fragment generation at the next possible keyframe seems more useful. Also this should probably just be merged somehow with splitmuxsink, which already handles all these complications.
(In reply to Sebastian Dröge (slomo) from comment #17) > Time-based fragment generation at the next possible keyframe seems more > useful. Also this should probably be just somehow be merged with > splitmuxsink, which already does all these complications. That's exactly what this commit in the same branch was doing :) https://cgit.freedesktop.org/~thiagoss/gst-plugins-good/commit/?h=dashsink&id=2daec4378f14eafaf8ed360f78719ccb3a5ae1dd
i've noticed, when checking the directly saved file with mediainfo:

General
Complete name            : smpte-min-max-keyint-30-fragment-duration-1000-direct.mp4
Format                   : iso5
Codec ID                 : iso5 (iso5/iso2/dash)
File size                : 160 KiB
Duration                 : 33 s 333 ms
Overall bit rate         : 39.4 kb/s
Encoded date             : UTC 2017-02-13 13:12:20
Tagged date              : UTC 2017-02-13 13:12:20

Video
ID                       : 1
Format                   : AVC
Format/Info              : Advanced Video Codec
Format profile           : Main@L3.1
Format settings, CABAC   : Yes
Format settings, ReFrames : 1 frame
Format settings, GOP     : M=1, N=30
Codec ID                 : avc1
Codec ID/Info            : Advanced Video Coding
Duration                 : 33 s 333 ms
Bit rate                 : 37.2 kb/s
Nominal bit rate         : 2 048 kb/s
Width                    : 1 280 pixels
Height                   : 720 pixels
Display aspect ratio     : 16:9
Frame rate mode          : Constant
Frame rate               : 30.000 FPS
Color space              : YUV
Chroma subsampling       : 4:2:0
Bit depth                : 8 bits
Scan type                : Progressive
Bits/(Pixel*Frame)       : 0.001
Stream size              : 151 KiB (94%)
Writing library          : x264 core 148 r2708 86b7198
Encoding settings        : cabac=1 / ref=1 / deblock=1:0:0 / analyse=0x1:0x1 / me=dia / subme=1 / psy=1 / psy_rd=1,00:0,00 / mixed_ref=0 / me_range=16 / chroma_me=1 / trellis=0 / 8x8dct=0 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=0 / threads=4 / lookahead_threads=4 / sliced_threads=1 / slices=4 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=0 / weightp=1 / keyint=30 / keyint_min=16 / scenecut=0 / intra_refresh=0 / rc_lookahead=0 / rc=cbr / mbtree=0 / bitrate=2048 / ratetol=1,0 / qcomp=0,60 / qpmin=0 / qpmax=69 / qpstep=4 / vbv_maxrate=2048 / vbv_bufsize=1228 / nal_hrd=none / filler=0 / ip_ratio=1,40 / aq=1:1,00
Language                 : English
Encoded date             : UTC 2017-02-13 13:12:20
Tagged date              : UTC 2017-02-13 13:12:20
Color range              : Limited
Color primaries          : BT.709
Transfer characteristics : BT.709
Matrix coefficients      : BT.709

against the stream from tcpserversink:

General
Complete name            : min-max-keyint-30-fragment-duration-1000-entry-0.mp4
Format                   : iso5
Codec ID                 : iso5 (iso5/iso2/dash)
File size                : 32.9 KiB
Duration                 : 7 s 0 ms
Overall bit rate         : 38.5 kb/s
Encoded date             : UTC 2017-02-13 13:13:31
Tagged date              : UTC 2017-02-13 13:13:31

Video
ID                       : 1
Format                   : AVC
Format/Info              : Advanced Video Codec
Format profile           : Main@L3.1
Format settings, CABAC   : Yes
Format settings, ReFrames : 1 frame
Format settings, GOP     : M=1, N=30
Codec ID                 : avc1
Codec ID/Info            : Advanced Video Coding
Duration                 : 7 s 0 ms
Bit rate                 : 35.5 kb/s
Width                    : 1 280 pixels
Height                   : 720 pixels
Display aspect ratio     : 16:9
Frame rate mode          : Constant
Frame rate               : 30.000 FPS
Color space              : YUV
Chroma subsampling       : 4:2:0
Bit depth                : 8 bits
Scan type                : Progressive
Bits/(Pixel*Frame)       : 0.001
Stream size              : 30.4 KiB (92%)
Language                 : English
Encoded date             : UTC 2017-02-13 13:13:31
Tagged date              : UTC 2017-02-13 13:13:31
Color range              : Limited
Color primaries          : BT.709
Transfer characteristics : BT.709
Matrix coefficients      : BT.709

that the Writing library and the Encoding settings are missing in the latter case. that's what i meant above when looking at the hexdump. apparently x264enc writes that only once, at the beginning of the stream. could it be that the browser wants to know about that, or some other initialization sequence from the h264 stream, before starting to render?
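The decoder initialization data at issue here is the avcC box (carrying SPS/PPS), nested inside moov/trak/mdia/minf/stbl/stsd/avc1, which a fresh MSE SourceBuffer needs before it can decode anything. A recursive box search can confirm whether a capture contains it. This is a sketch under assumptions from ISO/IEC 14496-12 (32-bit box sizes only; the 8-byte stsd header and 78-byte VisualSampleEntry skips are taken from the spec, not from the capture):

```python
import struct

# Container boxes whose payload is itself a sequence of boxes.
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}

def find_box(data, target):
    """Depth-first search for a box type in an ISO-BMFF byte string.
    stsd and avc1 carry fixed fields before their child boxes, which
    are skipped explicitly (8 and 78 bytes respectively)."""
    off = 0
    while off + 8 <= len(data):
        size, = struct.unpack(">I", data[off:off + 4])
        btype = data[off + 4:off + 8]
        if size < 8:
            break
        if btype == target:
            return data[off:off + size]
        body = data[off + 8:off + size]
        if btype in CONTAINERS:
            found = find_box(body, target)
        elif btype == b"stsd":
            found = find_box(body[8:], target)   # skip version/flags + entry_count
        elif btype == b"avc1":
            found = find_box(body[78:], target)  # skip VisualSampleEntry fields
        else:
            found = None
        if found is not None:
            return found
        off += size
    return None
```

If `find_box(capture, b"avcC")` returns None for a randomly entered capture but not for the directly saved file, that would confirm the decoder configuration only exists in the moov streamheaders and must be prepended for new clients.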
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/345.