Bug 734413 - mp4mux: interleave audio and video in fragments, and reduce interleave
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: git master
Hardware/OS: Other / All
Importance: Normal enhancement
Target Milestone: git master
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on:
Blocks:
Reported: 2014-08-07 10:46 UTC by Richard Mitic
Modified: 2018-11-03 14:53 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Richard Mitic 2014-08-07 10:46:57 UTC
The following pipeline will produce a multiplexed video/audio mp4 file with 1-second fragments.
 
gst-launch-1.0 mp4mux name=mux fragment-duration=1000 ! filesink location=out.mp4 \
    videotestsrc num-buffers=1500 ! "video/x-raw,framerate=25/1" ! x264enc tune=zerolatency ! mux. \
    audiotestsrc num-buffers=2812 ! "audio/x-raw,rate=48000" ! voaacenc ! mux.
 
Currently, each media stream is muxed as a separate movie fragment; i.e. for two streams, A and V, mp4mux produces A1 V1 A2 V2 A3 V3, etc. However, the lengths of the associated audio and video fragments differ, because only an integer number of audio or video frames can be packed into a fragment.
 
In this example, each video fragment contains 25 frames, equaling exactly 1 second, but each audio fragment contains 46 frames, equaling 0.9813333 seconds. mp4mux corrects this drift by emitting two audio fragments in a row whenever enough data is buffered. The result is that AN and VN will not necessarily contain media that starts at the same time, with the maximum error being the length of a fragment.
 
To fix this, the muxer could vary the number of audio frames in each fragment so that the fragment boundary is as close as possible in time to the video fragment boundary. Ideally, the audio and video media data would also be placed in the same ‘mdat’ box, with the associated 'moof' box containing one track fragment for each stream. Just aligning the movie fragment boundaries at the frame level is acceptable though.
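The boundary-alignment idea above can be sketched with some quick arithmetic. This is a minimal illustration, not mp4mux code; the function name and structure are hypothetical. With 48 kHz AAC (1024 samples per frame) and 1-second fragments, varying the frame count per fragment keeps each audio boundary within half a frame of the video boundary, instead of drifting by a whole fragment:

```python
# Hypothetical sketch (not mp4mux code): pick how many AAC frames go into
# each audio fragment so its boundary tracks the 1-second video boundary.
AAC_FRAME = 1024   # samples per AAC frame
RATE = 48000       # sample rate in Hz
FRAG = 1.0         # target fragment duration in seconds (matches the video)

def audio_frames_per_fragment(n_fragments):
    counts = []
    samples_written = 0
    for i in range(1, n_fragments + 1):
        target = i * FRAG * RATE  # ideal boundary position, in samples
        # choose the frame count that lands the running total nearest the boundary
        n = round((target - samples_written) / AAC_FRAME)
        counts.append(n)
        samples_written += n * AAC_FRAME
    return counts

print(audio_frames_per_fragment(4))  # fragments alternate between 46 and 47 frames
```

Compare this with the fixed 46-frame fragments described above, where the error accumulates until an extra whole fragment has to be inserted.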
Comment 1 Edward Hervey 2015-01-22 09:42:22 UTC
mp4mux should put audio and video into one fragment (instead of separate ones).

In addition to that, an even better solution would be to put several runs in a single fragment.

I.e. in a 1s fragment you could have 4 track runs (which also corresponds to how the data is laid out in the following mdat):
1 run of video of ~500ms
1 run of audio of ~500ms
1 run of video of ~500ms
1 run of audio of ~500ms

This would solve your issues and also reduce the interleave to 500ms (instead of the 2s you're seeing).
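The run layout Comment 1 proposes can be made concrete with a small sketch. This is illustrative only; the function name and tuple layout are hypothetical, assuming one moof/mdat pair per 1-second fragment holding two ~500ms runs per stream, stored in the same order they appear in the mdat:

```python
# Hypothetical sketch of the proposed layout: interleaved track runs
# inside a single movie fragment, listed in mdat order.
def fragment_layout(frag_s=1.0, run_s=0.5, streams=("video", "audio")):
    runs = []
    t = 0.0
    while t < frag_s:
        for s in streams:
            runs.append((s, t, min(t + run_s, frag_s)))
        t += run_s
    return runs

for stream, start, end in fragment_layout():
    print(f"{stream:5s} run: {start:.1f}s - {end:.1f}s")
```

Because each stream's data is never more than one run away from the other's, a reader only has to buffer one run's worth (here 500ms) to keep the streams in sync, rather than the full fragment-pair distance.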
Comment 2 GStreamer system administrator 2018-11-03 14:53:40 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new report through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/125.