GNOME Bugzilla – Bug 702206
dashdemux: media range requests not supported
Last modified: 2013-08-22 09:20:34 UTC
MPEG-DASH MPDs are allowed to specify media segments using a "mediaRange" attribute that gives a byte range in a larger file. Currently, dashdemux is sending a URL to uridownloader that looks like this: https://www.mydomain.com/dash/file.mov?range=345-678 uridownloader searches for an element that know how to handle the given URI (in this case, souphttpsrc). souphttpsrc does not know how to download a subset of a file using this URI format. Internally, it supports the use of the HTTP "Range" header, but it is only for resuming a paused stream from the previous location. souphttpsrc does support the ability to provide "other" HTTP headers, but it is possible that some small changes would be needed to make sure that its internal use of "Range" headers would not conflict with those set by the user. Even with that, how does our range request get to souphttpsrc if uridownloader only takes a simple URI? uridownloader would not know how to pass a byte range to whatever element was constructed to service the URI request. Not sure what the correct solution is at this point.
I think the range request should be done using a seek event in bytes format, with the start and stop positions.
Created attachment 247278 [details] [review] souphttpsrc: allow seeks in ready On is_seekable, check if the server's headers have already been received. If not, do a HEAD request to get them before responding to basesrc.
Created attachment 247279 [details] [review] souphttpsrc: use seek segment stop as byte range end for requests To download only a part of a file, souphttpsrc should use the seek requests start and stop as the range in request headers. Stop was being ignored.
Those 2 patches should make souphttpsrc capable of handling seeks properly for this use case. Next we need to make uridownloader have a range API and then make dashdemux use it.
Created attachment 247291 [details] [review] uridownloader: add support for range based downloads Adds a new API gst_uri_downloader_fetch_uri_with_range that allows downloading only a byte range from an URI. It uses a seek event sent to the source to signal the range to be downloaded.
Any stream available for testing?
(In reply to comment #6) > Any stream available for testing? http://www-itec.uni-klu.ac.at/ftp/datasets/mmsys12/BigBuckBunny/bunny_4s/BigBuckBunnyNonSeg_4s_isoffmain_DIS_23009_1_v_2_1c2_2011_08_30.mpd
Created attachment 247297 [details] [review] souphttpsrc: use seek segment stop as byte range end for requests New patch version, fixing a few issues. I still think get_size needs to be updated to return the minimum between content-length or the requested range
Created attachment 247299 [details] [review] dashdemux: add support for range based segments Use the mediaRange information and pass it to the uridownloader to correctly download only the segment ranges indicated in the MPD
> New patch version, fixing a few issues. I still think get_size needs > to be updated to return the minimum between content-length or the > requested range Actually, just re-read the gstbasesrc documentation and it wants the full size of the content, so no need for more patches.
What about https://bugzilla.gnome.org/show_bug.cgi?id=699926 ?
*** Bug 699926 has been marked as a duplicate of this bug. ***
Created attachment 247320 [details] [review] souphttpsrc: add better support for range requests Also consider the stop positions in seeks
Created attachment 247785 [details] [review] souphttpsrc: allow seeks in ready New patch that cares more about concurrency and only starts the HEAD request if there is no other request pending.
I was inspecting the fragments manually using: curl -r rangestart-rangeend url > file And the resulting files seems to not be properly aligned. At the beginning you can easily find files that won't start with a atom header (4 bytes indicating size and then 4 bytes with the fourcc). I'm starting to suspect this sample is broken.
(In reply to comment #15) > And the resulting files seems to not be properly aligned. At the beginning you > can easily find files that won't start with a atom header (4 bytes indicating > size and then 4 bytes with the fourcc). > > I'm starting to suspect this sample is broken. I haven't looked at the range delineations in detail yet, but I just wanted to remind you of the Initialization segment. Each range is designed to either: a) Be streamed directly after the previous range in the same Representation or b) Be streamed after the Initialization segment upon a Representation switch (or at the start of the Period). You may still be correct, this is just an FYI. I'll take a closer look at the ranges later today.
(In reply to comment #16) > You may still be correct, this is just an FYI. I'll take a closer look at the > ranges later today. I'm starting to agree that the content looks invalid. It looks like all of the byte ranges are off by 1. For example, in http://www-itec.uni-klu.ac.at/ftp/datasets/mmsys12/BigBuckBunny/bunny_4s/bunny_4s_50kbit/bunny_50kbit_dashNonSeg.mp4 (MPD is http://www-itec.uni-klu.ac.at/ftp/datasets/mmsys12/BigBuckBunny/MPDs/BigBuckBunnyNonSeg_4s_isoffmain_DIS_23009_1_v_2_1c2_2011_08_30.mpd): the first segment (for the 50kb representation) starts at offset 863. I believe is should start at 862 -- at that point in the stream there is a four byte size header (0x00, 0x00, 0x00, 0x18), followed by the STYP fourcc. I will try to contact the folks that host this content to see what they have to say.
(In reply to comment #17) > It looks like all of the byte ranges are off by 1. Actually, it looks like the number of bytes we are off increments with every segment. The second segment is off by 2 bytes. I have sent an email off to the folks at ITEC.
Review of attachment 247785 [details] [review]: ::: ext/soup/gstsouphttpsrc.c @@ +533,3 @@ + * async handlers might set src->msg to NULL, so we keep the pointer */ + msg = src->msg; + soup_session_send_message (src->session, src->msg); How is this unblocked when you receive a flush or going to NULL state ? I'm under the impression that simply doing g_main_loop_quit() may leak some data with this method, what do you think ? @@ +674,3 @@ + g_main_context_unref (src->context); + return FALSE; + } g_main_loop_new() never fails, but I think you just moved that code over.
Created attachment 248075 [details] [review] souphttpsrc: allow seeks in ready Improved patch after the review. Changed the implementation to use the same paths for both GET and HEAD requests
Review of attachment 247291 [details] [review]: ::: gst-libs/gst/uridownloader/gsturidownloader.c @@ +375,3 @@ gst_uri_downloader_fetch_uri (GstUriDownloader * downloader, const gchar * uri) { + return gst_uri_downloader_fetch_uri_with_range (downloader, uri, 0, 0); Why not -1 for the end?
Review of attachment 247320 [details] [review]: Looks good expect for the commit message, this does not improve anything other than adding support for the stop position of a seek, right? Please fix that before pushing :)
Review of attachment 248075 [details] [review]: Looks good but I think you have to signal the GCond in GstBaseSrc::unlock() too to prevent deadlocks.
Updated the patches and pushed the fixes, thanks for all the reviews. * In good: commit 66dbe3151a3426f6de271c2999ee908a8e89b65c Author: Andoni Morales Alastruey <ylatuya@gmail.com> Date: Thu Jun 20 09:41:48 2013 -0300 souphttpsrc: also consider stop positions in seeks Use seek stop position as range end for requests https://bugzilla.gnome.org/show_bug.cgi?id=702206 commit 5bc5f4a0f6d9b82f72b39926d8ad81600dac4604 Author: Thiago Santos <thiago.sousa.santos@collabora.com> Date: Wed Jun 19 14:06:40 2013 -0300 souphttpsrc: allow seeks in ready On is_seekable, check if the server's headers have already been received. If not, do a HEAD request to get them before responding to basesrc. https://bugzilla.gnome.org/show_bug.cgi?id=702206 * In bad: commit 4f17112392d392e0cb841ea63382cef29154f33a Author: Thiago Santos <thiago.sousa.santos@collabora.com> Date: Thu Jun 20 03:52:31 2013 -0300 dashdemux: add support for range based segments Use the mediaRange information and pass it to the uridownloader to correctly download only the segment ranges indicated in the MPD https://bugzilla.gnome.org/show_bug.cgi?id=702206 commit e76f3e95fd0846bccccab387755aace9c0050c2b Author: Thiago Santos <thiago.sousa.santos@collabora.com> Date: Wed Jun 19 18:28:15 2013 -0300 uridownloader: add support for range based downloads Adds a new API gst_uri_downloader_fetch_uri_with_range that allows downloading only a byte range from an URI. It uses a seek event sent to the source to signal the range to be downloaded. https://bugzilla.gnome.org/show_bug.cgi?id=702206
The souphttpsrc HEAD request change is about to be reverted, see bug #705371 .