GNOME Bugzilla – Bug 748316
hlsdemux: The end offset (range_end) of a segment is not calculated properly in the m3u8 parser
Last modified: 2015-10-14 14:00:29 UTC
I have an HLS stream which is byte-range segmented (stream.ts). The number of segments is 361. When I run the following command gst-launch-1.0 filesrc location=index.m3u8 ! hlsdemux ! filesink location=out_stream.ts the stream out_stream.ts is not the same with the stream.ts. I would expect to be the same. Checking the size, the out_stream.ts is 361 bytes less than the original stream stream.ts. Further analysis between the out_stream.ts and stream.ts using a binary diff tool (vbindiff) reveals that the size of the segments are -1 byte. Next following a snapshot of the binary diff tool when a new segment is changing. The new segment starts at the byte with value 47 at the second line. You can see that in the out_stream.ts it is shifted to the left by 1 byte stream.ts 0021 8EA0: FF FF FF FF FF FF FF FF C6 7A 43 BA 69 9E B0 A3 ........ .zC.i... 0021 8EB0: CE BC 9E 75 51 18 35 10 47 40 00 11 00 00 B0 0D ...uQ.5. G@...... 0021 8EC0: 41 57 C1 00 00 41 57 E0 20 9D E6 30 91 FF FF FF AW...AW. ..0.... out_stream.ts 0021 8EA0: FF FF FF FF FF FF FF FF C6 7A 43 BA 69 9E B0 A3 ........ .zC.i... 0021 8EB0: CE BC 9E 75 51 18 35 47 40 00 11 00 00 B0 0D 41 ...uQ.5G @......A 0021 8EC0: 57 C1 00 00 41 57 E0 20 9D E6 30 91 FF FF FF FF W...AW. ..0..... So, looking closely at the parsing of the m3u8 file in the hlsdemux, I came across the following snippet in the gst_m3u8_client_get_next_fragment function in the m3u8.c file if (range_end) *range_end = file->size != -1 ? file->offset + file->size - 1 : -1; This snippet calculates the end offset of the segment. If the size of the segment has a value other than -1, then the end offset will be start_offset+size_segment-1. I don't believe that this is correct, it should be start_offset+size_segment. Removing the -1 I rerun the gst-launch-1.0 command and I got an identical file with the original one. Removing the -1 is the proper solution for this issue?
I think your HLS stream or the server is broken. Consider the following case: segment 1: offset 0, size 100 segment 2: offset 100, size 100 We would get ranges like [0, 99] and [100, 199] here then. Which for an HTTP server means that it would have to give us 100 bytes each (the range is inclusive). Or maybe I'm missing something here, not sure :)
I agree with Sebastian here. The byte ranges in HTTP requests are inclusive and hlsdemux uses the same approach.
Created attachment 302175 [details] Debug output of gst-launch-1.0 The debug_out.log is the debug output of the command GST_DEBUG="hlsdemux:5,fragmented:7,filesrc:6,basesrc:6" ./bin/gst-launch-1.0 filesrc location=./stream/master.m3u8 ! hlsdemux ! filesink location=out_stream.ts The first segment has the following byterange values #EXT-X-BYTERANGE:2199224@0 I didn't run it for all the file, but this includes the first segment. The interesting part starts at the 0:00:00.026539515, where the basesrc receives the seek event from the hlsdemux. The start of the segment is 0 and the stop is 2199223. Next at the 0:00:00.026958572, the gst_base_src_update_length is called where a maxsize is calculated to 2199223 which is equals to the stop segment. I believe this value should be 2199224, which is the size of the segment as seen in the m3u8 file. If you count all the occurences of gst_base_src_update_length for the first segment and add the buffer size, you will see that's is -1 byte. cat debug_out.log | grep gst_base_src_update_length | grep "segment.stop 2199223" | wc -l 537 It's 536 buffers with size 4096 bytes and one with 3767 bytes (see 0:00:01.379081420). If we do the math 536 * 4096 + 3767 = 2199223 != 2199224
The debug log was produced using the master branches.
This problems appears when the HLS playlist and the video file are located on the local pc, and they are not served by a http server. I haven't tested that scenario (http server), so I cannot say if there is a problem there. At first I thought, like yourself(@slomo, @thiago), that the calculation is correct and the problem is not there. But later, when I started to think more on file context and not on http context, it made more sense to remove the -1. If you have a file and you want to get a size_segment bytes segment from offset start_offset, I believe that the end_offset would be start_offset+size_segment. I had consider some other solutions for this issue, which are related to the gst_base_src_update_length function in the libs/gst/base/gstbasesrc.c file. The first is to increase the local stop variable by 1. This is because the stop variable is used to calculate maxsize which is not correct. Next following a patch for this change --- a/libs/gst/base/gstbasesrc.c +++ b/libs/gst/base/gstbasesrc.c @@ -2338,7 +2338,7 @@ gst_base_src_update_length (GstBaseSrc * src, guint64 offset, guint * length, bclass = GST_BASE_SRC_GET_CLASS (src); - stop = src->segment.stop; + stop = src->segment.stop + 1; /* get total file size */ size = src->segment.duration; The other solution is to increase the length of the last buffer by 1. Next following a patch for this change --- a/libs/gst/base/gstbasesrc.c +++ b/libs/gst/base/gstbasesrc.c @@ -2381,7 +2381,7 @@ gst_base_src_update_length (GstBaseSrc * src, guint64 offset, guint * length, /* else we can clip to the end */ if (G_UNLIKELY (offset + *length >= maxsize)) - *length = maxsize - offset; + *length = maxsize - offset + 1; } } I am not really sure about the last two proposed solutions because I am not so familiar with gstreamer and basesrc seems a module that can be used by a number of other modules.
I can reproduce the issue here. First we need to define if the stop value of a segment is inclusive or not to define where we should fix it. At the moment filesrc and souphttpsrc have different behaviors regarding that. Filesrc considers the segment.stop non-inclusive while souphttpsrc does consider it inclusive.
Asked around in IRC and it seems that GstSegment is indeed non-inclusive. It makes sense to use the same in hlsdemux to make it consistent. Can you provide a patch? It would also be nice to have a comment about it in adaptivedemux's structure that holds the fragment data. We likely need to fix souphttpsrc as well.
Yes I can provide a patch. Just give me a day or two to create the patch and test it.
I want some feedback regarding the changes that needs to be done. Now, the change of the hlsdemux is pretty straightforward. Next it's a diff with the change diff --git a/ext/hls/m3u8.c b/ext/hls/m3u8.c index cd9ede9..bf4662f 100644 --- a/ext/hls/m3u8.c +++ b/ext/hls/m3u8.c @@ -1191,7 +1191,7 @@ gst_m3u8_client_get_next_fragment (GstM3U8Client * client, if (range_start) *range_start = file->offset; if (range_end) - *range_end = file->size != -1 ? file->offset + file->size - 1 : -1; + *range_end = file->size != -1 ? file->offset + file->size : -1; if (key) *key = file->key; if (iv) My question is regarding the change of the souphttpsrc. I made the following change diff --git a/ext/soup/gstsouphttpsrc.c b/ext/soup/gstsouphttpsrc.c index ad87223..8d0d6cb 100644 --- a/ext/soup/gstsouphttpsrc.c +++ b/ext/soup/gstsouphttpsrc.c @@ -1907,7 +1907,7 @@ gst_soup_http_src_do_seek (GstBaseSrc * bsrc, GstSegment * segment) /* Wait for create() to handle the jump in offset. */ src->request_position = segment->start; - src->stop_position = segment->stop; + src->stop_position = segment->stop - 1; return TRUE; } I tested it with an HLS stream and the resulting stream is ok. I believe the souphttpsrc is used by a number of other containers which needs to be tested. I am not 100% if the solution above is the most appropriate, can you comment on this. Also, the changes and the patches for souphttpsrc should be captured on this issue or another issue needs to be created?
I think that change makes sense, additionally we should also document that in GstSegment for the stop position. Also need to be double checked if the right thing is done for ranges in dashdemux and mssdemux.
And also keeping track in this bug is ok, no need to spread it over multiple. But please attach patches instead of copying them inline into comments :)
It wasn't meant to be patches, just asking if my way of think was on the correct path. :)
Created attachment 302303 [details] [review] Proposed solution for the hlsdemux in master branch Please review this patch for master branch
Created attachment 302304 [details] [review] Proposed solution for the souphttpsrc in master branch Please review.
Created attachment 302305 [details] [review] Proposed solution for the hlsdemux in 1.4 branch My initial find was in the 1.4 branch, for that reason I added the patch for this branch. Please review.
Created attachment 302306 [details] [review] Proposed solution for the souphttpsrc in 1.4 branch Please review.
Review of attachment 302304 [details] [review]: ::: ext/soup/gstsouphttpsrc.c @@ +1920,3 @@ /* Wait for create() to handle the jump in offset. */ src->request_position = segment->start; + src->stop_position = segment->stop - 1; This is not correct yet. segment->stop can be GST_CLOCK_TIME_NONE, in which case it should stay as that. And src->stop_position is compared against segment->stop a few lines above, which will now always fail.
Created attachment 302390 [details] [review] segment: Document that the segment stop position is non-inclusive And also consider that in all the segment functions that didn't do that yet.
Created attachment 302437 [details] [review] Revised solution for souphttpsrc according to review comments (In reply to Sebastian Dröge (slomo) from comment #17) > Review of attachment 302304 [details] [review] [review]: > > ::: ext/soup/gstsouphttpsrc.c > @@ +1920,3 @@ > /* Wait for create() to handle the jump in offset. */ > src->request_position = segment->start; > + src->stop_position = segment->stop - 1; > > This is not correct yet. segment->stop can be GST_CLOCK_TIME_NONE, in which > case it should stay as that. And src->stop_position is compared against > segment->stop a few lines above, which will now always fail. This revised patch is according to the comment above. It's for the master. If everything is ok, I will add one for the 1.4 branch. I added the other patches as well, the one for hlsdemux and the segment, for my testing. I have made the following tests - Test hlsdemux with filesrc. Run the gst-launch-1.0 filesrc location=master.m3u8 ! hlsdemux ! filesink location=out_stream.ts and the in_stream.ts was the same with the out_stream.ts - Test hlsdemux with souphttpsrc. Serve the same stream from an http server on my machine. Run the gst-launch-1.0 souphttpsrc "location=http://127.0.0.1/master.m3ua" ! hlsdemux ! filesink location=out_stream.ts. Again the out_stream was the same with the in_stream.ts - Test qtdemux with filesrc. Run the gst-launch-1.0 filesrc location=trailer.mp4 ! qtdemux name=demux demux.video_0 ! queue ! mux.video_0 demux.audio_0 ! queue ! mux.audio_0 qtmux name=mux ! filesink location=out.mp4. The out.mp4 was not similar, but this was expected. I could playback with no issues. - Test qtdemux with souphttpsrc. Run gst-launch-1.0 souphttpsrc "location=http://media.w3.org/2010/05/sintel/trailer.mp4" ! qtdemux name=demux demux.video_0 ! queue ! mux.video_0 demux.audio_0 ! queue ! mux.audio_0 qtmux name=mux ! filesink location=out.mp4. Again the out.mp4 was not the same with the initial stream. I could playback with no issues. For my testing regarding the qtdemux(filesrc and souphttpsrc), I used this file http://media.w3.org/2010/05/sintel/trailer.mp4.
Review of attachment 302437 [details] [review]: Some comments in the code, what do you think? Is stop=0 something we should worry about? ::: ext/soup/gstsouphttpsrc.c @@ +1891,3 @@ + + if (segment->stop != GST_CLOCK_TIME_NONE) { + stop = segment->stop - 1; /* make stop non-inclusive */ You are actually making it inclusive here. @@ +1897,3 @@ + GST_DEBUG_OBJECT (src, "do_seek(%" G_GUINT64_FORMAT "-%" G_GINT64_FORMAT + ")", segment->start, (gint64) stop); + } What happens here if stop == 0? Is this a valid use case? I guess theoretically it is possible to receive a seek with start=0 and stop=0, not sure if it is useful but having the segment.stop non-inclusive and stop inclusive doesn't allow us to represent this case. Perhaps we want to keep segment_stop as non-inclusive and just subtract one when adding to the request headers?
Let me put some scenarios and how I think it myself. The scenarios are the following - segment.stop = -1, non-inclusive range [0,-1). If there is no `if` statement, the stop = -2, which is an invalid value, so for this reason I leave it as a -1. So, stop = -1. - segment.stop = 0, non-inclusive range [0,0). As it is now, the stop = -1, is an invalid value. I believe that we need to leave as it is, because if we consider that segment.stop non-inclusive, then the [0,0) is invalid. So, stop = -1. - segment.stop = 1, non-inclusive range [0,1). The stop = 0 and we have valid values. So, stop = 0. (In reply to Thiago Sousa Santos from comment #20) > > Perhaps we want to keep segment_stop as non-inclusive and just subtract one > when adding to the request headers? If I understand correctly, the gist of what you are saying is the following diff --git a/ext/soup/gstsouphttpsrc.c b/ext/soup/gstsouphttpsrc.c index ad87223..ced1117 100644 --- a/ext/soup/gstsouphttpsrc.c +++ b/ext/soup/gstsouphttpsrc.c @@ -742,11 +742,13 @@ gst_soup_http_src_add_range_header (GstSoupHTTPSrc * src, guint64 offset, gint rc; + guint64 stop = stop_offset - 1 < 0 ? -1 : stop_offset - 1; + soup_message_headers_remove (src->msg->request_headers, "Range"); - if (offset || stop_offset != -1) { - if (stop_offset != -1) { + if (offset || stop != -1) { + if (stop != -1) { rc = g_snprintf (buf, sizeof (buf), "bytes=%" G_GUINT64_FORMAT "-%" - G_GUINT64_FORMAT, offset, stop_offset); + G_GUINT64_FORMAT, offset, stop); } else { rc = g_snprintf (buf, sizeof (buf), "bytes=%" G_GUINT64_FORMAT "-", offset); If that's the case then I believe that the problem just moves from one place to another. If the stop_offset = -1, again the stop = -2, so we need to change this and for this reason I have used the condition that if it's negative, than the value will be -1. I am not really sure what is the best/clean approach. Personally speaking I would consider the first approach :). Maybe it would be beneficial to add a comment there with the above edge cases. Anyone else to comment about this?
Or, now I am thinking about it, you(@Thiago) mean something like the following diff --git a/ext/soup/gstsouphttpsrc.c b/ext/soup/gstsouphttpsrc.c index ad87223..ff54c1e 100644 --- a/ext/soup/gstsouphttpsrc.c +++ b/ext/soup/gstsouphttpsrc.c @@ -1345,13 +1345,14 @@ gst_soup_http_src_chunk_allocator (SoupMessage * msg, gsize max_len, gsize length; GstFlowReturn rc; SoupGstChunk *chunk; + gsize new_max_len = max_len - 1; - if (max_len) - length = MIN (basesrc->blocksize, max_len); + if (new_max_len) + length = MIN (basesrc->blocksize, new_max_len); else length = basesrc->blocksize; GST_DEBUG_OBJECT (src, "alloc %" G_GSIZE_FORMAT " bytes <= %" G_GSIZE_FORMAT, - length, max_len); + length, new_max_len); rc = GST_BASE_SRC_CLASS (parent_class)->alloc (basesrc, -1, length, &gstbuf); if (G_UNLIKELY (rc != GST_FLOW_OK)) {
(In reply to Stavros from comment #22) > Or, now I am thinking about it, you(@Thiago) mean something like the > following > > diff --git a/ext/soup/gstsouphttpsrc.c b/ext/soup/gstsouphttpsrc.c > index ad87223..ff54c1e 100644 > --- a/ext/soup/gstsouphttpsrc.c > +++ b/ext/soup/gstsouphttpsrc.c > @@ -1345,13 +1345,14 @@ gst_soup_http_src_chunk_allocator (SoupMessage * > msg, gsize max_len, > gsize length; > GstFlowReturn rc; > SoupGstChunk *chunk; > + gsize new_max_len = max_len - 1; > > - if (max_len) > - length = MIN (basesrc->blocksize, max_len); > + if (new_max_len) > + length = MIN (basesrc->blocksize, new_max_len); > else > length = basesrc->blocksize; > GST_DEBUG_OBJECT (src, "alloc %" G_GSIZE_FORMAT " bytes <= %" > G_GSIZE_FORMAT, > - length, max_len); > + length, new_max_len); > > rc = GST_BASE_SRC_CLASS (parent_class)->alloc (basesrc, -1, length, > &gstbuf); > if (G_UNLIKELY (rc != GST_FLOW_OK)) { I was actually meaning we could update gst_soup_http_src_add_range_header to subtract 1 when adding the header to the http request.
What's the status here now? Anybody still interested in this, anything still left to be fixed?
This is still a problem, at least on 1.4. We're carrying the following patches (these are attachments to this issue). We haven't moved to 1.6 as yet. https://bugzilla.gnome.org/attachment.cgi?id=302437 https://bugzilla.gnome.org/attachment.cgi?id=302305
Did someone check if dash/smoothstreaming are doing this correctly already (probably not, otherwise they wouldn't work) or need changes too? If they also need changes, those should be done :) The two patches make sense though.
Created attachment 313242 [details] [review] mpdparser: Set default last_byte_pos to -1 The value is optional in the range, and if it is absent it means we should download until the end of stream. Not until position 0.
Created attachment 313243 [details] [review] adaptivedemux: HTTP ranges are inclusive, GStreamer segment.stop is exclusive Translate the values accordingly.
Created attachment 313245 [details] [review] souphttpsrc: Make non-inclusive segment boundaries inclusive The problem is that the filesrc and souphttpsrc are behaving differently regarding the calculation of the segment boundaries. The filesrc is using a non-inclusive boundaries, while the souphttpsrc uses inclusive. Currently the hlsdemux calculates the boundaries as inclusive, so for this reason there is no problem with the souphttpsrc, but there is an issue in the filesrc. The GstSegment is non-inclusive, so the proposed solution is to use non-inclusive boundaries in the hlsdemux in order to be consistent. Make the change in the hlsdemux, will break the souphttpsrc, which will expect inclusive boundaries, but the hlsdemux will offer non-inclusive. This change makes sure that the non-inclusive boundaries are converted to inclusive.
Created attachment 313246 [details] [review] segment: Document that the segment stop position is non-inclusive And also consider that in all the segment functions that didn't do that yet, while keeping the special case that start==stop means that position==start==stop is inside the segment.
Created attachment 313247 [details] [review] segment: Convert function to macro in unit test to get proper line numbers on failures
Comment on attachment 313246 [details] [review] segment: Document that the segment stop position is non-inclusive Now the problem here is that it breaks the gst/gstsegment unit test. For reverse playback, should segment->start be inside the segment or segment->stop? Not sure!
Don't the stop has to be inclusive to support RTP where buffers have no duration ? I believe I'm seen some code that has this special case when duration == -1 only or something. I would need to search the code a bit.
Ok, so even more special cases that are not backed by gstsegment.c :)
Comment on attachment 313246 [details] [review] segment: Document that the segment stop position is non-inclusive So, this patch is not good then. Following observations: 1) segment.stop is always outside the segment 2) buffer with start==stop==segment.stop is inside that segment but has zero duration (in a byte segment, it has no bytes!) gst_segment_clip() does it right. gst_segment_to/from_running/stream_time() is annoying, it currently must allow position==stop inside the segment because of 2). In 2.0 we should make it return a signed value, and always return a value independent of segment boundaries.
Review of attachment 313245 [details] [review]: ::: ext/soup/gstsouphttpsrc.c @@ +798,3 @@ if (offset || stop_offset != -1) { if (stop_offset != -1) { + /* FIXME: If stop_offset == 0, this will still download a single byte */ If start==stop, we should output nothing.
Stavros, can you update the souphttpsrc patch accordingly?
Review of attachment 313243 [details] [review]: Good, please commit after the souphttpsrc patch is fixed
Created attachment 313266 [details] [review] souphttpsrc: Make non-inclusive segment boundaries inclusive The problem is that the filesrc and souphttpsrc are behaving differently regarding the calculation of the segment boundaries. The filesrc is using a non-inclusive boundaries, while the souphttpsrc uses inclusive. Currently the hlsdemux calculates the boundaries as inclusive, so for this reason there is no problem with the souphttpsrc, but there is an issue in the filesrc. The GstSegment is non-inclusive, so the proposed solution is to use non-inclusive boundaries in the hlsdemux in order to be consistent. Make the change in the hlsdemux, will break the souphttpsrc, which will expect inclusive boundaries, but the hlsdemux will offer non-inclusive. This change makes sure that the non-inclusive boundaries are converted to inclusive.
Created attachment 313267 [details] [review] souphttpsrc: EOS immediately if we have an empty seek segment
Review of attachment 313267 [details] [review]: Looks good. I don't like the repeated checks before build_message() but we can catch the issue with the assert anyway when we maintain this in the future, so I don't mind.
Review of attachment 313266 [details] [review]: Good after the follow-up patch.
Comment on attachment 313267 [details] [review] souphttpsrc: EOS immediately if we have an empty seek segment commit 51adc9d8894878642fca75f269695267edb01693 Author: Sebastian Dröge <sebastian@centricular.com> Date: Wed Oct 14 15:53:26 2015 +0300 souphttpsrc: EOS immediately if we have an empty seek segment https://bugzilla.gnome.org/show_bug.cgi?id=748316
Comment on attachment 313266 [details] [review] souphttpsrc: Make non-inclusive segment boundaries inclusive commit 21c6da6764c0cd015e9f3c5eecba36e297187deb Author: Stavros Vagionitis <stavrosv@digisoft.tv> Date: Wed Oct 14 10:43:19 2015 +0300 souphttpsrc: Make non-inclusive segment boundaries inclusive The problem is that the filesrc and souphttpsrc are behaving differently regarding the calculation of the segment boundaries. The filesrc is using a non-inclusive boundaries, while the souphttpsrc uses inclusive. Currently the hlsdemux calculates the boundaries as inclusive, so for this reason there is no problem with the souphttpsrc, but there is an issue in the filesrc. The GstSegment is non-inclusive, so the proposed solution is to use non-inclusive boundaries in the hlsdemux in order to be consistent. Make the change in the hlsdemux, will break the souphttpsrc, which will expect inclusive boundaries, but the hlsdemux will offer non-inclusive. This change makes sure that the non-inclusive boundaries are converted to inclusive. https://bugzilla.gnome.org/show_bug.cgi?id=748316
Comment on attachment 313242 [details] [review] mpdparser: Set default last_byte_pos to -1 commit 1f7156d478b4cb6f67c029c76da66ad4c7323f19 Author: Sebastian Dröge <sebastian@centricular.com> Date: Wed Oct 14 10:31:23 2015 +0300 mpdparser: Set default last_byte_pos to -1 The value is optional in the range, and if it is absent it means we should download until the end of stream. Not until position 0. https://bugzilla.gnome.org/show_bug.cgi?id=748316
Comment on attachment 313243 [details] [review] adaptivedemux: HTTP ranges are inclusive, GStreamer segment.stop is exclusive commit e523bd2a33ae78bda8fee4b3f2ca3b78a80c5054 Author: Sebastian Dröge <sebastian@centricular.com> Date: Wed Oct 14 10:34:58 2015 +0300 adaptivedemux: HTTP ranges are inclusive, GStreamer segment.stop is exclusive Translate the values accordingly. https://bugzilla.gnome.org/show_bug.cgi?id=748316
Comment on attachment 313247 [details] [review] segment: Convert function to macro in unit test to get proper line numbers on failures commit f7f83128128772762e48146afda5c4fe37850bac Author: Sebastian Dröge <sebastian@centricular.com> Date: Wed Oct 14 11:03:22 2015 +0300 segment: Convert function to macro in unit test to get proper line numbers on failures https://bugzilla.gnome.org/show_bug.cgi?id=748316