After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 658309 - souphttpsrc: Add support for getting content size via content-range header
souphttpsrc: Add support for getting content size via content-range header
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
unspecified
Other Linux
: Normal enhancement
: git master
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2011-09-06 00:30 UTC by Davy
Modified: 2017-11-10 09:40 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch which will determine content size using content-range or timeseekrange headers (8.99 KB, patch)
2013-09-17 18:51 UTC, Lori Anderson
needs-work Details | Review
Code clean up and got rid of the use of TimeSeekRange header. (8.67 KB, patch)
2013-09-18 20:08 UTC, Lori Anderson
needs-work Details | Review
updated patch (7.01 KB, patch)
2013-12-24 14:00 UTC, Steve Maynard
needs-work Details | Review
additional updates per request (7.08 KB, patch)
2013-12-24 15:47 UTC, Steve Maynard
reviewed Details | Review
replaces all prior patches - code consolidation per request (7.52 KB, patch)
2013-12-30 15:24 UTC, Steve Maynard
needs-work Details | Review

Description Davy 2011-09-06 00:30:27 UTC
souphttpsrc change the seeable status to true, if the content-length was received.
However, some servers deploy HTTP chunked encoding type for more fast interaction.
-	DNLA is one of good example. –

The problem is:
In case of chucked encoding transfer, there is no way to seek. do_seek function always return false value. Even though server can process HTTP partial GET including Range header. 

According to HTTP RFC 2616,
It should be judged after receiving server’s respond. 
If a server can handle HTTP partial GET then it will return “206 partial content” otherwise it will return 200 OK or the other response code.

Let’s share some ideas to resolve this bug.

BR,
Davy
Comment 1 Davy 2011-09-07 08:33:55 UTC
I modified start() function as below in order to know server’s seekable status.
It works fine but I’d like to know what you think

gst_soup_http_src_start(…)

... any place
src-seekable = is_server_seekable(src);

is_server_seekable(…)
{
	Create soup synchronous session
	Build message with Range header (bytes=0-1024)
         - I picked up the small range value(1K) to reduce latency
        Send message (Partial GET) to src->location
        Check server’s response, status_line
        Return statue_line == SOUP_STATUS_PARTIAL_CONTENT ? 1 : 0
}

Pros.
- let client know EXACT server’s seekable status
-client can seek, even thought server response with chunked encoding without content length

Cons.
- latency will be increased when start() is called. 
- unknown side-effect
Comment 2 Sebastian Dröge (slomo) 2011-09-07 09:37:21 UTC
Could you attach this as a patch, e.g. in "git format-patch" style?
Comment 3 Tim-Philipp Müller 2011-09-25 13:56:10 UTC
Removing blocker status (this is not a regression, is it?)
Comment 4 Tim-Philipp Müller 2013-02-11 23:24:30 UTC
If you have a DLNA server that can't provide a proper full BYTE range  e.g. because it is live transcoding or so, there is probably a TIME range and any seeks should be done in TIME format then. There will be some demuxer-related issues to go with that as well then (this can only really be supported sanely for streaming containers, I think).
Comment 5 Lori Anderson 2013-09-11 19:24:34 UTC
In terms of dlna servers, two headers provide information which indicates if content is seekable:  Range and TimeSeekRange.dlna.org.  When determining if a server supports seeking (IMHO), it should not be dependent on encoding scheme (streaming vs chunked), but the responses received in the HEAD request for these two headers.  I would suggest issuing a HEAD request with Range: bytes=0- and a dlna.org.TimeSeekRange: npt=0-.  The server will return either or both, something like: 
content-range: 0-168042671/168042672
TimeSeekRange.DLNA.org: npt=0-69/69 bytes=0-168042671/168042672.
Comment 6 Tim-Philipp Müller 2013-09-12 10:08:44 UTC
We don't support seeking in TIME format over http yet, fwiw, and if the server is transcoding/muxing it might be able to seek in time but not in bytes.
Comment 7 Lori Anderson 2013-09-12 12:47:55 UTC
That's OK the seeking in TIME format over http is not support.  But I think there is still value in using the Range or TimeSeekRange header in the initial HEAD request in order to get back the content size.  The main point is to separate determining content length from the encoding scheme.
Comment 8 Lori Anderson 2013-09-17 18:49:26 UTC
I've got a patch which allows seeking for servers that don't respond with content-length but do response to Range header.
Comment 9 Lori Anderson 2013-09-17 18:51:06 UTC
Created attachment 255132 [details] [review]
Patch which will determine content size using content-range or timeseekrange headers
Comment 10 Lori Anderson 2013-09-18 20:08:12 UTC
Created attachment 255250 [details] [review]
Code clean up and got rid of the use of TimeSeekRange header.

Attaching new patch - code clean up and got rid of the use of TimeSeekRange header.
Comment 11 Sebastian Dröge (slomo) 2013-10-07 13:59:29 UTC
Review of attachment 255132 [details] [review]:

Don't use // comments

::: ext/soup/gstsouphttpsrc.c
@@ +1432,3 @@
+
+static void
+gst_soup_http_src_determine_size (GstSoupHTTPSrc * src)

Maybe the content-length handling should be moved in here too

@@ +1457,3 @@
+
+      // Parse out size from string in format:
+      //   TimeSeekRange.dlna.org: npt=0:00:00.000-0:04:58.783/0:04:58.783 bytes=0-31683075/31683076

You could just use something like sscanf() here. Would also be good to do something with the time values here :)

@@ +1463,3 @@
+      for (i = 0; value[i]; i++) {
+        value[i] = toupper (value[i]);
+      }

There's a glib function to convert a complete string to upper case. g_str_toupper() or g_strup() or something like that... but can't you just use sscanf() on that without converting to uppercase?
Comment 12 Sebastian Dröge (slomo) 2013-10-07 14:01:31 UTC
Review of attachment 255250 [details] [review]:

No // comments

::: ext/soup/gstsouphttpsrc.c
@@ +795,3 @@
+    /* Use HEAD response headers to determine content size if content-length not included */
+    GST_INFO_OBJECT (src, "Determine size based on HTTP HEAD response headers");
+    gst_soup_http_src_determine_size (src);

Move content-length handling in there too
Comment 13 Lori Anderson 2013-10-07 14:38:53 UTC
OK - will address and submit updated patch
Comment 14 Steve Maynard 2013-12-24 14:00:28 UTC
Created attachment 264842 [details] [review]
updated patch
Comment 15 Steve Maynard 2013-12-24 14:02:51 UTC
(In reply to comment #14)
> Created an attachment (id=264842) [details] [review]
> updated patch

This should have replaced attachment 255250 [details] [review] (but bugzilla wouldn't let me)...
Comment 16 Sebastian Dröge (slomo) 2013-12-24 14:19:19 UTC
Review of attachment 264842 [details] [review]:

Please attach it in "git format-patch" format :)

::: ext/soup/gstsouphttpsrc.c
@@ +795,3 @@
+    /* Use HEAD response headers to determine content size if content-length not included */
+    GST_INFO_OBJECT (src, "Determine size based on HTTP HEAD response headers");
+    gst_soup_http_src_determine_size (src);

Can't we do the if-branch above in here too?

@@ +1427,3 @@
+  const gchar *header_name, *header_value;
+  SoupMessageHeadersIter iter;
+  GString *log_str = g_string_sized_new (256);

This GString is leaked

@@ +1476,3 @@
+              "content-range")) != NULL) {
+    /* Parse out size from string in format: bytes 0-168042671/168042672 */
+    format = "BYTES%*[^/]/%" G_GUINT64_FORMAT;

Put format directly into the sscanf() call, it makes gcc and clang unhappy otherwise with -Wformat=2 :) Also you only use this single format here, nothing else

And sscanf() does not generally support regular expressions AFAIK
Comment 17 Steve Maynard 2013-12-24 15:47:53 UTC
Created attachment 264845 [details] [review]
additional updates per request

This updated patch addresses the have_size if-branch request, the GString leak fix, and the format in-line sscanf request.  In regards to the format, please consider the following link:
http://stackoverflow.com/questions/5999164/is-scanfs-regex-support-a-standard

Finally, I'm having difficulties running git format-patch against a branch I created for the purposes of assisting on this bug.  When I am in my branch and run:
 git format-patch upstream/master
I end up with a patch file for every delta between the date of the original change and now.
Comment 18 Sebastian Dröge (slomo) 2013-12-30 10:41:46 UTC
Interesting, I didn't know about that feature. And it's even in C89 already.
Comment 19 Sebastian Dröge (slomo) 2013-12-30 10:43:37 UTC
Review of attachment 264845 [details] [review]:

::: ext/soup/gstsouphttpsrc.c
@@ +792,3 @@
           gst_message_new_duration_changed (GST_OBJECT (src)));
     }
+  } else {

Why is this still having the content size block above? Can't we merge that into gst_soup_http_src_determine_size()?
Comment 20 Steve Maynard 2013-12-30 13:51:41 UTC
(In reply to comment #19)
> Review of attachment 264845 [details] [review]:
> 
> ::: ext/soup/gstsouphttpsrc.c
> @@ +792,3 @@
>            gst_message_new_duration_changed (GST_OBJECT (src)));
>      }
> +  } else {
> 
> Why is this still having the content size block above? Can't we merge that into
> gst_soup_http_src_determine_size()?

Maybe I misunderstood the previous comment "Can't we do the if-branch above in here too?". Are you asking for the code from line #777 through #794 be moved into the the function gst_soup_http_src_determine_size?
Comment 21 Sebastian Dröge (slomo) 2013-12-30 14:31:09 UTC
Yes
Comment 22 Steve Maynard 2013-12-30 14:52:00 UTC
Great - I did that and will provide another patch when testing is complete...
Comment 23 Steve Maynard 2013-12-30 15:24:27 UTC
Created attachment 265037 [details] [review]
replaces all prior patches - code consolidation per request

This patch resolves consolidation of all sizing code into the gst_soup_http_src_determine_size() method.
Comment 24 Sebastian Dröge (slomo) 2013-12-31 10:02:26 UTC
Review of attachment 265037 [details] [review]:

::: ext/soup/gstsouphttpsrc.c
@@ +1455,3 @@
+      basesrc->segment.duration = src->content_size;
+      gst_element_post_message (GST_ELEMENT (src),
+          gst_message_new_duration_changed (GST_OBJECT (src)));

The reason why I wanted this merged into this function was that there's some duplicated code here...

@@ +1510,3 @@
+    basesrc->segment.duration = src->content_size;
+    gst_element_post_message (GST_ELEMENT (src),
+        gst_message_new_duration_changed (GST_OBJECT (src)));

... which is duplicated here
Comment 25 Sebastian Dröge (slomo) 2014-02-12 08:40:17 UTC
Any plans to update this? Seems like a good idea to me
Comment 26 Steve Maynard 2014-02-12 14:43:47 UTC
I'm no longer at CableLabs - I believe they will reassign this to another engineer...
Comment 27 Sebastian Dröge (slomo) 2017-03-16 13:36:09 UTC
Will they? :) If not we should close this...
Comment 28 Edward Hervey 2017-11-10 09:40:15 UTC
Closing. No proper news in over 3 years. Please re-open if you have new patch.