After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 777222 - souphttpsrc: redirect uri is never set
souphttpsrc: redirect uri is never set
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
git master
Other Linux
: Normal blocker
: 1.10.3
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2017-01-13 15:44 UTC by Arnaud Vrac
Modified: 2017-02-10 06:38 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
souphttpsrc: properly track redirections (1.90 KB, patch)
2017-01-13 15:44 UTC, Arnaud Vrac
none Details | Review
souphttpsrc: properly track redirections (3.59 KB, patch)
2017-01-16 11:12 UTC, Arnaud Vrac
committed Details | Review
souphttpsrc: make flow return values handling clearer (11.01 KB, patch)
2017-01-16 11:18 UTC, Arnaud Vrac
committed Details | Review
souphttpsrc: Initialize return variable (872 bytes, patch)
2017-01-19 07:38 UTC, Edward Hervey
committed Details | Review
souphttpsrc: properly check that seek range was respected (3.38 KB, patch)
2017-01-19 11:50 UTC, Arnaud Vrac
committed Details | Review
souphttpsrc: report a useful error message when soup_session_send fails (1.98 KB, patch)
2017-01-19 11:55 UTC, Arnaud Vrac
committed Details | Review

Description Arnaud Vrac 2017-01-13 15:44:37 UTC
Created attachment 343438 [details] [review]
souphttpsrc: properly track redirections

Hi,

In 1.10 and git master I can see that some HLS streams do not play. In this case this is due to a redirect in the GET request for the master playlist. The variant playlist URIs are relative, and the base URI is set incorrectly to the master playlist URI, instead of the URI set in the Location header from the redirect reply.

The code to handle the base URI is correct in adaptivedemux/hlsdemux, using an URI query to the source element to retrieve the download URI, possible redirect URI and the permanent flag (301 or 302 code).

However souphttpsrc never sets the redirect URI. In the current code libsoup is configured to transparently handle redirects, instead of letting the caller handle the 30x code.

I'm attaching a tentative patch that works for me, but I'm not confident that setting the src->ret code like I did here is correct.
Comment 2 Sebastian Dröge (slomo) 2017-01-15 09:36:41 UTC
Review of attachment 343438 [details] [review]:

::: ext/soup/gstsouphttpsrc.c
@@ +1064,3 @@
           msg->status_code, src->redirection_uri, src->redirection_permanent);
+
+      soup_session_redirect_message (src->session, msg);

We should probably first check with soup_session_would_redirect() if there's an actual redirect... and then only call soup_session_redirect_message() if src->automatic_redirect==TRUE
Comment 3 Arnaud Vrac 2017-01-16 11:12:39 UTC
Created attachment 343543 [details] [review]
souphttpsrc: properly track redirections

You're right, I'm attaching a better version. The redirection_uri is now set using soup_uri_to_string (soup_message_get_uri (msg), FALSE) instead of just using the location header, which can be relative and needs to be transformed to an absolute URI.

If automatic_redirect is not set, the request just fails as before.
Comment 4 Arnaud Vrac 2017-01-16 11:18:45 UTC
Created attachment 343544 [details] [review]
souphttpsrc: make flow return values handling clearer

After reading the code I think we can simplify the flow result handling and make the code clearer. Instead of storing the GstFlowReturn value in the class, we can simply pass it when returning function. We couldn't do this before because the headers handling was done using libsoup callbacks, but this is not the case anymore.

This patch applies on top of redirection patch, otherwise it would be harder to backport the redirection patch to 1.10 if needed.
Comment 5 Thiago Sousa Santos 2017-01-16 21:35:24 UTC
Review of attachment 343543 [details] [review]:

::: ext/soup/gstsouphttpsrc.c
@@ +1444,3 @@
     return GST_FLOW_FLUSHING;
 
+  src->ret = GST_FLOW_OK;

Is this part needed if you already reset it to GST_FLOW_OK when it is a CUSTOM_ERROR?
Comment 6 Arnaud Vrac 2017-01-16 21:52:00 UTC
(In reply to Thiago Sousa Santos from comment #5)
> Review of attachment 343543 [details] [review] [review]:
> 
> ::: ext/soup/gstsouphttpsrc.c
> @@ +1444,3 @@
>      return GST_FLOW_FLUSHING;
>  
> +  src->ret = GST_FLOW_OK;
> 
> Is this part needed if you already reset it to GST_FLOW_OK when it is a
> CUSTOM_ERROR?

Yes, otherwise the redirection is retried until max_retries and then fails. That's why I implemented the second patch, the src->ret variable is confusing.
Comment 7 Thiago Sousa Santos 2017-01-18 01:17:29 UTC
commit 02317496c6f3334e1fb2eae72662d04db51d5851
Author: Arnaud Vrac <avrac@freebox.fr>
Date:   Mon Jan 16 11:58:02 2017 +0100

    souphttpsrc: make flow return values handling clearer
    
    The flow return values was stored in the element before because the
    result had to be set from callbacks. This is not the case anymore, we
    can return the flow result directly from functions, making the code
    easier to understand.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=777222

commit 9b23732559dfed243ae425f92a9a8f4cd2261588
Author: Arnaud Vrac <avrac@freebox.fr>
Date:   Fri Jan 13 16:40:43 2017 +0100

    souphttpsrc: properly track redirections
    
    The current code configures libsoup to handle redirections
    transparently, without informing the caller, thus preventing the element
    to record the redirect code and location uri.
    
    Fix this by always setting the SOUP_MESSAGE_NO_REDIRECT, preventing
    libsoup from handling the redirection. When we receive a redirection
    request and libsoup can safely handle it, return a custom error which
    triggers a retry with the new URI.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=777222
Comment 8 Thiago Sousa Santos 2017-01-18 01:35:20 UTC
Pushed to 1.10
commit 780c99fbb3c58caafef83035accae1fce250b57a
Author: Arnaud Vrac <avrac@freebox.fr>

Decided to put only the real bugfix to reduce chance of regressions.
Comment 9 Sebastian Dröge (slomo) 2017-01-18 09:51:03 UTC
Breaks validate tests, https://ci.gstreamer.net/job/GStreamer-master-validate/4438/testReport/
Comment 10 Arnaud Vrac 2017-01-18 14:54:51 UTC
I still don't know why the CI times out, but I found another issue in the code.

The code under /* Check if Range header was respected. */ is wrong. The CUSTOM_ERROR error code is not necessarily set because of a seek, it could also happen because of a network error while connecting.

We should either remove the check, as we already check that the server exposes the Accept-Ranges before. Or I could add a check when read_position is set for a 206 return code and parse the Content-Ranges header to match it with read_position.
Comment 11 Edward Hervey 2017-01-19 07:38:52 UTC
Created attachment 343776 [details] [review]
souphttpsrc: Initialize return variable

In the normal use-case we would end up with ret being unitialized
causing havoc.
Comment 12 Edward Hervey 2017-01-19 07:40:31 UTC
commit 41fa3fc4998f4369efdec55714eab4d9ec81b8f7
Author: Edward Hervey <edward@centricular.com>
Date:   Thu Jan 19 08:37:37 2017 +0100

    souphttpsrc: Initialize return variable
    
    In the normal use-case we would end up with ret being unitialized
    causing havoc.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=777222
Comment 13 Arnaud Vrac 2017-01-19 11:50:43 UTC
Created attachment 343792 [details] [review]
souphttpsrc: properly check that seek range was respected

The code checking the seek range is broken in master because of the GstFlowReturn refactoring. It was also slightly broken before because the check would only be done on retry, not on seek (in this case src->ret would not be equal to CUSTOM_ERROR), and because we could also do the check before making sure the request was actually sent successfully.

This patch fixes it by moving the check later, when we're sure both the status code is valid and soup_session_send() was successful.
Comment 14 Arnaud Vrac 2017-01-19 11:55:00 UTC
Created attachment 343794 [details] [review]
souphttpsrc: report a useful error message when soup_session_send fails

I've seen that libsoup can return a NULL input stream in soup_session_send, and not set the message status code (status code is 0). In this case it's better to print the GError message to understand what's happening.

Thiago, do you know why libsoup doesn't set the status code in some cases ? I've seen this while running gst-validate, I got some network errors like Socket I/O timed out or Connection reset by peer, and the status code would stay at 0. In this case the connection is not retried and we don't report a pretty error to the user.
Comment 15 Arnaud Vrac 2017-01-19 12:13:30 UTC
Example log output:

0:00:00.089847339 19071 0x55e0f0690f50 DEBUG            souphttpsrc gstsouphttpsrc.c:1729:gst_soup_http_src_start:<source> start("http://127.0.0.1:8079/defaults/ogg/vorbis_theora.1.ogg")
0:00:00.089874507 19071 0x55e0f0690f50 DEBUG            souphttpsrc gstsouphttpsrc.c:901:gst_soup_http_src_session_open:<source> Creating session
0:00:00.176436085 19071 0x55e0f0690f50 DEBUG            souphttpsrc gstsouphttpsrc.c:1810:gst_soup_http_src_get_size:<source> get_size() = FALSE
0:00:00.176476104 19071 0x55e0f0690f50 DEBUG            souphttpsrc gstsouphttpsrc.c:1856:gst_soup_http_src_do_seek:<source> do_seek(0-18446744073709551615)
0:00:00.176493983 19071 0x55e0f0690f50 DEBUG            souphttpsrc gstsouphttpsrc.c:1861:gst_soup_http_src_do_seek:<source> Seek to current read/end position and no seek pending
0:00:00.177191210 19071 0x55e0f0693990 LOG              souphttpsrc gstsouphttpsrc.c:1467:gst_soup_http_src_do_request:<source> Running request for method: GET
0:00:01.193667193 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): > GET /defaults/ogg/vorbis_theora.1.ogg HTTP/1.1
0:00:01.193740108 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): > Soup-Debug-Timestamp: 1484827674
0:00:01.193797867 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): > Soup-Debug: SoupSession 1 (0x55e0f06a1270), SoupMessage 1 (0x55e0f06afe60), SoupSocket 1 (0x55e0f05d36b0)
0:00:01.193850807 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(H): > Host: 127.0.0.1:8079
0:00:01.193886845 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(H): > Connection: close
0:00:01.193935385 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(H): > icy-metadata: 1
0:00:01.193968071 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(H): > User-Agent: GStreamer souphttpsrc libsoup/2.54.1
0:00:01.194011721 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M):
0:00:16.209017698 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): < HTTP/1.1 0 (null)
0:00:16.209065819 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): < Soup-Debug-Timestamp: 1484827689
0:00:16.209099972 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): < Soup-Debug: SoupMessage 1 (0x55e0f06afe60)
0:00:16.209127489 19071 0x55e0f0693990 TRACE              souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M):
0:00:16.209320042 19071 0x55e0f0693990 INFO             souphttpsrc gstsouphttpsrc.c:1033:gst_soup_http_src_got_headers:<source> got headers
0:00:16.209423547 19071 0x55e0f0693990 DEBUG            souphttpsrc gstsouphttpsrc.c:1434:gst_soup_http_src_send_message:<source> Didn't get an input stream: Socket I/O timed out
0:00:16.209460004 19071 0x55e0f0693990 DEBUG            souphttpsrc gstsouphttpsrc.c:1705:gst_soup_http_src_create:<source> Returning -5 error
0:00:16.209575313 19071 0x55e0f0693990 WARN                 basesrc gstbasesrc.c:2951:gst_base_src_loop:<source> error: Internal data stream error.
0:00:16.209606881 19071 0x55e0f0693990 WARN                 basesrc gstbasesrc.c:2951:gst_base_src_loop:<source> error: streaming stopped, reason error (-5)
Comment 16 Sebastian Dröge (slomo) 2017-01-23 10:54:01 UTC
(In reply to Arnaud Vrac from comment #14)
> Created attachment 343794 [details] [review] [review]
> souphttpsrc: report a useful error message when soup_session_send fails
> 
> I've seen that libsoup can return a NULL input stream in soup_session_send,
> and not set the message status code (status code is 0). In this case it's
> better to print the GError message to understand what's happening.

That sounds like a libsoup bug. I would expect this to only happen if the system runs out of resources (like fds), but that's probably not the case for you. Can you get that to the libsoup people?
Comment 17 Sebastian Dröge (slomo) 2017-01-23 11:05:20 UTC
commit c08a0493ff51866cf6e7cb92d513191984ec8aef
Author: Arnaud Vrac <avrac@freebox.fr>
Date:   Thu Jan 19 12:29:44 2017 +0100

    souphttpsrc: report a useful error message when soup_session_send fails
    
    This helps to understand cases where libsoup doesn't set the message
    status code after running soup_session_send.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=777222

commit c4cf67cfe5f7b03d38d5657c8fa9b8aca19ba400
Author: Arnaud Vrac <avrac@freebox.fr>
Date:   Thu Jan 19 11:05:00 2017 +0100

    souphttpsrc: properly check that seek range was respected
    
    This check must be done only when we are sure the request was
    successfully sent. soup_session_send() might fail without setting the
    status code. In this case status code is 0 so we would only catch the
    error after the seek range check. In this case we would report an error
    saying that the seek range was not respected, instead of reporting the
    underlying error that triggered the soup_session_send() failure.
    
    https://bugzilla.gnome.org/attachment.cgi?bugid=777222
Comment 18 Sebastian Dröge (slomo) 2017-01-23 11:08:13 UTC
This still fails all the HTTP gst-validate tests for me locally
Comment 19 Sebastian Dröge (slomo) 2017-01-26 11:50:07 UTC
That is unrelated to these changes though, and looks more like a gst-validate problem.
Comment 20 Seungha Yang 2017-02-09 23:31:29 UTC
Hello all

There is a regression on redirection handling, and seems to be caused by the commit "souphttpsrc: properly track redirections"

When adaptivedemux is used and also a fragment's uri is redirected, the connection for original host may not be cleared properly, since the redirection was handled by souphttpsrc.
At the end, connection count for a host will hit the MAX_CONNS_PER_HOST defined in libsoup (this causes pipeline stuck).

I'm not sure this is a bug of libsoup, or souphttpsrc though.

Issue can be reprodued using this m3u8.
http://once.unicornmedia.com/now/master/playlist/4236fac6-f0df-4f9a-b9c7-159cfb257618/97255fb8-6105-4caa-9622-de49bba2cd4b/93ebee84-0b53-4671-a4e1-a5c95b083a0d/content.m3u8?UMOPROFILES=hd3000
Comment 21 Sebastian Dröge (slomo) 2017-02-10 06:10:23 UTC
Can you open a new bug for this? Thanks!
Comment 22 Seungha Yang 2017-02-10 06:38:32 UTC
(In reply to Sebastian Dröge (slomo) from comment #21)
> Can you open a new bug for this? Thanks!

I made new bug at Bug #778428