GNOME Bugzilla – Bug 777222
souphttpsrc: redirect uri is never set
Last modified: 2017-02-10 06:38:32 UTC
Created attachment 343438 [details] [review] souphttpsrc: properly track redirections Hi, In 1.10 and git master I can see that some HLS streams do not play. In this case this is due to a redirect in the GET request for the master playlist. The variant playlist URIs are relative, and the base URI is set incorrectly to the master playlist URI, instead of the URI set in the Location header from the redirect reply. The code to handle the base URI is correct in adaptivedemux/hlsdemux, using an URI query to the source element to retrieve the download URI, possible redirect URI and the permanent flag (301 or 302 code). However souphttpsrc never sets the redirect URI. In the current code libsoup is configured to transparently handle redirects, instead of letting the caller handle the 30x code. I'm attaching a tentative patch that works for me, but I'm not confident that setting the src->ret code like I did here is correct.
test url here: http://absolut.zogzog.org/pub/samples/hls/apple_basic/bipbop_4x3_variant.m3u8
Review of attachment 343438 [details] [review]: ::: ext/soup/gstsouphttpsrc.c @@ +1064,3 @@ msg->status_code, src->redirection_uri, src->redirection_permanent); + + soup_session_redirect_message (src->session, msg); We should probably first check with soup_session_would_redirect() if there's an actual redirect... and then only call soup_session_redirect_message() if src->automatic_redirect==TRUE
Created attachment 343543 [details] [review] souphttpsrc: properly track redirections You're right, I'm attaching a better version. The redirection_uri is now set using soup_uri_to_string (soup_message_get_uri (msg), FALSE) instead of just using the location header, which can be relative and needs to be transformed to an absolute URI. If automatic_redirect is not set, the request just fails as before.
Created attachment 343544 [details] [review] souphttpsrc: make flow return values handling clearer After reading the code I think we can simplify the flow result handling and make the code clearer. Instead of storing the GstFlowReturn value in the class, we can simply pass it when returning function. We couldn't do this before because the headers handling was done using libsoup callbacks, but this is not the case anymore. This patch applies on top of redirection patch, otherwise it would be harder to backport the redirection patch to 1.10 if needed.
Review of attachment 343543 [details] [review]: ::: ext/soup/gstsouphttpsrc.c @@ +1444,3 @@ return GST_FLOW_FLUSHING; + src->ret = GST_FLOW_OK; Is this part needed if you already reset it to GST_FLOW_OK when it is a CUSTOM_ERROR?
(In reply to Thiago Sousa Santos from comment #5) > Review of attachment 343543 [details] [review] [review]: > > ::: ext/soup/gstsouphttpsrc.c > @@ +1444,3 @@ > return GST_FLOW_FLUSHING; > > + src->ret = GST_FLOW_OK; > > Is this part needed if you already reset it to GST_FLOW_OK when it is a > CUSTOM_ERROR? Yes, otherwise the redirection is retried until max_retries and then fails. That's why I implemented the second patch, the src->ret variable is confusing.
commit 02317496c6f3334e1fb2eae72662d04db51d5851 Author: Arnaud Vrac <avrac@freebox.fr> Date: Mon Jan 16 11:58:02 2017 +0100 souphttpsrc: make flow return values handling clearer The flow return values was stored in the element before because the result had to be set from callbacks. This is not the case anymore, we can return the flow result directly from functions, making the code easier to understand. https://bugzilla.gnome.org/show_bug.cgi?id=777222 commit 9b23732559dfed243ae425f92a9a8f4cd2261588 Author: Arnaud Vrac <avrac@freebox.fr> Date: Fri Jan 13 16:40:43 2017 +0100 souphttpsrc: properly track redirections The current code configures libsoup to handle redirections transparently, without informing the caller, thus preventing the element to record the redirect code and location uri. Fix this by always setting the SOUP_MESSAGE_NO_REDIRECT, preventing libsoup from handling the redirection. When we receive a redirection request and libsoup can safely handle it, return a custom error which triggers a retry with the new URI. https://bugzilla.gnome.org/show_bug.cgi?id=777222
Pushed to 1.10 commit 780c99fbb3c58caafef83035accae1fce250b57a Author: Arnaud Vrac <avrac@freebox.fr> Decided to put only the real bugfix to reduce chance of regressions.
Breaks validate tests, https://ci.gstreamer.net/job/GStreamer-master-validate/4438/testReport/
I still don't know why the CI times out, but I found another issue in the code. The code under /* Check if Range header was respected. */ is wrong. The CUSTOM_ERROR error code is not necessarily set because of a seek, it could also happen because of a network error while connecting. We should either remove the check, as we already check that the server exposes the Accept-Ranges before. Or I could add a check when read_position is set for a 206 return code and parse the Content-Ranges header to match it with read_position.
Created attachment 343776 [details] [review] souphttpsrc: Initialize return variable In the normal use-case we would end up with ret being unitialized causing havoc.
commit 41fa3fc4998f4369efdec55714eab4d9ec81b8f7 Author: Edward Hervey <edward@centricular.com> Date: Thu Jan 19 08:37:37 2017 +0100 souphttpsrc: Initialize return variable In the normal use-case we would end up with ret being unitialized causing havoc. https://bugzilla.gnome.org/show_bug.cgi?id=777222
Created attachment 343792 [details] [review] souphttpsrc: properly check that seek range was respected The code checking the seek range is broken in master because of the GstFlowReturn refactoring. It was also slightly broken before because the check would only be done on retry, not on seek (in this case src->ret would not be equal to CUSTOM_ERROR), and because we could also do the check before making sure the request was actually sent successfully. This patch fixes it by moving the check later, when we're sure both the status code is valid and soup_session_send() was successful.
Created attachment 343794 [details] [review] souphttpsrc: report a useful error message when soup_session_send fails I've seen that libsoup can return a NULL input stream in soup_session_send, and not set the message status code (status code is 0). In this case it's better to print the GError message to understand what's happening. Thiago, do you know why libsoup doesn't set the status code in some cases ? I've seen this while running gst-validate, I got some network errors like Socket I/O timed out or Connection reset by peer, and the status code would stay at 0. In this case the connection is not retried and we don't report a pretty error to the user.
Example log output: 0:00:00.089847339 19071 0x55e0f0690f50 DEBUG souphttpsrc gstsouphttpsrc.c:1729:gst_soup_http_src_start:<source> start("http://127.0.0.1:8079/defaults/ogg/vorbis_theora.1.ogg") 0:00:00.089874507 19071 0x55e0f0690f50 DEBUG souphttpsrc gstsouphttpsrc.c:901:gst_soup_http_src_session_open:<source> Creating session 0:00:00.176436085 19071 0x55e0f0690f50 DEBUG souphttpsrc gstsouphttpsrc.c:1810:gst_soup_http_src_get_size:<source> get_size() = FALSE 0:00:00.176476104 19071 0x55e0f0690f50 DEBUG souphttpsrc gstsouphttpsrc.c:1856:gst_soup_http_src_do_seek:<source> do_seek(0-18446744073709551615) 0:00:00.176493983 19071 0x55e0f0690f50 DEBUG souphttpsrc gstsouphttpsrc.c:1861:gst_soup_http_src_do_seek:<source> Seek to current read/end position and no seek pending 0:00:00.177191210 19071 0x55e0f0693990 LOG souphttpsrc gstsouphttpsrc.c:1467:gst_soup_http_src_do_request:<source> Running request for method: GET 0:00:01.193667193 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): > GET /defaults/ogg/vorbis_theora.1.ogg HTTP/1.1 0:00:01.193740108 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): > Soup-Debug-Timestamp: 1484827674 0:00:01.193797867 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): > Soup-Debug: SoupSession 1 (0x55e0f06a1270), SoupMessage 1 (0x55e0f06afe60), SoupSocket 1 (0x55e0f05d36b0) 0:00:01.193850807 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(H): > Host: 127.0.0.1:8079 0:00:01.193886845 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(H): > Connection: close 0:00:01.193935385 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(H): > icy-metadata: 1 0:00:01.193968071 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(H): > User-Agent: GStreamer souphttpsrc libsoup/2.54.1 0:00:01.194011721 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): 0:00:16.209017698 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): < HTTP/1.1 0 (null) 0:00:16.209065819 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): < Soup-Debug-Timestamp: 1484827689 0:00:16.209099972 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): < Soup-Debug: SoupMessage 1 (0x55e0f06afe60) 0:00:16.209127489 19071 0x55e0f0693990 TRACE souputils gstsouputils.c:66:gst_soup_util_log_printer_cb:<source> HTTP_SESSION(M): 0:00:16.209320042 19071 0x55e0f0693990 INFO souphttpsrc gstsouphttpsrc.c:1033:gst_soup_http_src_got_headers:<source> got headers 0:00:16.209423547 19071 0x55e0f0693990 DEBUG souphttpsrc gstsouphttpsrc.c:1434:gst_soup_http_src_send_message:<source> Didn't get an input stream: Socket I/O timed out 0:00:16.209460004 19071 0x55e0f0693990 DEBUG souphttpsrc gstsouphttpsrc.c:1705:gst_soup_http_src_create:<source> Returning -5 error 0:00:16.209575313 19071 0x55e0f0693990 WARN basesrc gstbasesrc.c:2951:gst_base_src_loop:<source> error: Internal data stream error. 0:00:16.209606881 19071 0x55e0f0693990 WARN basesrc gstbasesrc.c:2951:gst_base_src_loop:<source> error: streaming stopped, reason error (-5)
(In reply to Arnaud Vrac from comment #14) > Created attachment 343794 [details] [review] [review] > souphttpsrc: report a useful error message when soup_session_send fails > > I've seen that libsoup can return a NULL input stream in soup_session_send, > and not set the message status code (status code is 0). In this case it's > better to print the GError message to understand what's happening. That sounds like a libsoup bug. I would expect this to only happen if the system runs out of resources (like fds), but that's probably not the case for you. Can you get that to the libsoup people?
commit c08a0493ff51866cf6e7cb92d513191984ec8aef Author: Arnaud Vrac <avrac@freebox.fr> Date: Thu Jan 19 12:29:44 2017 +0100 souphttpsrc: report a useful error message when soup_session_send fails This helps to understand cases where libsoup doesn't set the message status code after running soup_session_send. https://bugzilla.gnome.org/show_bug.cgi?id=777222 commit c4cf67cfe5f7b03d38d5657c8fa9b8aca19ba400 Author: Arnaud Vrac <avrac@freebox.fr> Date: Thu Jan 19 11:05:00 2017 +0100 souphttpsrc: properly check that seek range was respected This check must be done only when we are sure the request was successfully sent. soup_session_send() might fail without setting the status code. In this case status code is 0 so we would only catch the error after the seek range check. In this case we would report an error saying that the seek range was not respected, instead of reporting the underlying error that triggered the soup_session_send() failure. https://bugzilla.gnome.org/attachment.cgi?bugid=777222
This still fails all the HTTP gst-validate tests for me locally
That is unrelated to these changes though, and looks more like a gst-validate problem.
Hello all There is a regression on redirection handling, and seems to be caused by the commit "souphttpsrc: properly track redirections" When adaptivedemux is used and also a fragment's uri is redirected, the connection for original host may not be cleared properly, since the redirection was handled by souphttpsrc. At the end, connection count for a host will hit the MAX_CONNS_PER_HOST defined in libsoup (this causes pipeline stuck). I'm not sure this is a bug of libsoup, or souphttpsrc though. Issue can be reprodued using this m3u8. http://once.unicornmedia.com/now/master/playlist/4236fac6-f0df-4f9a-b9c7-159cfb257618/97255fb8-6105-4caa-9622-de49bba2cd4b/93ebee84-0b53-4671-a4e1-a5c95b083a0d/content.m3u8?UMOPROFILES=hd3000
Can you open a new bug for this? Thanks!
(In reply to Sebastian Dröge (slomo) from comment #21) > Can you open a new bug for this? Thanks! I made new bug at Bug #778428