After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 678429 - souphttpsrc: should return GST_FLOW_ERROR from create function in case of HTTP error 404 not found
souphttpsrc: should return GST_FLOW_ERROR from create function in case of HTT...
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
git master
Other Linux
: Normal minor
: 1.0.6
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2012-06-20 00:38 UTC by Norbert
Modified: 2013-03-22 11:52 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch for 678429 (505 bytes, patch)
2012-06-22 21:56 UTC, Norbert
committed Details | Review

Description Norbert 2012-06-20 00:38:00 UTC
gst_soup_http_src_create does not return GST_FLOW_ERROR in case of HTTP error 404 not found. 


What has been observed:
=======================
gst_soup_http_src_got_headers_cb calls gst_soup_http_src_parse_status.
gst_soup_http_src_parse_status detects an HTTP error and calls GST_ELEMENT_ERROR.  In addition, gst_soup_http_src_parse_status assigns GST_FLOW_ERROR to src->ret (GstSoupHTTPSrc element instance structure).  IMHO, one of the subsequent calls to gst_soup_http_src_chunk_allocator appears to occur unexpected.  The reason for saying is that the GST_FLOW_ERROR of src->ret gets overwritten by a GST_FLOW_OK.  During the execution of gst_soup_http_src_chunk_allocator, gst_pad_alloc_buffer gets called and its return value gets assigned to src->ret.  As a consequence, the subsequent gst_soup_http_src_create returns a GST_FLOW_OK instead of GST_FLOW_ERROR as expected.  All this causes a buffer being allocated, filled with data of an HTML body, and pushed out over the source pad of the Soup HTTP src.  Not sure whether this is intended behavior.


Experiment:
===========
The following experiment was run in an attempt to avoid that the error value of src->ret gets overwritten. 

static void
gst_soup_http_src_got_headers_cb (SoupMessage * msg, GstSoupHTTPSrc * src)
{

...

  /* Handle HTTP errors. */
  gst_soup_http_src_parse_status (msg, src);

  /* Check if Range header was respected. */
  if (src->ret == GST_FLOW_CUSTOM_ERROR &&
      src->read_position && msg->status_code != SOUP_STATUS_PARTIAL_CONTENT) {
...
    src->ret = GST_FLOW_ERROR;
  }

/* START OF EXPERIMENTAL CODE */
  if ( src->ret == GST_FLOW_ERROR )
  {
      gst_soup_http_src_session_pause_message( src );
      if (src->loop)
      {
          g_main_loop_quit (src->loop);
      }
  }
/* END OF EXPERIMENTAL CODE */
}


How to reproduce:
=================
Make sure that directory/filename part of the URI passed to Soup HTTP Src does not match with any of the files located on the HTTP 1.1 test server.  This will cause a HTTP error response 404 from the server.

HTTP 1.1 Server:
================
Apache/2.2.15 (Fedora)

Gstreamer version:
==================
gst-plugins-good-0.10.16


Please be so kind on comment on my problem seen.  It would be much appreciated.
Comment 1 Tim-Philipp Müller 2012-06-20 09:33:48 UTC
> Gstreamer version:
> ==================
> gst-plugins-good-0.10.16 
> 
> Please be so kind on comment on my problem seen.  It would be much appreciated.

This version of gst-plugins-good is almost 3 years old. Could you test with the latest release please? (-good 0.10.31)
Comment 2 Norbert 2012-06-20 20:16:04 UTC
Hi Tim-Philipp,

Thanks for your comment.  I am seeing the same problem with the 0.10.31 version of gstsouphttpsrc.c.  

In addition, I would like to add a minor revision to my observations:
=====================================================================
I've noticed that it is NOT the subsequent gst_soup_http_src_chunk_allocator call that overwrites the error value GST_FLOW_ERROR of src->ret.  The gst_soup_http_src_got_chunk_cb does.  

Revised paragraph for "What has been observed:"
===============================================
gst_soup_http_src_create initializes src->ret to GST_FLOW_CUSTOM_ERROR (GstSoupHTTPSrc element instance structure) and then pends on src->ret to change to a different error value.  This is at least my understanding of this part of the code.  While gst_soup_src_create is pending, the following callbacks are being observed: a) gst_soup_http_src_got_headers_cb and b) gst_soup_http_src_got_chunk_cb.
The gst_soup_http_src_got_headers_cb calls gst_soup_http_src_parse_status.
gst_soup_http_src_parse_status detects an HTTP error and calls
GST_ELEMENT_ERROR.  In addition, gst_soup_http_src_parse_status assigns
GST_FLOW_ERROR to src->ret.  IMHO, a subsequent call to gst_soup_http_src_got_chunk_cb appears to overwrite the error status by assigning GST_FLOW_OK to src->ret.  As a consequence, the pending gst_soup_http_src_create returns a GST_FLOW_OK instead of GST_FLOW_ERROR as expected.  The gstreamer app receives a gst error message 404 Not Found, but I don't see an Internal Data Flow Error triggered inline.  

Behavior seen with file src:
============================
When setting an invalid "location" property (pathname does not match with any files on the drive), I see an Internal Data Flow error.  This would be next to a gst error message of "Resource Not Found".
 
Experiment2:
============
An additional experiment was run in an attempt to avoid that the error value of
src->ret gets overwritten.  The experimental code under "Experiment2:" was added instead of the one listed under "Experiment:".

static void
gst_soup_http_src_got_chunk_cb (SoupMessage * msg, SoupBuffer * chunk,
    GstSoupHTTPSrc * src)
{
...
  new_position = src->read_position + chunk->length;
  if (G_LIKELY (src->request_position == src->read_position))
    src->request_position = new_position;
  src->read_position = new_position;

/* START OF EXPERIMENTAL CODE */
  if (src->ret == GST_FLOW_CUSTOM_ERROR )
  {
      src->ret = GST_FLOW_OK;
  }
/* END OF EXPERIMENTAL CODE */

  g_main_loop_quit (src->loop);
  gst_soup_http_src_session_pause_message (src);
}
Comment 3 Tim-Philipp Müller 2012-06-20 22:02:03 UTC
Well, I think you're onto something, though I'm not sure how much it should matter really. The reason you're not getting a follow-up internal flow error is that the pad_push returns wrong-state rather than an error, which will cause the source to shut down silently.

Perhaps you'd like to attempt a patch that makes it return GST_FLOW_ERROR from create() as it should?
Comment 4 Norbert 2012-06-21 02:01:01 UTC
Hi Tim Philipp,

Thanks for your comment.  

Wrt "Perhaps you'd like to attempt a patch that makes it return GST_FLOW_ERROR from create() as it should?"...  Would this involve a change to gst_soup_http_src_got_headers_cb?  A change to quit the g_main_loop prior to the callback returning AND whenever src->ret is GST_FLOW_ERROR?  Please see "Experiment:" in initial problem description.  If this is the case I am not sure whether a patch would involve a mod to create.  

Please point me in the right direction if I am missing out on something here.

Thanks,

Norbert
Comment 5 Norbert 2012-06-22 21:56:52 UTC
Created attachment 217051 [details] [review]
Patch for 678429

Attaching patch as requested.  Please provide feedback.  

Without this patch we see the following:
========================================
1) gst_soup_http_src_got_headers_cb detects error and Gst Error message 404 Not Fount is being sent.  In addition, src->ret is being set to GST_FLOW_ERROR.
2) g_main_loop is NOT being quit.
3) gst_soup_http_src_got_chunk_cb gets called with 322 bytes of data (see HTML body below)
4) gst_soup_http_src_got_chunk_cb overwrites error code in src->ret with GST_FLOW_OK.
5) gst_soup_http_src_create returns GST_FLOW_OK (instead of GST_FLOW_ERROR to trigger Internal Data Flow Error)
6) The 322 bytes (of MIME type html/text) are pushed over the src pad of the Soup HTTP source element.

With the patch we see the following:
========================================
1) gst_soup_http_src_got_headers_cb detects error and Gst Error message 404 Not Fount is being send.  In addition, src->ret is being set to GST_FLOW_ERROR.
2) g_main_loop is being quit.
3) gst_soup_http_src_create returns GST_FLOW_ERROR which triggers an Internal Data Flow Error.
4) No data is being pushed over the src pad of the Soup HTTP source element.




<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> 
<html><head> 
<title>404 Not Found</title> 
</head><body> 
<h1>Not Found</h1> 
<p>The requested URL /mp4/1Winnie The Pooh - Official Trailer [HD].mp4 was not 
found on this server.</p> 
<hr> 
<address>Apache/2.2.15 (Fedora) Server at 10.4.2.100 Port 80</address> 
</body></html> 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> 
<html><head> 
<title>404 Not Found</title> 
</head><body> 
<h1>Not Found</h1> 
<p>The requested URL /mp4/1Winnie The Pooh - Official Trailer [HD].mp4 was not 
found on this server.</p> 
<hr> 
<address>Apache/2.2.15 (Fedora) Server at 10.4.2.100 Port 80</address> 
</body></html>
Comment 6 Tim-Philipp Müller 2013-01-01 19:17:36 UTC
It's been a while, but this should be fixed now. Thanks for the bug report and the patch! 

commit baac8ad66340f11c24e24817be3c620b2c8b8032
Author: Tim-Philipp Müller <tim@centricular.net>
Date:   Tue Jan 1 19:14:36 2013 +0000

    tests: add test for souphttpsrc error handling with data
    
    https://bugzilla.gnome.org/show_bug.cgi?id=678429

commit c00b142f4482a58bf3409c3375c29e8f33948926
Author: Norbert Waschbuesch <nwaschbu@opentv.com>
Date:   Fri Jun 22 21:56:52 2012 +0000

    souphttpsrc: error out properly when receiving data along with an error status
    
    When receiving an error code from the http server, such as 404,
    data might be sent along with it, like a web page. We don't want
    to output that data in this case, and we also want to pass the
    FLOW_ERROR return back to the base class, so it can stop properly.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=678429

(Hope I infered your proper name correctly).