After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 338827 - [patch] gnomevfssrc should use async api to not block on network problems
[patch] gnomevfssrc should use async api to not block on network problems
Status: RESOLVED INCOMPLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
git master
Other Linux
: High major
: git master
Assigned To: Stefan Sauer (gstreamer, gtkdoc dev)
GStreamer Maintainers
: 319873 370769 437652 (view as bug list)
Depends on:
Blocks: 313049 319873
 
 
Reported: 2006-04-17 21:32 UTC by Stefan Sauer (gstreamer, gtkdoc dev)
Modified: 2009-07-27 21:18 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
first (no yet working) attempt for async api (12.59 KB, patch)
2006-04-17 21:41 UTC, Stefan Sauer (gstreamer, gtkdoc dev)
needs-work Details | Review
2nd try to make the change (works now) (2.72 KB, patch)
2006-04-19 21:48 UTC, Stefan Sauer (gstreamer, gtkdoc dev)
rejected Details | Review
3rd try (works find from gst-launch) (16.97 KB, patch)
2006-04-25 07:09 UTC, Stefan Sauer (gstreamer, gtkdoc dev)
none Details | Review
Open non-local URIs in streaming thread (3.05 KB, patch)
2006-10-11 18:47 UTC, René Stadler
none Details | Review
do state transiton in alternate thread (2.80 KB, patch)
2006-10-22 07:11 UTC, James "Doc" Livingston
none Details | Review
4th attempt (using GCond instead of GMainLoop and sleep...) (8.57 KB, patch)
2006-11-28 18:40 UTC, Marc-Andre Lureau
none Details | Review
Fully async HTTP source based on libsoup-2.2 (14.70 KB, patch)
2007-11-15 11:14 UTC, Wouter Cloetens
none Details | Review
Fully async HTTP source based on libsoup-2.2 (15.07 KB, patch)
2007-11-15 11:19 UTC, Wouter Cloetens
none Details | Review

Description Stefan Sauer (gstreamer, gtkdoc dev) 2006-04-17 21:32:53 UTC
Currently gnomevfssrc blocks when network problems occur (like getting out of WLAN area). The element should use the async api of gnomevfs together with a timeout mechanism.
Comment 1 Stefan Sauer (gstreamer, gtkdoc dev) 2006-04-17 21:41:15 UTC
Created attachment 63743 [details] [review]
first (no yet working) attempt for async api

still works for local files, but when using uris (http:// or file://) the reads fail
Comment 2 Stefan Sauer (gstreamer, gtkdoc dev) 2006-04-19 21:48:20 UTC
Created attachment 63905 [details] [review]
2nd try to make the change (works now)

can we get rid of the usleep(100)? any idea?
Comment 3 Stefan Sauer (gstreamer, gtkdoc dev) 2006-04-20 12:28:24 UTC
Comment on attachment 63905 [details] [review]
2nd try to make the change (works now)

wrong patch
Comment 4 Stefan Sauer (gstreamer, gtkdoc dev) 2006-04-25 07:09:15 UTC
Created attachment 64258 [details] [review]
3rd try (works find from gst-launch)

When this patch is applied gnomevfssrc becomes very fragile in respect to a running mainloop. Whenever the mainloop is not running operation can quickly timeout. This especially is a problem when setting the pipeline to NULL. Right now one needs to kick then mainloop manually (g_main_context_iteration(NULL,FALSE);).

This sounds very wrong.
Comment 5 René Stadler 2006-10-08 03:47:30 UTC
This bug seems to cause problems in rhythmbox (see backtrace in #313049 comment #76).  It should also cause problems for any other app using gnomevfssrc.  As the bug had no activity for almost six months and the provided solution seems to be unusable, I'll restate the problem a bit.

The culprit is the start () GstBaseSrc vfunc implementation of gnomevfssrc.  According to the documentation "subclasses should open resources and prepare to produce data".  This effectively gets called by gst_element_set_state (), which must not block.  For a network source, "preparing to produce data" usually means to establish a network connection, which is obviously not guaranteed to be an instantaneous operation.  Therefore, this breaks the semantics of gst_element_set_state (), causing terrible problems for applications.

A very simple method to reproduce the problem is using the netcat program like this: "nc -l -p 10000 localhost".  This simulates a TCP server that eats all data sent to it but never sends anything back.
"gst-launch-0.10 gnomevfssrc location=http://localhost:10000 ! fakesink" will then hang while setting the state to PAUSED (literally, that is to say in gst_element_set_state ()).
Comment 6 Alex Lancaster 2006-10-08 03:59:34 UTC
(In reply to comment #5)
> (see backtrace in #313049 comment #76).  

Generate correct autolink:

bug #313049 comment #76
Comment 7 Alex Lancaster 2006-10-09 03:08:35 UTC
Changing from "enhancement" to "normal" severity because it is a bug (it breaks the state_change() semantics).
Comment 8 René Stadler 2006-10-11 18:47:36 UTC
Created attachment 74506 [details] [review]
Open non-local URIs in streaming thread

This implements basically what Wim stated in bug #319873 comment #4.  The approach is inherently a little imperfect as the time between start() and the first create() isn't used for establishing the connection in the background, but it's better than the current behaviour.

It is assumed that local URIs are opened instantly, so I made start() to act as before in this case, which is needed so that get_size() and is_seekable() work (which also seem to be called from the main thread).
Comment 9 René Stadler 2006-10-20 14:32:04 UTC
Of course my quickfix just moves the problem elsewhere, as the streaming thread is blocked instead.  It would make sense if the operation were cancellable.

neonhttpsrc (currently in gst-plugins-bad) uses the neon http library directly and seems to suffer from the same problem.  The DAAP source element inside Rhythmbox also blocks in start() (bug #362226).  This indicates that the documentation for GstBaseSrc needs to be extended: At least start, get_size and is_seekable vfuncs are called by gst_element_set_state and must never block.  As far as I can see, this isn't indicated in the current documentation at all.
Comment 10 Wim Taymans 2006-10-20 15:25:57 UTC
Normally state changes should never block. I have no real idea how to handle blocking state changes yet...
Comment 11 Alex Lancaster 2006-10-21 08:42:37 UTC
Confirming that this is indeed a bug.
Comment 12 James "Doc" Livingston 2006-10-22 07:11:03 UTC
Created attachment 75180 [details] [review]
do state transiton in alternate thread

I recently had a idea for a completely different way of handling this. Rather than using gnomevfs's asynchronous API, which doesn't help when a vfunc must have a result before returning, we could push the entire state transition into another thread, which can block.

This patch makes gnomevfssrc intercept READY->PAUSED (and the reverse) by creating a new thread which calls the parent state transition function, and retuning ASYNC. I'm not sure whether this method would have other issues, and it definitely need testing, but it might work.
Comment 13 Wim Taymans 2006-10-22 11:31:12 UTC
returning ASYNC from a source plugin should theoretically work. The idea would indeed be to start the open() in a different thread (which must also be cancelable or you might end up with a dozen threads hanging around). 

The thing is that basesrc has no idea that starting the task to perform the read() should be delayed until the state change completes. An alternative way would something like this:

start { 
  /* check URI, do all the stuff that does not block. BaseSrc will start a */
  /* task. */
  ...
  /* assume all this to be TRUE or based on the uri protocol. */
  if (file://,...) {
    seekable = TRUE;
    random_access = TRUE;
  }
  else {
    seekable = FALSE;
    random_access = FALSE;
  }
  return TRUE;
}

stop {
   /* try to cancel all actions that can block the streaming thread */
   cancel_open();
   cancel_read();
   return TRUE;
}

change_state {
   ..

   res = parent_change_state()   /* (de)activates the pad and starts the task */
   
   READY_TO_PAUSED:
     res = ASYNC;                /* overwrite result to be ASYNC */
     break;
   PAUSED_TO_READY:
     opened = FALSE;
     break;
 
   return res;
}

create {
  if (!opened) {
     open();         /* blocks but is cancelable */
     opened = true;
     commit_state()  /* commit state from READY to PAUSED */
  }
  read();
}

In fact, you might just want to skip the ASYNC stuff entirely, as long as you don't produce data, the sinks will return ASYNC anyway.

It's still awkward because after opening the device, basesrc will need some info (seekable, random access, ...) which you don't know yet. Assuming values based on the uri protocol might work just fine.

Also, when operating in random access mode, the sequence is a little different because the start/stop methods will be called when downstream activates itself (so before the change_state). Above code will handle that.


Comment 14 Wim Taymans 2006-10-22 11:44:30 UTC
Hmm. I'm repeating myself... If patch in Comment #8 is made cancelable and seekable/random_access only enabled for local uris (opened in start()), this can go in.

As further enhancements a simple method could be added to basesrc, called after open(), that (re)checks seekability. If we really care to make it ASYNC, it could do the commit stuff as well.
Comment 15 Wim Taymans 2006-10-22 12:14:30 UTC
*** Bug 319873 has been marked as a duplicate of this bug. ***
Comment 16 James "Doc" Livingston 2006-11-10 07:47:16 UTC
*** Bug 370769 has been marked as a duplicate of this bug. ***
Comment 17 Stefan Sauer (gstreamer, gtkdoc dev) 2006-11-28 09:06:10 UTC
rene, for your patch from #8, I fail to see when the code to make get_size() and is_seekable() work gets executed for non_local streams.

We currently have the problem, that for upnpav servers the get_info stuff can block for 30 seconds :( so that needs to get moved out of _start too.
Comment 18 René Stadler 2006-11-28 12:48:14 UTC
Looks to me like gnome-vfs (in its current form) is unsuitable for use by
gstreamer at all.  To recap the whole picture as it occurs to me:

gnomevfssrc as-is uses gnome_vfs_open_uri in ->start, which blocks and therefore
possibly makes gst_element_set_state hang (very bad).  I provided a small patch
to move this blocking call into the streaming thread, so it will hang there
instead (slightly less bad).  James provided a better patch that does the
blocking calls in an extra thread, which should free applications from hanging
because they call innocent gstreamer functions (even less bad but still bad as
the threads are leaked until the operation times out).  These are workarounds to ensure people don't lose data because their application locks up.

This would be cleanly solvable if the blocking calls of gnome-vfs were
cancellable.  It looks like that they are not.  There is something about
cancellation in the API, but only for use by methods (backends).  So no matter
what we come up with, the best possible solution involves leaking threads that
hang for a timeout of several minutes.

Therefore, we are left with the async API of gnome-vfs as the only opportunity
to get a clean solution.  However, the patch Stefan made to convert gnomevfssrc
over to the async calls does not work correctly.  The problem manifests itself
in these lines of code:
  
  /* wait for callback to be called */
  while (src->async_op_pending) {
    if (g_main_context_pending (NULL)) {
      g_main_context_iteration (NULL, FALSE);

Iterating a main context is only allowed from a single thread!  That's why
programs with a main loop become unstable with the patch.  gst-launch works
because it just polls the bus.  So no matter what, these lines have to go.  If
the thread just sleeps instead, everything should be fine as long as the
application iterates its default GMainContext.  Doing so is not a requirement
for using gstreamer at all - and that's the fundamental difference between
gstreamer and gnome-vfs.

Applying the patch (fixed in the way as described) would therefore mean that any
application that uses gnomevfssrc _must_ run a GMainLoop for the default
GMainContext (or iterate the context in regular intervals in another way).  I
doubt making this a requirement is acceptable.

Looking for a solution in gnome-vfs might be the best idea.  Its assumption that
the application iterates the default main context seems silly from the gstreamer
point of view.  Then again, it's "gnome-vfs" and not "gvfs".
Comment 19 Stefan Sauer (gstreamer, gtkdoc dev) 2006-11-28 14:10:20 UTC
Regarding the seeking in Comment #14 please also think about Bug #54590. Seekabillity not just depends on wheter the file is local or not.
Comment 20 Marc-Andre Lureau 2006-11-28 18:34:38 UTC
Starting from the 3rd patches of stefan, with a bit of rewritting, I changed the async_wait() function. Does it solve the bug?

- no more main GMainContext
- no more main loop.

static gboolean
gst_gnome_vfs_async_wait(GstGnomeVFSSrc * src)
{
  gboolean res;
  GTimeVal timeval;

  g_get_current_time (&timeval);
  g_time_val_add (&timeval, src->timeout * 1000000);
  /* FIXME: the mutex is not useful (btw, it should NOT be async_op_mutex, I think) */
  if (g_cond_timed_wait (src->async_op_cond, src->async_op_mutex, &timeval) == FALSE) {
    gnome_vfs_async_cancel (src->handle);
    res = FALSE;
    GST_INFO_OBJECT (src, "Timeout occured during gnomevfs async call with %f", src->timeout);
  } else {
    src->async_op_pending = FALSE;
    res = TRUE;
  }

  return res;
}
Comment 21 Marc-Andre Lureau 2006-11-28 18:40:47 UTC
Created attachment 77305 [details] [review]
4th attempt (using GCond instead of GMainLoop and sleep...)

For COMMENTS, not completely tested.
Comment 22 René Stadler 2006-11-28 18:50:01 UTC
Marc-Andre: No, it cannot.  The problem is in gnome-vfs, which for example uses the convenience of g_idle_add to marshal some callbacks to the so-called main thread.   If the application does not iterate the default main context at all, these callbacks are never called.
Comment 23 Marc-Andre Lureau 2006-11-29 08:23:59 UTC
Is it possible that with the GCond wait, the main thread can be stuck? And so, the 4th attempt does not help at all?
Comment 24 antony.gelberg 2007-06-10 23:07:56 UTC
Hi, any updates on this bug?  Nothing for six months now.  (I became interested after viewing bug 362226.)
Comment 25 Stefan Sauer (gstreamer, gtkdoc dev) 2007-06-11 16:14:42 UTC
I belive this will be reattempt with gvfs - the next generation gnome vfs that currently get developed.
Comment 26 Thomas Vander Stichele 2007-06-30 13:24:51 UTC
This is indeed a pretty bug, blocking UI for any GStreamer using application that uses gnomevfssrc.  Which is basically all of them (Totem, Elisa, ..)

Upping priority.
Comment 27 Tim-Philipp Müller 2007-09-08 21:06:18 UTC
*** Bug 437652 has been marked as a duplicate of this bug. ***
Comment 28 Sebastian Dröge (slomo) 2007-09-28 10:07:15 UTC
Would be nice if people could test the gio plugin from gst-plugins-bad to make sure this issue doesn't happen again with gio/gvfs. Unfortunately the only existing network protocol support in gvfs is currently samba...
Comment 29 Wouter Cloetens 2007-10-01 12:55:58 UTC
For people like myself who need a quick fix, I've patched libneon to expose the socket fd, and implemented interruption of the neonhttpsrc's create() method by implementing unlock/stop_unlock.
Patches for libneon and gst-plugins-bad on ftp://ftp.mind.be/GStreamer/
Comment 30 Wouter Cloetens 2007-11-15 11:14:14 UTC
Created attachment 99136 [details] [review]
Fully async HTTP source based on libsoup-2.2

This adds a fully async HTTP source based on libsoup.
start() returns immediately. URL parsing, DNS lookup, connection, HTTP negotiation and reception of data is completely asynchronous (with use of a new GMainLoop) and handled in create().
This is a bare-bones implementation and lacks many of the features on neonhttpsrc. HTTPS should work too, but was not tested.
Comment 31 Wouter Cloetens 2007-11-15 11:19:21 UTC
Created attachment 99137 [details] [review]
Fully async HTTP source based on libsoup-2.2

Forgot Makefile.am.
Comment 32 Wouter Cloetens 2007-11-15 11:29:15 UTC
Bug 497020 to handle souphttpsrc.
Comment 33 Luca Ferretti 2008-01-18 13:49:28 UTC
Just a note: both GnomeVFS and libsoup-2.2 are going to be deprecated.

GnomeVFS will be replaced by GIO - bug #510393
libsoup-2.2 --> libsoup-2.4 with a totally new API see http://mail.gnome.org/archives/desktop-devel-list/2008-January/msg00098.html
Comment 34 Wouter Cloetens 2008-01-18 14:30:07 UTC
Already working on libsoup-2.4 support...
Comment 35 Wouter Cloetens 2008-01-20 01:07:36 UTC
libsoup-2.4 support implemented, see bug 510708.
Comment 36 Edward Hervey 2009-03-11 08:29:25 UTC
Wouter, Stefan : *IF* this bug is still valid, there's a better alternative to fixing this issue, by not doing the connection in NULL=>READY (GstBaseSrc.start()) but in READY=>PAUSED (.render()).

This should be done for all sources for which we can't guarantee an immediate response (anything 'potentially' involving network).
Comment 37 Tim-Philipp Müller 2009-03-11 10:03:59 UTC
> Wouter, Stefan : *IF* this bug is still valid, there's a better alternative to
> fixing this issue, by not doing the connection in NULL=>READY
> (GstBaseSrc.start()) but in READY=>PAUSED (.render()).
> 
> This should be done for all sources for which we can't guarantee an immediate
> response (anything 'potentially' involving network).

This only hides the problem, it doesn't fix it at all. You might still get blocking when shutting down if you do this (e.g. when user starts playing something, then hits the next button).

Having said that, I think we should just declare gnomevfs* stuff as WONTFIX and deprecate the gnomevfs elements and remove the experimental status from the gio elements.


Comment 38 Stefan Sauer (gstreamer, gtkdoc dev) 2009-03-11 13:40:26 UTC
Agree with tim, but imho gio still has one (minor) regression - handling the authentication. rene should know more.
Comment 39 Sebastian Dröge (slomo) 2009-03-11 14:22:51 UTC
(In reply to comment #38)
> Agree with tim, but imho gio still has one (minor) regression - handling the
> authentication. rene should know more.

Also GIO needs to support non-default main contexts for it's async operations before we can use them. Until then it's only as good as gnomevfs in this regard (you can cancel sync operations from other threads).

The authentication problem is, that we a) can't do authentication inside the plugin because it also runs on the default main context and b) we need to tell the application about the authentication issue, the application then has to show a dialog, mount the filesystem and then tell the plugin that everything is good now.
Comment 40 René Stadler 2009-03-11 22:57:26 UTC
(In reply to comment #39)
> Also GIO needs to support non-default main contexts for it's async operations
> before we can use them. Until then it's only as good as gnomevfs in this regard
> (you can cancel sync operations from other threads).
> 
> The authentication problem is, that we a) can't do authentication inside the
> plugin because it also runs on the default main context and b) we need to tell
> the application about the authentication issue, the application then has to
> show a dialog, mount the filesystem and then tell the plugin that everything is
> good now.
> 

And that's the problem -- you can't interact with the user to get the authentication during a state change. That is why it's OK that Alex closed my feature request for mainloop-less async mount operation as WONTFIX. It's either sync or you error out anyways: The app needs to gio-mount any location itself before setting the state. This is only a regression if you consider the old gnomevfs elements' solution as working, which I don't. It was just a giant hack IMO...

It should still be possible to make the gio elements attempt to mount theirselves, but it's impossible to provide credentials so this would only fix password-less locations (good enough for gst-launch and lazy apps though).
Comment 41 Kris Thomsen 2009-07-27 21:18:30 UTC
Closing this bug report as no further information has been provided. Please feel free to reopen this bug if you can provide the information asked for.
Thanks!