Bug 565501 - GIO could benefit from having fibers
Status: RESOLVED WONTFIX
Product: glib
Classification: Platform
Component: gio
Version: unspecified
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Depends on:
Blocks:
 
 
Reported: 2008-12-23 20:40 UTC by David Zeuthen (not reading bugmail)
Modified: 2013-11-26 17:31 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Example program (3.63 KB, text/plain)
2008-12-23 20:41 UTC, David Zeuthen (not reading bugmail)
Interface (1.97 KB, text/plain)
2008-12-23 20:41 UTC, David Zeuthen (not reading bugmail)
UNIX implementation (3.23 KB, text/plain)
2008-12-23 20:42 UTC, David Zeuthen (not reading bugmail)
Vala example (1.12 KB, text/plain)
2008-12-24 08:19 UTC, Jürg Billeter
Using libcoro (43.32 KB, patch)
2009-05-13 04:36 UTC, David Zeuthen (not reading bugmail)
Updated patch (81.80 KB, patch)
2009-06-10 06:53 UTC, David Zeuthen (not reading bugmail)
GThread based approach (55.60 KB, patch)
2009-06-11 23:42 UTC, David Zeuthen (not reading bugmail)

Description David Zeuthen (not reading bugmail) 2008-12-23 20:40:20 UTC
Alex writes about continuations here

 http://blogs.gnome.org/alexl/2008/09/16/async-io-made-easy-using-javascript/

which is a neat way to do async IO. A comment mentioned

 http://felipec.wordpress.com/2008/09/28/continuations-in-c-easy-asynchronous-stuff/

which again links to

 http://software.schmorp.de/pkg/libcoro.html

Anyway, I decided to try to implement this as GLib API. Will attach some code; I only spent a couple of hours doing this so it's pretty rough but should show that it's possible to do.
Comment 1 David Zeuthen (not reading bugmail) 2008-12-23 20:41:06 UTC
Created attachment 125226 [details]
Example program
Comment 2 David Zeuthen (not reading bugmail) 2008-12-23 20:41:47 UTC
Created attachment 125227 [details]
Interface

It's probably overkill to use GObject for this but meh.
Comment 3 David Zeuthen (not reading bugmail) 2008-12-23 20:42:37 UTC
Created attachment 125228 [details]
UNIX implementation
Comment 4 David Zeuthen (not reading bugmail) 2008-12-23 20:43:52 UTC
Output from example program

$ ./gfiber /etc/hosts
running fiber
Going to read '/etc/hosts'
returned to main
** (process:16868): DEBUG: Got the GFileInputStream. Now sleeping 3000 msec from fiber
in print_mark()
in print_mark()
in print_mark()
in print_mark()
in print_mark()
in print_mark()
** (process:16868): DEBUG: Done sleeping 3000 msec. Now reading data from stream.
Read 191 bytes: '# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1		localhost.localdomain localhost x61
::1		localhost6.localdomain6 localhost6
'

Comment 5 David Zeuthen (not reading bugmail) 2008-12-23 20:45:17 UTC
Note that

 http://software.schmorp.de/pkg/libcoro.html

suggests this should work on most unices and that it's possible to implement for Win32 as well. On Unix, swapcontext() + friends is required by POSIX.1-2001.
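
For readers who have not used these calls, here is a minimal sketch of a coroutine switch built on getcontext(), makecontext() and swapcontext(). It is not taken from the attached files: demo_fiber_func and demo_run_fiber are made-up names, the 64 KB malloc'ed stack is an arbitrary choice for the demo, and it assumes a Linux/glibc system.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define DEMO_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

static ucontext_t main_ctx;    /* saved state of the caller */
static ucontext_t fiber_ctx;   /* saved state of the fiber */
static int trace[4];           /* order in which the code ran */
static int trace_len;

/* Hypothetical fiber body: runs, yields once, then finishes. */
static void demo_fiber_func(void)
{
    trace[trace_len++] = 1;
    swapcontext(&fiber_ctx, &main_ctx);   /* yield back to the caller */
    trace[trace_len++] = 3;
    /* Returning continues at uc_link, which is main_ctx. */
}

/* Runs the fiber to completion; returns the observed order as digits. */
int demo_run_fiber(void)
{
    char *stack = malloc(DEMO_STACK_SIZE);
    int order = 0;

    if (stack == NULL)
        return -1;
    trace_len = 0;
    if (getcontext(&fiber_ctx) != 0) {
        free(stack);
        return -1;
    }
    fiber_ctx.uc_stack.ss_sp = stack;
    fiber_ctx.uc_stack.ss_size = DEMO_STACK_SIZE;
    fiber_ctx.uc_link = &main_ctx;
    makecontext(&fiber_ctx, demo_fiber_func, 0);

    swapcontext(&main_ctx, &fiber_ctx);   /* run until the first yield */
    trace[trace_len++] = 2;
    swapcontext(&main_ctx, &fiber_ctx);   /* resume until the fiber returns */
    trace[trace_len++] = 4;

    assert(trace_len == 4);
    for (int i = 0; i < trace_len; i++)
        order = order * 10 + trace[i];
    free(stack);
    return order;
}
```

Calling demo_run_fiber() interleaves the two sides strictly: fiber, caller, fiber, caller.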
Comment 6 David Zeuthen (not reading bugmail) 2008-12-23 20:46:49 UTC
Also, am totally not married to the name 'fiber' or what the interface looks like. I just wanted to show it's possible to do this in a relatively nice way in C.
Comment 7 David Zeuthen (not reading bugmail) 2008-12-23 21:54:48 UTC
Notes about implementing this on Win32

http://www.codeproject.com/KB/threads/ucontext.aspx

And, for the record, we should always be able to emulate fibers just using threads (and a mutex to guarantee that the fiber and the main thread don't run at the same time).
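
A rough sketch of that thread-based emulation, using POSIX threads: a "turn" flag protected by a mutex and a condition variable acts as a baton, so only one side runs at any time. All names here (DemoThreadFiber, demo_switch, demo_run_thread_fiber) are hypothetical and not part of any attached code.

```c
#include <assert.h>
#include <pthread.h>

enum { DEMO_TURN_MAIN, DEMO_TURN_FIBER };

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int turn;        /* which side is allowed to run right now */
    int log[4];      /* order in which the two sides ran */
    int log_len;
} DemoThreadFiber;

/* Hand the baton to `to`, then block until it is handed back to `me`. */
static void demo_switch(DemoThreadFiber *f, int me, int to)
{
    pthread_mutex_lock(&f->lock);
    f->turn = to;
    pthread_cond_broadcast(&f->cond);
    while (f->turn != me)
        pthread_cond_wait(&f->cond, &f->lock);
    pthread_mutex_unlock(&f->lock);
}

static void *demo_thread_main(void *data)
{
    DemoThreadFiber *f = data;

    /* Do not start until the creator resumes us for the first time. */
    pthread_mutex_lock(&f->lock);
    while (f->turn != DEMO_TURN_FIBER)
        pthread_cond_wait(&f->cond, &f->lock);
    pthread_mutex_unlock(&f->lock);

    f->log[f->log_len++] = 1;
    demo_switch(f, DEMO_TURN_FIBER, DEMO_TURN_MAIN);   /* "yield" */
    f->log[f->log_len++] = 3;

    /* Finished: hand the baton back for good. */
    pthread_mutex_lock(&f->lock);
    f->turn = DEMO_TURN_MAIN;
    pthread_cond_broadcast(&f->cond);
    pthread_mutex_unlock(&f->lock);
    return NULL;
}

/* Returns the observed execution order encoded as digits. */
int demo_run_thread_fiber(void)
{
    DemoThreadFiber f = { .turn = DEMO_TURN_MAIN };
    pthread_t thread;
    int order = 0;

    pthread_mutex_init(&f.lock, NULL);
    pthread_cond_init(&f.cond, NULL);
    if (pthread_create(&thread, NULL, demo_thread_main, &f) != 0)
        return -1;

    demo_switch(&f, DEMO_TURN_MAIN, DEMO_TURN_FIBER);  /* "resume" */
    f.log[f.log_len++] = 2;
    demo_switch(&f, DEMO_TURN_MAIN, DEMO_TURN_FIBER);  /* "resume" */
    f.log[f.log_len++] = 4;

    pthread_join(thread, NULL);
    pthread_cond_destroy(&f.cond);
    pthread_mutex_destroy(&f.lock);

    assert(f.log_len == 4);
    for (int i = 0; i < f.log_len; i++)
        order = order * 10 + f.log[i];
    return order;
}
```

Every switch here goes through the kernel scheduler, so it is heavier than a user-space context swap, but it relies on nothing beyond pthreads.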
Comment 8 Matthias Clasen 2008-12-23 21:56:34 UTC
>  fiber->stack = g_malloc0 (SIGSTKSZ);
>  fiber->stack_size = SIGSTKSZ;

Not sure this is going to work in general. The heap is usually not executable.
The libc manual recommends either allocating the memory on the original thread's stack or using mmap.
Comment 9 Behdad Esfahbod 2008-12-23 23:46:39 UTC
(In reply to comment #8)
> >  fiber->stack = g_malloc0 (SIGSTKSZ);
> >  fiber->stack_size = SIGSTKSZ;
> 
> Not sure this is going to work in general. The heap is usually not executable.
> The libc manual recommends to either allocate the memory on the original
> threads stack, or use mmap.

Why does the stack need to be executable?
Comment 10 Matthias Clasen 2008-12-24 00:16:59 UTC
It doesn't, in well-behaved applications.
Comment 11 Matthias Clasen 2008-12-24 00:20:54 UTC
Also worth pointing out that these fibers have a fixed stack size of 

#define SIGSTKSZ	8192

bytes (gotta love those unpronounceable names). Which is not a lot, considering we do things like

  char buffer[1024*64], *p;

in other places in gio...
Comment 12 Jürg Billeter 2008-12-24 08:19:22 UTC
Created attachment 125262 [details]
Vala example

Very interesting. FYI, I just implemented basic coroutine support for async methods in Vala. It has the same goal but is implemented at the language level: the backend generates goto statements and stores local variables on the heap. We could of course also use the proposed fibers as an alternative implementation.

I've attached an example in Vala that does almost exactly the same as the C example but without fibers.
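
To illustrate the general technique described here (this is a hand-written sketch, not actual Vala compiler output), a coroutine can be expressed in plain C as a state machine: the "locals" live in a heap-allocated struct, and each resume jumps with goto to the point after the last yield. DemoCounterCo and its functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Heap-allocated frame: everything that must survive across a yield. */
typedef struct {
    int state;   /* 0 = not started, 1 = suspended in the loop, 2 = done */
    int i;       /* loop counter, a "local variable" of the coroutine */
    int sum;     /* running total, also a "local variable" */
} DemoCounterCo;

DemoCounterCo *demo_counter_new(void)
{
    return calloc(1, sizeof(DemoCounterCo));
}

/*
 * Resume the coroutine. Returns 1 and stores the yielded value in *out
 * each time it yields, and 0 once it has finished.
 */
int demo_counter_resume(DemoCounterCo *co, int *out)
{
    assert(co != NULL && out != NULL);

    switch (co->state) {
    case 0:  goto start;
    case 1:  goto after_yield;
    default: return 0;
    }

start:
    for (co->i = 1; co->i <= 3; co->i++) {
        co->sum += co->i;
        *out = co->sum;
        co->state = 1;
        return 1;            /* "yield": unwind to the caller */
after_yield:
        ;                    /* execution continues here on resume */
    }
    co->state = 2;
    return 0;
}
```

No separate stack is needed with this approach, at the cost of every function that yields having to be transformed this way, which is why it suits a compiler better than hand-written code.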
Comment 13 David Zeuthen (not reading bugmail) 2008-12-24 17:07:17 UTC
(In reply to comment #8)
> >  fiber->stack = g_malloc0 (SIGSTKSZ);
> >  fiber->stack_size = SIGSTKSZ;
> 
> Not sure this is going to work in general. The heap is usually not executable.

Not convinced we want an executable stack; especially not since the primary use case of fibers is IO (e.g. gio or eggdbus)... so people will be using fibers for loading untrusted data most of the time probably into buffers allocated on the stack.

Typically only legacy apps need executable stacks. Since fibers are a new thing we don't have that problem. However, I can see some VMs that might want an executable stack; we should probably have a g_fiber_run_full() with all kinds of options.

> The libc manual recommends to either allocate the memory on the original
> threads stack, or use mmap.

Yeah, I looked a bit more into this. We probably just want to mmap() something like 1MB (we _definitely_ don't want to g_malloc0() it).

I'm guessing that 1MB of stack space (tunable via a _full() function) is a fine default as you're probably never going to have more than 10-20 active fibers at a time (unlike threads where you may have a separate event loop handling) and I think we can assume to have at least 2GB or 4GB of virtual address space.

Also, I was looking over the NPTL design docs. They go out of their way to reuse stacks because munmap() causes things like TLB flushes (stalling all CPUs) on IA32 and other architectures. So we should probably do something similar since we want fiber construction / destruction to be very efficient.
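
A minimal sketch of the idea, assuming a Linux-like system with MAP_ANONYMOUS: stacks are mmap()ed at 1 MB, and a released stack is parked in a one-slot cache instead of being unmapped right away. The function names and the single-slot policy are illustrative assumptions only; NPTL's real cache is more elaborate.

```c
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define DEMO_STACK_SIZE (1024 * 1024)   /* 1 MB, the default floated above */

static void *demo_cached_stack;   /* one-slot cache of a released stack */

/* Returns a stack of DEMO_STACK_SIZE bytes, or NULL on failure. */
void *demo_stack_acquire(void)
{
    void *stack;

    if (demo_cached_stack != NULL) {
        /* Reuse the cached mapping: no mmap()/munmap() round trip. */
        stack = demo_cached_stack;
        demo_cached_stack = NULL;
        return stack;
    }

    /* Anonymous private mapping: pages only become resident when touched. */
    stack = mmap(NULL, DEMO_STACK_SIZE, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return stack == MAP_FAILED ? NULL : stack;
}

/* Gives a stack back; keeps one around for the next fiber. */
void demo_stack_release(void *stack)
{
    assert(stack != NULL);

    if (demo_cached_stack == NULL) {
        demo_cached_stack = stack;
        return;
    }
    munmap(stack, DEMO_STACK_SIZE);
}
```

A fuller version would probably also place a PROT_NONE guard page below the stack and evict cached stacks after a timeout.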
Comment 14 Christian Hergert 2009-02-24 09:53:39 UTC
Couple questions,

With modern languages using generational garbage collection and potentially moving items around, how will this handle those references potentially moving on the heap?

I take it that, since these are fibers, the abuse that will be caused to the locks for the main loop idle dispatch is acceptable (since it would rarely have contention if executing during the main loop)?

What about new fibers being created from existing fibers? Does this provide a way to make sure those start execution immediately after the executing fibers yield? You need to do that to get the maximum performance from the cpu cache lines since they almost always work with related data.

I think it would also make sense for an asynchronous toolkit to provide the fundamentals to build upon for such features as parallel_for(each) and sort (if not a basic map/reduce as well). Obviously, fibers are not the right approach for this unless you can have fibers execute on multiple threads.

Also, what about canceling of fibers?

FWIW, I've been working on many of these issues in GTask[1]. It's a bit more complex in that it manages threads, but that will be required if we really want multi-core efficiency anyway. As I see it, that's one of the benefits of asynchronous programming in general.

GTask also provides a rich feature set of callbacks/errbacks like twisted. This allows you to build your entire asynchronous workflow before any code is executed.

As for inheriting from GObject, I chose the same approach. It's been acceptable performance so far. What has been delightful is how much easier it is to bind into the higher languages.

Slides are also available from my talk on GTask at SCaLE 7x[2].

[1] http://audidude.com/blog/?p=51
[2] http://audidude.com/blog-content/gtask/gs.pdf
Comment 15 David Zeuthen (not reading bugmail) 2009-05-13 04:36:00 UTC
Created attachment 134540 [details] [review]
Using libcoro

Found some more time to work on this tonight

 1. We're now using a copy of libcoro for portability; notes

    - http://software.schmorp.de/pkg/libcoro.html

    - I've tested this on both Linux and Windows Vista. Works fine.
      - On win32 I used VS Express 2008 and prebuilt GLib packages from
        http://www.gtk.org/download-windows.html

    - There's a pthread backend in libcoro; that also works fine (see
      the Makefile for how to turn it on)

    - libcoro appears to be BSD licensed so shouldn't be a problem to
      include a copy (see libcoro/LICENSE in the patch)

    - libcoro appears to be somewhat actively maintained, see
      http://cvs.schmorp.de/libcoro/

 2. Slightly reworked the GFiber interface; the fiber func is now
    guaranteed to have a non-NULL GCancellable; the user can pass it
    in himself too.... there's also g_fiber_cancel(). User also gets
    the GFiber object back and can connect to the ::completed signal
    to get the return value when the fiber terminates.

    See the example for more details.

Remaining issues

 - still need to figure out how to do efficient stack allocation; right now
   I just g_malloc0() a megabyte... Shouldn't be hard to fix using mmap()
   etc. etc.

 - do we want to support multiple outstanding ops? If so we need to slightly
   change the way GAsyncReadyCallback et al. is integrated... not sure it's
   worth the effort, maybe it is. One approach is to do this

   foo_async_op (foo,
     cancellable,
     g_fiber_get_async_ready_callback (fiber),   /* GAsyncReadyCallback */
     g_fiber_get_async_ready_user_data (fiber, TAG0)); /* user_data */

   foo_async_another_op (foo,
     cancellable,
     g_fiber_get_async_ready_callback (fiber),   /* GAsyncReadyCallback */
     g_fiber_get_async_ready_user_data (fiber, TAG1)); /* user_data */

   switch (g_fiber_yield (fiber))
     {
     case TAG0:
       /* foo_async_op completed */
       break;
     case TAG1:
       /* foo_async_another_op completed */
       break;
     }

   but it seems like too much effort. Thoughts?

Apart from these two issues I think it's ready to go (sans docs and test cases).

$ diffstat gfiber-20090513.patch 
 Makefile        |   20 ++
 genpatch.sh     |   11 +
 gfiber.c        |  322 +++++++++++++++++++++++++++++++++++++++++++
 gfiber.h        |   92 ++++++++++++
 libcoro/LICENSE |   26 +++
 libcoro/README  |    6 
 libcoro/coro.c  |  399 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 libcoro/coro.h  |  294 +++++++++++++++++++++++++++++++++++++++++
 main.c          |  195 +++++++++++++++++++++++++++
 9 files changed, 1365 insertions(+)
Comment 16 David Zeuthen (not reading bugmail) 2009-05-13 04:43:39 UTC
(In reply to comment #14)
> Also, what about canceling of fibers?

I just added support for that.

> FWIW, I've been working on many of these issues in GTask[1]. It's a bit more
> complex in that it manages threads, but that will be required if we really want
> multi-core efficiency anyway. As I see it, thats one of the benefits of
> asynchronous programming in general.

GTask looks very interesting, especially the map/reduce part.

Anyway, I think the main reason for coroutine support isn't so much performance; I mean, if I wanted to utilize all cores, I'd go straight for threads and lock-free data structures and all that jazz. 

Instead, I think the main reason for coroutine support, the reason I'm interested in it anyway, is that it makes it very easy to use existing async APIs - especially since we have a well-established pattern (GAsyncReadyCallback) of how to model async APIs.
Comment 17 Alexander Larsson 2009-06-09 15:05:35 UTC
I like this a lot, but I want some additions/changes.

First of all, we want to mmap the stack (with malloc fallback), and we probably want to cache a few (or one?) stacks, freeing them after some timeout.

Secondly, we want to pass a GMainContext to the GFiber so that fibers can run in threads other than the main thread (given a GMainLoop + GMainContext in another thread). This of course requires support for other main contexts in gio, but that is being worked on. It also means using GSources in gfiber.c, not callback ids.

Then I want support for generic GSources as source for wakeups. g_fiber_sleep() should be implemented on top of this, and we should also add g_fiber_idle() and g_fiber_sleep_seconds().

I also would like to have support for multiple outstanding operations, but ideally that should not cause any extra work for the common one-operation case. I think this can be handled like this:

 int op1 = g_fiber_get_op_id ();
 foo_async_op (foo,
     cancellable,
     g_fiber_get_async_ready_callback (fiber),   /* GAsyncReadyCallback */
     g_fiber_get_async_ready_user_data (fiber)); /* user_data */

 int op2 = g_fiber_get_op_id ();
 foo_async_another_op (foo,
     cancellable,
     g_fiber_get_async_ready_callback (fiber),   /* GAsyncReadyCallback */
     g_fiber_get_async_ready_user_data (fiber)); /* user_data */

 int op = g_fiber_yield (fiber);
 if (op == op1)
  /* foo_async_op completed */
 if (op == op2)
  /* foo_async_another_op completed */

We would just keep a counter that we incremented on each async_ready or gsource we add. Then we return this counter id from yield. Zero code complication for the common case.
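
The bookkeeping behind this could look roughly like the following sketch. It only models the counter and the "which operation completed" queue; there is no real context switch, and DemoFiber and every demo_* function are hypothetical stand-ins rather than the API from the patch.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_PENDING 8

typedef struct {
    int next_op_id;                /* id the next registered op will get */
    int ready[DEMO_MAX_PENDING];   /* FIFO of completed operation ids */
    int ready_head;
    int ready_tail;
} DemoFiber;

/* What gets passed as user_data: remembers which op it belongs to. */
typedef struct {
    DemoFiber *fiber;
    int op_id;
} DemoOpClosure;

/* Peek at the id that the next registered operation will receive. */
int demo_fiber_get_op_id(const DemoFiber *f)
{
    return f->next_op_id;
}

/* Build the user_data for one operation; this consumes one id. */
DemoOpClosure *demo_fiber_get_user_data(DemoFiber *f)
{
    DemoOpClosure *c = malloc(sizeof *c);

    if (c == NULL)
        return NULL;
    c->fiber = f;
    c->op_id = f->next_op_id++;
    return c;
}

/* Stand-in for the shared ready callback: records the completed op. */
void demo_fiber_ready_callback(DemoOpClosure *c)
{
    DemoFiber *f = c->fiber;

    assert(f->ready_tail - f->ready_head < DEMO_MAX_PENDING);
    f->ready[f->ready_tail++ % DEMO_MAX_PENDING] = c->op_id;
    free(c);
}

/*
 * Stand-in for yield: a real fiber would suspend here until an
 * operation completes. Returns the completed op id, or -1 if none.
 */
int demo_fiber_yield(DemoFiber *f)
{
    if (f->ready_head == f->ready_tail)
        return -1;
    return f->ready[f->ready_head++ % DEMO_MAX_PENDING];
}
```

A caller that starts only one operation can ignore the value returned by yield, which matches the "zero complication for the common case" goal.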

I don't know why you used a signal for fiber completion. A more obvious
approach would be to use a GAsyncReadyCallback.

Maybe we could have a thread-local "current" fiber, similar to g_cancellable_get_current, which is automatically set when inside the fiber function. That way it's easy to use gfiber with other libraries where you can't pass the fiber down through various function calls (with no support for user_data).

As far as I understand you need various configure.in checks so that the automatic "backend" picking works in libcoro, like:
 HAVE_UCONTEXT_H, HAVE_SETJMP_H, HAVE_SIGALTSTACK
Comment 18 Alexander Larsson 2009-06-09 15:10:20 UTC
Also, the g_fiber_new() function is a bit weird. It will cause the fiber to be run, and the object it returns will basically never be used by the creator, but must still be freed.

I think a better approach would be to drop g_fiber_new for a g_fiber_run() that doesn't return a value, so you only see the GFiber reference inside the fiber and in the asyncresult callback.
Comment 19 David Zeuthen (not reading bugmail) 2009-06-10 06:53:28 UTC
Created attachment 136254 [details] [review]
Updated patch

OK, here's an updated version integrated into the GLib tree. 

diffstat 0001-Bug-565501-GIO-could-benefit-from-having-fibers.patch 
 configure.in                        |    1 
 docs/reference/gio/gio-docs.xml     |    1 
 docs/reference/gio/gio-sections.txt |   31 +
 docs/reference/gio/gio.types        |    2 
 gio/Makefile.am                     |   10 
 gio/gfiber.c                        |  846 ++++++++++++++++++++++++++++++++
 gio/gfiber.h                        |  105 ++++
 gio/gfiberprivate.c                 |  216 +++++++++
 gio/gfiberprivate.h                 |   41 +
 gio/gio.h                           |    1 
 gio/gio.symbols                     |   20 
 gio/gioenums.h                      |   14 
 gio/giotypes.h                      |   22 
 gio/libcoro/LICENSE.libcore         |   26 +
 gio/libcoro/Makefile.am             |   18 
 gio/libcoro/README.libcore          |    6 
 gio/libcoro/coro.c                  |  399 ++++++++++++++++
 gio/libcoro/coro.h                  |  294 ++++++++++++
 gio/tests/Makefile.am               |    7 
 gio/tests/fibers.c                  |  427 ++++++++++++++++++
 20 files changed, 2485 insertions(+), 2 deletions(-)

This patch should address most of the comments with the following caveats that more work is needed for

 - docs, tests and examples

 - code review

 - locking

 - caching/reuse/eviction of stacks

 - build system crap

 - thread-local "get current fiber"

This patch should address most of the points raised in comment 17 and comment 18. There's also tests for most of the code, see gio/tests/fibers.c (though I still want to add more tests).

Open questions:

 - should g_fiber_sleep() and friends take the GCancellable into account?
   Right now they don't. I suspect they could return G_MAXUINT (we'd define
   e.g. G_FIBER_EVENT_ID_CANCELLED to this) if this happens

 - actually, do g_fiber_sleep() and friends make sense at all?

 - should we make g_fiber_resume() public?

 - there are no properties on GFiber, should there be? I suspect only
   g_fiber_get_context() would be useful. I suspect anything but C/C++
   won't be able to use this and most other languages have coroutine
   support built right into the language anyway...

 - We most likely want to skip libcoro for Win32 and use the native
   fiber API there. See

    http://msdn.microsoft.com/en-us/library/ms682661%28VS.85%29.aspx

   This might affect our public GLib API. Some investigation is needed
   for this.

Anyway, I'm posting the patch now just to get a sense of whether this is the right direction before doing more work. Thanks for looking at it.
Comment 20 Alexander Larsson 2009-06-10 12:19:43 UTC
Looks good to me. Some comments:

want a way to get the priority of the fiber

want a way to get the cancellable of the fiber

I think g_fiber_sleep & co should use the priority of the fiber. If you want something else, use g_fiber_attach_source.
Comment 21 Benjamin Otte (Company) 2009-06-11 13:42:41 UTC
I don't like this at all. GStreamer was a heavy user of coroutines until 0.10 and it was a major source of problems that caused more issues than it solved.

Problems I remember are (core GStreamer developers might be able to add more):
- portability issues
From reading this bug, it seems better now, but I vaguely remember the syscalls having weird behaviors on various unices and that re
- memory issues
Either you allocate too much stack per fiber and take lots of RAM, or not enough and get stack overflows
- debugging is a pain
No tool has any clue about cothreads. Certainly not gdb, and probably valgrind, sysprof and friends don't, either. So you'll get backtraces that list g_fiber_run() and the function it called. Not useful.
- code has nonobvious behavior
g_fiber_yield() calls can actually change the whole program, including all variables you previously moved to the stack. This is not a problem in GC'ed languages, but it is in C. You will likely forget to ref all objects and copy all strings on the stack before calling yield and that'll SEGV your programs.
- cleanup
It's complicated to clean up a fiber as you'll need to unwind the stack, so a nontrivial amount of code might run (possibly yielding again?) in a g_object_unref (fiber).

Also, the example code looks very ugly.

I can agree though that fibers are an interesting idea in languages that have proper memory management (like Vala or Javascript) and integrate their debugging tools properly with their toolkits. But in those cases fibers are language features. I certainly think they should not be bolted on top of C by a library as high-level as glib.

(I'm sorry if I'm overreacting a bit here, but I feel like the nightmares from my GStreamer maintenance past are haunting me again.)
Comment 22 David Zeuthen (not reading bugmail) 2009-06-11 15:01:31 UTC
(In reply to comment #21)
> - portability issues
> From reading this bug, it seems better now, but I vaguely remember the syscalls
> having weird behaviors on various unices and that re

Yeah. If a particular Unix flavor is broken then I don't know what to do about that except for telling vendors to conform to POSIX.1-2001 (eight years old!).


But it seems like it works in libcoro which is used by Coro in Perl so I think we are good there.

For Win32 we will use the native fiber APIs.

We should test that this works on at least the latest FreeBSD, OS X, Solaris and Linux.

> - memory issues
> Either you allocate too much stack per fiber to take lots of RAM or not enough
> and get stack overflow

Not really an issue, we allocate the stack via mmap(2) so no RSS is ever used, only virtual address space (4GB at least). We also want to cache stacks.

> - debugging is a pain
> No tool has any clue about cothreads. Certainly not gdb, and probably valgrind,
> sysprof and friends don't, either. So if you'll get backtraces that list
> g_fiber_run() and the function it called. Not useful.

gdb actually works just fine:

Program received signal SIGSEGV, Segmentation fault.
read_file_in_fiber (fiber=0x611800, data=0x610400, cancellable=0x0, 
    error=0x7ffff71e3fa8) at fibers.c:48
48	  gint* foo = NULL; *foo = 0;
(gdb) bt
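#0  read_file_in_fiber (fiber=0x611800, data=0x610400, cancellable=0x0, 
    error=0x7ffff71e3fa8) at fibers.c:48
#1  0x00007ffff7856116 in run_fiber_func (data=<value optimized out>)
    at gfiber.c:152
#2  0x00007ffff78a4e85 in coro_init () at coro.c:91
#3  0x00000033b0c349a0 in ?? () from /lib64/libc.so.6
#4  0x0000000000000000 in ?? ()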

I'm sure people can fix the tools if we find bugs in them. It's not like GLib is going to be the first user of ucontext, for example GtkVNC uses this already and has for a long time.

> - code has nonobvious behavior
> g_fiber_yield() calls can actually change the whole program, including all
> variables you previously moved to the stack. This is not a problem in GC'ed
> languages, but it is in C. You will likely forget to ref all objects and copy
> all strings on the stack before calling yield and that'll SEGV your programs.

I don't understand what you are saying here. Each fiber has its own stack, there's no copying or anything going on.

> - cleanup
> It's complicated to clean up a fiber as you'll need to unwind the stack, so a
> nontrivial amount of code might run (possibly yielding again?) in a
> g_object_unref (fiber).

I don't understand this either; the GFiber object only contains data internal to the implementation and we destroy the execution context before the fiber runs. It's true that data set via g_object_set_data() on the fiber is freed when the fiber is disposed, but you would always use associations for data allocated on the heap, never the stack.

> Also, the example code looks very ugly.

That's subjective, I think the code is much nicer than what you'd have to write with callbacks. YMMV.

> I can agree though that fibers are an interesting idea in langauges that have
> proper memory management (like Vala or Javascript) and integrate their
> debugging tools properly with their toolkits. But in those cases fibers are
> language features. I certainly think they should not be bolted on top of C by a
> library as high-level as glib.
> 
> (I'm sorry if I'm overreacting a bit here, but I feel like the nightmares from
> my GStreamer maintenance past are haunting me again.)

No, this feedback is very useful. Thanks for your insights.

    David
Comment 23 Benjamin Otte (Company) 2009-06-11 15:14:39 UTC
(In reply to comment #22)
> gdb actually works just fine:
> 
> Program received signal SIGSEGV, Segmentation fault.
> read_file_in_fiber (fiber=0x611800, data=0x610400, cancellable=0x0, 
>     error=0x7ffff71e3fa8) at fibers.c:48
> 48        gint* foo = NULL; *foo = 0;
> (gdb) bt
> #0  read_file_in_fiber (fiber=0x611800, data=0x610400, cancellable=0x0, 
>     error=0x7ffff71e3fa8) at fibers.c:48
> #1  0x00007ffff7856116 in run_fiber_func (data=<value optimized out>)
>     at gfiber.c:152
> #2  0x00007ffff78a4e85 in coro_init () at coro.c:91
> #3  0x00000033b0c349a0 in ?? () from /lib64/libc.so.6
> #4  0x0000000000000000 in ?? ()
> 
The thing is that thr a a bt (gdb shorthand for thread apply all bt) will not show you the fibers that are currently not running, nor will it show you the main stack and where it currently resides, unlike what it does for threads.

> > - code has nonobvious behavior
> > g_fiber_yield() calls can actually change the whole program, including all
> > variables you previously moved to the stack. This is not a problem in GC'ed
> > languages, but it is in C. You will likely forget to ref all objects and copy
> > all strings on the stack before calling yield and that'll SEGV your programs.
> 
> I don't understand what you are saying here. Each fiber has it's own stack,
> there's no copying or anything going on.
>
My issue is with the expectations people have when reading the code. Yielding execution is a concept that is hard to wrap one's head around. People don't realize they need to ensure to keep a reference to all GObjects on the stack, as they might go away during a yield. (this is in fact already a problem with signals).
 
> > - cleanup
> > It's complicated to clean up a fiber as you'll need to unwind the stack, so a
> > nontrivial amount of code might run (possibly yielding again?) in a
> > g_object_unref (fiber).
> 
> I don't understand this either; the GFiber object only contains data internal
> to the implementation and we destroy the execution context before the fiber
> runs. It's true that data set via g_object_set_data() on the fiber is freed in
> when the fiber is disposed but always use associations for data allocated on
> the heap, never the stack.
> 
So you're basically saying "use different code conventions for code that runs in fibers"? Code like foo = malloc (size); g_input_stream_read_fibered (foo); process (foo); free (foo); will suddenly cause memleaks when the fiber is deleted while yielding in the read?
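
The scenario can be made concrete with a small ucontext-based sketch (assuming Linux/glibc; this is not GFiber code and all names are made up). The fiber allocates a buffer, yields where the fibered read would be, and is then torn down while suspended, so its free() never runs.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define LEAK_DEMO_STACK_SIZE (64 * 1024)

static ucontext_t leak_main_ctx;
static ucontext_t leak_fiber_ctx;
static int live_buffers;      /* buffers allocated but not yet freed */
static void *last_buffer;     /* kept only so the demo can clean up */

static void leak_fiber_func(void)
{
    char *foo = malloc(128);

    if (foo == NULL)
        return;
    live_buffers++;
    last_buffer = foo;

    /* Stands in for the yield inside a fibered read. */
    swapcontext(&leak_fiber_ctx, &leak_main_ctx);

    /* Never reached if the fiber is destroyed while suspended. */
    free(foo);
    live_buffers--;
}

/* Returns how many buffers were still live when the fiber was destroyed. */
int leak_demo_run(void)
{
    char *stack = malloc(LEAK_DEMO_STACK_SIZE);
    int leaked;

    if (stack == NULL)
        return -1;
    live_buffers = 0;
    last_buffer = NULL;
    if (getcontext(&leak_fiber_ctx) != 0) {
        free(stack);
        return -1;
    }
    leak_fiber_ctx.uc_stack.ss_sp = stack;
    leak_fiber_ctx.uc_stack.ss_size = LEAK_DEMO_STACK_SIZE;
    leak_fiber_ctx.uc_link = &leak_main_ctx;
    makecontext(&leak_fiber_ctx, leak_fiber_func, 0);

    swapcontext(&leak_main_ctx, &leak_fiber_ctx);   /* runs until the yield */

    /* "Destroy" the suspended fiber: drop its stack and never resume it. */
    free(stack);
    leaked = live_buffers;
    assert(leaked == 0 || last_buffer != NULL);

    free(last_buffer);   /* demo-only cleanup; a real fiber could not do this */
    last_buffer = NULL;
    return leaked;
}
```

Avoiding the leak would require either always resuming a cancelled fiber so it can run its own cleanup, or registering cleanup handlers with the fiber.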
Comment 24 David Zeuthen (not reading bugmail) 2009-06-11 15:30:27 UTC
(In reply to comment #23)
> The thing is that thr a a bt will not show you the fibers that are currently
> not running nor will it show you the main stack and where it's currently
> residing, as opposed to what they will do with threads.

Well, then someone needs to teach gdb about multiple execution contexts, e.g. proper support for ucontext (it's worth talking to the RH guys working on the Archer branch of gdb; we should do that).

We could add convenience functions such as g_fiber_get_all_fibers_in_thread() and g_fiber_get_all_fibers(). Then you can use these with g_fiber_resume() (which is in my local tree) to switch into the execution context and get the back trace.

Anyway, I don't think the quality (or lack of) of gdb should prevent us from adding fibers to GLib.

(I also wonder how well it works in Visual Studio; I'll let you know later today when I do the native Win32 port.)

> > > - code has nonobvious behavior
> > > g_fiber_yield() calls can actually change the whole program, including all
> > > variables you previously moved to the stack. This is not a problem in GC'ed
> > > languages, but it is in C. You will likely forget to ref all objects and copy
> > > all strings on the stack before calling yield and that'll SEGV your programs.
> > 
> > I don't understand what you are saying here. Each fiber has it's own stack,
> > there's no copying or anything going on.
> >
> My issue is with the expectations people have when reading the code. Yielding
> execution is a concept that is hard to wrap one's head around. 

Actually I think it's very natural...

> People don't
> realize they need to ensure to keep a reference to all GObjects on the stack,
> as they might go away during a yield. (this is in fact already a problem with
> signals).

I still don't understand this. Why would you need to keep a reference to an object when calling yield()? Can you give an example?

> > > - cleanup
> > > It's complicated to clean up a fiber as you'll need to unwind the stack, so a
> > > nontrivial amount of code might run (possibly yielding again?) in a
> > > g_object_unref (fiber).
> > 
> > I don't understand this either; the GFiber object only contains data internal
> > to the implementation and we destroy the execution context before the fiber
> > runs. It's true that data set via g_object_set_data() on the fiber is freed in
> > when the fiber is disposed but always use associations for data allocated on
> > the heap, never the stack.
> > 
> So you're basically saying "use different code conventions for code that runs
> in fibers"? Code like foo = malloc (size); g_input_stream_read_fibered (foo);
> process (foo); free (foo); will suddenly cause memleaks when the fiber is
> deleted while yielding in the read?

No, the coding conventions are not different because you would never ever use g_object_set_data() with anything allocated on the stack so there's no difference between fibers or normal execution contexts.

Maybe an example illustrating your thoughts would help here.

    David
Comment 25 Jürg Billeter 2009-06-11 15:43:24 UTC
(In reply to comment #22)
> (In reply to comment #21)
> > - portability issues
> > From reading this bug, it seems better now, but I vaguely remember the syscalls
> > having weird behaviors on various unices and that re
> 
> Yeah. If a particular Unix flavor is broken then I don't know what to do about
> that except for telling vendors to conform to POSIX.1-2001 (eight years old!).

It probably does not really matter, I just wanted to note that makecontext() and swapcontext() have been obsoleted by POSIX.1-2008, recommending to use POSIX threads instead.
Comment 26 Matthias Clasen 2009-06-11 16:36:16 UTC
> My issue is with the expectations people have when reading the code. Yielding
> execution is a concept that is hard to wrap one's head around.

I have a lot of sympathy for that sentiment. My head is still a little bent out of shape, too...


> The thing is that thr a a bt will not show you the fibers that are currently
> not running nor will it show you the main stack and where it's currently
> residing, as opposed to what they will do with threads.

Out of interest, how do tools on platforms with native fiber support (ie win32) handle this ?
Comment 27 David Zeuthen (not reading bugmail) 2009-06-11 23:42:53 UTC
Created attachment 136380 [details] [review]
GThread based approach

Hmm, maybe we can save ourselves a ton of headaches by implementing fibers on top of threads. Here's an unfinished (things can be simplified / optimized a bit more and it's lacking docs/examples) but working patch to do this (there are also more test cases). Notably libcoro is gone. 

This approach is based largely on http://git.gnome.org/cgit/gtk-vnc/tree/src/coroutine_gthread.c

I can't really decide if this is what we want; but I'm pretty sure it is since this will work on any platform now or in the future (including working gdb, valgrind and sysprof support out of the box)... and the GFiber API is unchanged.

Wrt TLS and alternate GMainContexts; with the proposed API in bug 579984, we can call g_main_context_push_thread_default() when entering the thread for a newly created fiber.

Thoughts?
Comment 28 Dan Winship 2009-06-11 23:53:57 UTC
In addition to TLS you could also have problems with recursive mutexes. It's going to be a very leaky abstraction. (That's probably true of the coroutine way too though...)

If the app is going to enable threads anyway, it seems like it would be easier to just use a GThreadPool, synchronous APIs, and g_simple_async_result_complete_in_idle(). Yeah, you have to deal with synchronization explicitly, but at least it's completely transparent what's going on.
Comment 29 Ray Strode [halfline] 2009-06-12 17:12:17 UTC
There was a small discussion about doing fibers in terms of threads on IRC a few days ago:

   [11:58:39]  <halfline> alexl,desrt: of course there's nothing that says
   gfiber has to use setcontext/getcontext/swapcontext ...
   [11:58:57]  <halfline> you could just use threads but force it to be
   serialized
   [11:59:52]  <desrt> seems a little silly to use threads with locks that
   prevent threading :)
   [12:00:23]  <desrt> interesting from an academic standpoint, though
   [12:01:16]  <halfline> it just gives you per-fiber storage
   [12:01:33]  <halfline> and you can leverage gthreads thread pools
   [12:01:45]  <halfline> and you don't have to allocate and manage your own
   stack
   [12:02:15]  <halfline> context switching is probably a little heavier though
   [12:03:02]  <desrt> definitely.
   [12:03:07]  <desrt> and less direct, too
   [12:03:14]  <desrt> with fibres you say "go here"
   [12:03:31]  <desrt> with threading you'd end up having to say "unlock this
   lock...." (implicitly: and run whoever was waiting)
   [12:05:33]  <alexl> halfline: gfiber on threads is just slowness. i don't
   see the point
   [12:05:52]  <alexl> it's nice as a fallback, but if you allow it you
   instantly deny fibers access to TLS
   [12:06:06]  <alexl> so, i would prefer it was guaranteed to never happen
   [13:21:19]  <halfline> alexl: i guess there are ups and downs to having
   shared tls for fibers
   [13:21:58]  <alexl> halfline: There are only two options, a) you can rely
   on TLS working in a fiber, or b) you can never rely on TLS working
   [13:22:12]  <alexl> i'd prefer a, but that precludes using threads as the
   basic operation
   [13:23:06]  <alexl> halfline: for instance, the "different main context for
   gio calls" api relies on TLS
   [13:23:30]  <halfline> ah okay
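
The TLS point from the log can be sketched as follows. This assumes C11 _Thread_local and POSIX threads; the variable is only a stand-in for real per-thread state such as a thread-default main context, and all names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for per-thread state such as a thread-default main context. */
static _Thread_local int current_context_id = 0;

/* Body of a fiber that happens to be backed by its own thread. */
static void *fiber_on_thread(void *seen)
{
    /* A different thread gets a fresh copy of the TLS variable. */
    *(int *) seen = current_context_id;
    return NULL;
}

/* Returns the TLS value observed inside the thread-backed fiber. */
static int tls_seen_by_thread_fiber(int caller_value)
{
    pthread_t thread;
    int seen = -1;

    current_context_id = caller_value;   /* set by the code that spawns the fiber */
    int rc = pthread_create(&thread, NULL, fiber_on_thread, &seen);
    assert(rc == 0);
    (void) rc;
    pthread_join(thread, NULL);
    return seen;
}
```

Whatever the caller stored is not visible inside the fiber, which sees the initial value instead. A thread-backed implementation would therefore have to re-establish such state explicitly when a fiber starts, along the lines of the g_main_context_push_thread_default() suggestion in comment 27.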

Comment 30 Christian Hergert 2009-06-12 21:04:45 UTC
I built a coroutine replacement, libiris[1], on top of GThread that can use a work-stealing scheduler for much faster work execution than GThreadPool. It also does not need the high number of threads that the locking model will require.

[1] http://git.dronelabs.com/iris
Comment 31 Alexander Larsson 2009-06-15 09:36:01 UTC
If you're using threads anyway, I don't think you really need the gfiber API. You're better off using blocking calls and g_io_scheduler_push_job(). That's what nautilus uses for its complicated copy operations that still do UI.
Comment 32 Stef Walter 2009-06-19 18:01:25 UTC
Cool to see work on this. I love using fibers. I've been using fibers for quite a while to simplify async code.

In gnome-keyring we use threads as fibers in many cases. In other projects I've used real fibers in GUI apps that have massive amounts of async IO.

Some of the problems I've run into: 

 - Windows will let multiple fibers access the same window, but not several 
   threads, even if you lock them to make sure only one is running at any given
   time. 

   Fibers as Threads completely breaks on Windows. Windows uses windows (lower
   case) in subtle stupid ways like for passing messages between applications, 
   even in 'non-gui' apps where you'd least expect it. 

 - Recursive locks interacting with the GLib Mainloop can deadlock an
   application when several mainloops are constructed (modal dialogs). Could 
   be fixed perhaps.

 - There are tons of assumptions in code (and compilers) that thread == stack. 
   On Windows, TLS is used all over the place :(  C++ exceptions are a problem 
   case.
   
I think if this code is made to be too lowest-common-denominator general-purpose, many problems will be encountered.
Comment 33 Allison Karlitskaya (desrt) 2010-09-14 04:06:11 UTC
just happened upon this today:

CONFORMING TO
       SUSv2, POSIX.1-2001.  POSIX.1-2008 removes the specification of
       getcontext(), citing portability issues, and recommending that
       applications be rewritten to use POSIX threads instead.
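
For context, this is roughly what a context-switch-based fiber looks like with the interfaces that note refers to. It is a minimal sketch that assumes a libc which still ships <ucontext.h> (glibc does); the names fiber_entry and run_ucontext_demo are hypothetical, and this is not the libcoro code from the earlier patches.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, fiber_ctx;
static char fiber_stack[64 * 1024];   /* the fiber needs its own stack */
static int steps[4];
static int n_steps;

static void fiber_entry(void)
{
    steps[n_steps++] = 1;
    swapcontext(&fiber_ctx, &main_ctx);   /* yield: jump straight back to main */
    steps[n_steps++] = 3;
    /* Returning from here resumes uc_link, i.e. main_ctx. */
}

/* Returns the execution order encoded as a decimal number. */
static int run_ucontext_demo(void)
{
    n_steps = 0;

    int rc = getcontext(&fiber_ctx);
    assert(rc == 0);
    (void) rc;
    fiber_ctx.uc_stack.ss_sp = fiber_stack;
    fiber_ctx.uc_stack.ss_size = sizeof fiber_stack;
    fiber_ctx.uc_link = &main_ctx;
    makecontext(&fiber_ctx, fiber_entry, 0);

    swapcontext(&main_ctx, &fiber_ctx);   /* run the fiber until its first yield */
    steps[n_steps++] = 2;
    swapcontext(&main_ctx, &fiber_ctx);   /* resume it until it returns */
    steps[n_steps++] = 4;

    return steps[0] * 1000 + steps[1] * 100 + steps[2] * 10 + steps[3];
}
```

Everything runs on one thread, so TLS and recursive locks behave as usual, but the code depends on exactly the interfaces that POSIX.1-2008 dropped.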
Comment 34 Marc-Andre Lureau 2013-11-26 17:31:09 UTC
FWIW, I am proposing a simpler API in bug 719362 (based on the QEMU implementation).