GNOME Bugzilla – Bug 661767
merge/improve various bits of run-in-thread functionality
Last modified: 2012-10-11 14:31:46 UTC
(spun off from bug 660740)

There are currently a bunch of ways to run something in another thread:

  - g_thread_new()
  - g_thread_pool_push()
  - g_io_scheduler_push_job()
  - g_simple_async_result_run_in_thread()

I've been thinking about this lately because for some GTls stuff, we need to turn non-cancellable blocking function calls (PKCS#11 stuff) into cancellable sync/async gio methods. There's code to do that in GThreadedResolver, but it would need to be abstracted out, and then we'd have *another* run-something-in-a-thread API...

So I was thinking, maybe it would be possible to do some merging and deprecating. GIOScheduler is really mostly just a GThreadPool now that the non-threaded case is gone, so it could pretty easily be deprecated.

For anything where you're doing an asynchronous gio operation (most uses of g_io_scheduler_push_job(), all uses of g_simple_async_result_run_in_thread(), and the async GThreadedResolver calls), it is convenient to have integration with GAsyncResult. One way to do that would be to push everything into GSimpleAsyncResult, but that class already has way too many slightly-different APIs, and adding the ability to do the synchronous-GThreadedResolver-like calls as well would make a complete mess of things. So I thought maybe we could make the new "GTask" be a GAsyncResult implementation too.

What could be improved over GSimpleAsyncResult?

  - Get rid of the crazy redundancy (e.g., GSimpleAsyncResult has 10 different methods that can be used to report an error)
  - Built-in handling of "task data" (separate from the callback data), so that you don't have to abuse _set_op_res_gpointer() or g_object_set_data() to pass data around
  - Store the io_priority, GCancellable, and GMainContext on the task object, so that ops that need to perform multiple sub-ops will have access to them at each step without needing to keep track of them themselves
  - Simplify the bits where all callers end up doing the same things. E.g., "check if there's an error and return it if so, otherwise get the pointer result and then fiddle with things so that it doesn't get freed when the async result is destroyed and then return it" becomes "return g_task_propagate_result_gpointer (task, error);"

All of that could be added to GSimpleAsyncResult too, of course, just at the cost of making it even more redundant and less simple...
Created attachment 199008 [details] some API thoughts I'd been thinking mostly about the GThreadedResolver case so far, so there may need to be changes for the normal GSimpleAsyncResult use cases still...
Created attachment 199009 [details] what gthreadedresolver.c ends up looking like
Created attachment 202438 [details] [review] GTask: new GAsyncResult implementation / threaded task manager GTask is a replacement for GSimpleAsyncResult and GIOScheduler, that also allows for making cancellable wrappers around non-cancellable functions (as in GThreadedResolver).
Created attachment 202439 [details] [review] GThreadedResolver: port to GTask
Created attachment 202440 [details] [review] gio: port a few GSimpleAsyncResult users to GTask, as a proof-of-concept === I am not sure how much of this we actually want to do... note that in some places we need to leave GSimpleAsyncResult references around anyway, since some classes assume their sub/superclasses use it.
Created attachment 202441 [details] [review] gio: port g_file_real_copy_async() from GIOScheduler to GTask We don't need an equivalent of g_io_scheduler_job_send_to_mainloop_async(), because we have g_main_context_invoke_full() as a more generic version of that now. AFAICT, the mainloop_barrier() in the original code was unnecessary, since the source created by g_simple_async_result_complete_in_idle() is going to get scheduled after all of the sources created by g_io_scheduler_job_send_to_mainloop() anyway.
Created attachment 202442 [details] [review] gio: port some g_simple_async_result_new_from_error(), etc, uses to GTask In general, g_simple_async_result_new_from_error() is only interesting if you're not creating the async result right away, which you shouldn't do anyway. So there's not much need for methods like those in GTask.
Created attachment 205367 [details] [review] GTask: new GAsyncResult implementation / threaded task manager GTask is a replacement for GSimpleAsyncResult and GIOScheduler, that also allows for making cancellable wrappers around non-cancellable functions (as in GThreadedResolver). === (In particular, see the "Porting from GSimpleAsyncResult" section of its gtk-docs.)
Created attachment 205368 [details] [review] GThreadedResolver: port to GTask
Created attachment 205369 [details] [review] gio: port a few GSimpleAsyncResult users to GTask, as a proof-of-concept
Created attachment 205370 [details] [review] gio: port g_file_real_copy_async() from GIOScheduler to GTask We don't need an equivalent of g_io_scheduler_job_send_to_mainloop_async(), because we have g_main_context_invoke_full() as a more generic version of that now. AFAICT, the mainloop_barrier() in the original code was unnecessary, since the source created by g_simple_async_result_complete_in_idle() is going to get scheduled after all of the sources created by g_io_scheduler_job_send_to_mainloop() anyway.
Created attachment 205371 [details] [review] gio: port some g_simple_async_result_new_from_error(), etc, uses to GTask In general, g_simple_async_result_new_from_error() is only interesting if you're not creating the async result right away, which you shouldn't do anyway. So there's not much need for methods like those in GTask.
Created attachment 205372 [details] [review] gio: port GIOScheduler to GTask So that the thread-pool, etc, code isn't duplicated in two places.
Created attachment 212209 [details] [review] GTask: new GAsyncResult implementation / threaded task manager GTask is a replacement for GSimpleAsyncResult and GIOScheduler, that also allows for making cancellable wrappers around non-cancellable functions (as in GThreadedResolver).
Created attachment 212210 [details] [review] GThreadedResolver: port to GTask
Created attachment 212211 [details] [review] gfile: remove some unnecessary code The "mainloop_barrier" in copy_async_thread() is unnecessary, since the g_simple_async_result_complete_in_idle() will be queued after all of the g_io_scheduler_job_send_to_mainloop_async()s anyway.
Created attachment 212212 [details] [review] GIOScheduler: port to GTask We don't need an equivalent of g_io_scheduler_job_send_to_mainloop_async(), because we have g_main_context_invoke_full() as a more generic version of that now.
Created attachment 212213 [details] [review] gio: deprecate GSimpleAsyncResult
Current status of the work. Further examples of porting existing APIs from GSimpleAsyncResult to GTask can be seen at http://git.mysterion.org/glib/commit/?h=task&id=8d34bc8d and http://git.mysterion.org/glib/commit/?h=task&id=6f3639c5. I'm not sure how much of that we'd want to commit right away; I was doing the porting mostly just to test out the API...

The only remaining problem I know of is that this implements the behavior of g_simple_async_result_set_check_cancellable() by default, but that means it would be impossible to use it to implement a g_output_stream_write() implementation, because the docs there explicitly state "If an operation was partially finished when the operation was cancelled the partial result will be returned, without an error." So there needs to be some way to avoid that, but the API needs to not be easily confused with the API that is currently called g_task_set_auto_cancel() (which is something different). One possibility is to have a GTaskCancellationBehavior flags type, with G_TASK_CANCELLATION_OVERRIDES (for the check_cancellable behavior) and G_TASK_CANCELLATION_INTERRUPTS (for the auto_cancel behavior). (This also leads to the question of whether the override/check-cancellable behavior actually ought to be the default or not.)
Created attachment 216326 [details] [review] GFile: remove some unnecessary code The "mainloop_barrier" in copy_async_thread() is unnecessary, since the g_simple_async_result_complete_in_idle() will be queued after all of the g_io_scheduler_job_send_to_mainloop_async()s, and sources with the same priority will run in the order in which they were queued.
Created attachment 216327 [details] [review] gio: handle GSimpleAsyncResult errors in _finish vmethods Originally, the standard idiom with GSimpleAsyncResult was to handle all errors in the _finish wrapper function, so that vmethods only had to deal with successful results. But this means that chaining up to a parent _finish vmethod won't work correctly. Fix this by also checking for errors in all the relevant vmethods. (We have to redundantly check in both the vmethod and the wrapper to preserve compatibility.)
Created attachment 216328 [details] [review] gio: Add g_async_result_legacy_propagate_error() Finish deprecating the "handle GSimpleAsyncResult errors in the wrapper function" idiom (and protect against future GSimpleAsyncResult deprecation warnings) by adding a "legacy" GAsyncResult method to do it in those classes/methods where it had been traditionally done. (This applies only to wrapper methods; in cases where an _async vmethod explicitly uses GSimpleAsyncResult, its corresponding _finish vmethod still uses g_simple_async_result_propagate_error.)
Created attachment 216329 [details] [review] gio: add g_async_result_is_tagged() Rather than doing a two step first-check-the-GAsyncResult-subtype-then- check-the-tag, add a GAsyncResult-level method so that you can do them both at once, simplifying the code for "short-circuit" async return values where the vmethod never gets called.
Created attachment 216330 [details] [review] GTask: new GAsyncResult implementation / threaded task manager GTask is a replacement for GSimpleAsyncResult and GIOScheduler, that also allows for making cancellable wrappers around non-cancellable functions (as in GThreadedResolver).
Created attachment 216331 [details] [review] GThreadedResolver: port to GTask
Created attachment 216332 [details] [review] gio: deprecate gioscheduler, soft deprecate GSimpleAsyncResult Reimplement gioscheduler in terms of GTask, and deprecate the original gioscheduler methods. Update docs to point people to GTask rather than gioscheduler and GSimpleAsyncResult, but don't actually formally deprecate GSimpleAsyncResult yet.
Created attachment 216333 [details] [review] gio: port everything else from GSimpleAsyncResult to GTask
Created attachment 216334 [details] [review] gio: deprecate GSimpleAsyncResult
OK, I think this is ready to go in... thoughts? As before, I'm not convinced we want to port all of gio to GTask right away... but the latest patch does now do that, as a proof of concept / API sanity test. (I'm assuming we'd also not commit the "deprecate GSimpleAsyncResult" patch right now if we don't port all of gio.)
Review of attachment 216326 [details] [review]: Seems right to me. I have no idea why I added that initially.
Impressive patches. Do you have testcases for GTask ?
(In reply to comment #32)
> Impressive patches.
> Do you have testcases for GTask ?

Well, "make check" still passes after the "port everything" patch. I started porting tests/simple-async-result.c, but it seemed silly since basically every test that uses any async function was acting as a test of GTask... But I guess there's nothing that tests g_task_set_return_on_cancel() (other than tests/resolver, which is not "make check"-able), and there are probably other bits (error cases, etc.) that don't get tested by any of the other tests. So yeah, I should do this.
Created attachment 216516 [details] [review] Add a GTask test, fix a few GTask bugs. (to be squashed)
Review of attachment 216327 [details] [review]: ok
Review of attachment 216328 [details] [review]: ok
Review of attachment 216329 [details] [review]: ok
Review of attachment 216330 [details] [review]:

::: gio/gtask.c
@@ +522,3 @@
+  source = g_main_current_source ();
+  if (source)
+    task->creation_time = g_source_get_time (source);

Why not just call g_get_monotonic_time() here? Do we not need creation_time to be set?
Overall diffstat: 70 files changed, 6211 insertions(+), 4812 deletions(-). But of the ~1400 net new lines, 1200 are tests.
(In reply to comment #38)
> +  source = g_main_current_source ();
> +  if (source)
> +    task->creation_time = g_source_get_time (source);
>
> Why not just call g_get_monotonic_time() here ? Do we not need creation_time to
> be set ?

We don't want the actual current time; it's taking advantage of the fact that g_source_get_time() returns a cached value that doesn't change during a single iteration of the GMainContext. Then in g_task_return() it does the same calculation and compares the two times, and if they're the same, it queues an idle handler to run the callback rather than invoking it directly. g_task_return() explains this, so maybe I should just put a comment in g_task_new() pointing there?
Running make check with --enable-gcov and this patch set yields:

  Lines:      268 / 281   95.4 %
  Functions:   48 / 50    96.0 %
  Branches:   107 / 147   72.8 %

The only missing functions are g_task_get_source_tag and g_task_get_user_data.
Created attachment 216590 [details] [review] more GTask tests, for 100% line coverage of gtask.c (to be squashed)
So maybe this is expanding the scope of this bug too much, but I wonder if we could improve the experience with asynchronous APIs even more. In particular there are these two cases:

* Composition: Implementing an asynchronous function that runs N async functions to do its job, and completes when they all complete
* Chaining: Implementing an asynchronous function that calls A, which on completion, needs to call B

And of course, one can have both simultaneously. The main issue I've found is that in composition, error handling is extremely tedious. You can just always call g_simple_async_result_take_error() if you want last-error-wins semantics. Which isn't bad, but I think it'd be better to have first-error-wins. We really lack nontrivial async API usage in tests/gio/.
(In reply to comment #43)
> So maybe this is expanding the scope of this bug too much, but I wonder if we
> could improve the experience with asynchronous APIs even more.

Well, making sure we're not going to have to add a similar-but-different API in the future to support these extra use cases is definitely good.

> * Composition: Implementing an asynchronous function that runs N async
>   functions to do its job, and completes when they all complete

Where have you needed to do this? I ran into this while glibifying some of the libnm-glib async APIs a few months ago. E.g., when initializing an object, you need to GetAll properties on each of its interfaces, and wait until they've all come back. (And if one of those properties is an object array, you need to create an object for each array element, and wait until they've all finished g_async_initable_init_async().) I ended up keeping a counter of how many ops were still pending, and then only actually calling g_simple_async_result_complete() once the counter reached 0. We could add some built-in counter to GTask to simplify this, maybe...

> * Chaining: Implementing an asynchronous function that calls A, which on
>   completion, needs to call B

This is already a bit easier in GTask because the task keeps track of the GMainContext, GCancellable, and priority for you, so you don't have to make a structure to keep track of them yourself. Are there any other things it could do to help out here? (Or specific bits of code you're thinking of that do this?)

> And of course, one can have both simultaneously. The main issue I've found is
> that in composition, error handling is extremely tedious. You can just always
> call g_simple_async_result_take_error() if you want last-error-wins semantics.
> Which isn't bad, but I think it'd be better to have first-error-wins.

Yeah... first-error-wins is what I needed in NM too. Hm... GTask currently has "last error wins and earlier errors get leaked". Oops.
But anyway, you're supposed to only be able to call g_task_return_* once anyway (since it does the equivalent of g_simple_async_result_complete() as well). It correctly enforces that for successful returns, but not for error returns.

This could easily be extended for the composition case: you call g_task_set_number_of_parts(task, 5) [needs a better name], and then you're allowed/required to call g_task_return_* 5 times, and the result of the last call is the actual return value, unless there was an error, in which case the first error wins. Does that seem right? (This would be a compatible change to the API, so we wouldn't need to do it right now.)
Review of attachment 216330 [details] [review]:

::: gio/gtask.c
@@ +949,3 @@
+  task->result_set = TRUE;
+
+  if (task->synchronous || !task->callback)

How would g_task_return() get called when task->synchronous is TRUE? Shouldn't this be a g_assert (!task->synchronous)?

@@ +1054,3 @@
+
+static void
+task_thread_cancelled (GCancellable *cancellable,

So one thing I discovered while improving GIOScheduler performance is that it goes out of its way to ensure cancelled tasks are scheduled first. I'm guessing this is to avoid having the application schedule even more work that may be unnecessary if it knew the associated operations were cancelled. If we port GIO to GTask, then the async API we include such as g_file_copy_async() etc. will silently lose this behavior. While it's unlikely an application hard-depends on this semantic, we should at least consider whether it makes sense to preserve the behavior or not.

@@ +1169,3 @@
+
+  if (!task->error)
+    g_cond_wait (&task->cond, &task->lock);

All use of g_cond_wait() should be inside a while loop.

@@ +1176,3 @@
+
+/**
+ * g_task_attach_source:

Hmm... can you explain the use of this for user code?
(In reply to comment #45)
> +  if (task->synchronous || !task->callback)
>
> How would g_task_return () get called when task->synchronous is TRUE?
> Shouldn't this be a g_assert (!task->synchronous) ?

You still need to call g_task_return_pointer(), etc., if you want to set a return value for the caller to extract. Plus, you want to be able to use the same GTaskThreadFunc for the sync and async variants of the API, where relevant. (E.g., see lookup_by_name() and lookup_by_name_async() in gthreadedresolver.c.) I should probably add an example of g_task_run_in_thread_sync() to the docs...

> So one thing I discovered while improving GIOScheduler performance is that it
> goes out of its way to ensure cancelled tasks are scheduled first.

Ah. I missed that while copying over the thread-pool-prioritizing stuff.

> All use of g_cond_wait() should be inside a while loop.

Oops.

> + * g_task_attach_source:
>
> Hmm... can you explain the use of this for user code?

When the async part of your task involves waiting for some GSource to trigger, this is just a shortcut for dealing with that. Search for it in the "port everything" patch; it simplifies a common pattern.
Review of attachment 216333 [details] [review]:

::: gio/gdbusconnection.c
@@ -1826,3 @@
-      g_simple_async_result_complete_in_idle (data->simple);
-      g_object_unref (data->simple);
-      data->simple = NULL;

So conceptually this function is no longer delivering... it's cleaning up.

@@ +1910,1 @@
+  if (g_task_return_error_if_cancelled (task)) {

Brace on a new line.

@@ +2128,3 @@
   g_return_val_if_fail (error == NULL || *error == NULL, NULL);

+  data.res = NULL;

I am confused why this code stores a reference to the GAsyncResult in a structure; it looks like all we do is ref it, then unref it. (This comment is unrelated to your patch.)

::: gio/gdbusprivate.c
@@ +141,2 @@
   if (result >= 0)
+    g_task_return_int (task, result);

It's mildly confusing that _int really means gssize.

@@ +182,3 @@
+                                         cancellable);
+  g_task_attach_source (task, source, (GSourceFunc) _g_socket_read_with_control_messages_ready);
+  g_source_unref (source);

Previously, we were only attaching a source if we couldn't read data immediately. Now it appears we do it every time (well, unless _g_socket_read_with_control_messages_ready() fails). Intentional?

@@ +397,3 @@
 {
+  if (close_data->task != NULL)
+    g_object_unref (close_data->task);

g_clear_object

::: gio/gdbusproxy.c
@@ +1611,3 @@
+                 _("Error calling StartServiceByName for %s: "),
+                 proxy->priv->name);
+      goto failed;

This looks like a bad merge - you lost the handling for org.freedesktop.systemd1.Masked.

::: gio/gfile.c
@@ -6489,3 @@
-      if (length)
-        *length = 0;
-      return FALSE;

Previously, we were setting *length to 0 in case of an error; your new code isn't anymore. I think we should preserve the old semantics for compatibility.

@@ +6604,3 @@
+
+  data->task = g_task_new (file, cancellable, callback, user_data);
+  g_task_set_task_data (data->task, data, (GDestroyNotify) replace_contents_data_free);

Random aside... why isn't there g_task_new_with_data()?
::: gio/gfileenumerator.c
@@ -732,1 @@
-  g_simple_async_result_set_handle_cancellation (res, FALSE);

We're losing this bit here; see the comment starting with:

  /* Auto handling of cancelation disabled, and ignore
     cancellation, since we want to close things anyway...

::: gio/goutputstream.c
@@ -1706,2 @@
-  g_simple_async_result_set_handle_cancellation (res, FALSE);
-

Also losing _set_handle_cancellation() here.

::: gio/gresolver.c
@@ +454,1 @@
   return g_list_append (NULL, g_object_ref (addr));

Hm... I thought g_task_propagate_pointer() should ref. Don't we need to drop the second g_object_ref() here?

::: gio/gtlsdatabase.c
@@ +335,3 @@
+  for (l = list; l; l = l->next)
+    g_object_unref (l->data);
+  g_list_free (list);

g_list_free_full()
Review of attachment 216332 [details] [review]: Looks fine to me.
Review of attachment 216331 [details] [review]: Honestly I'm just going to trust you on this one...
Review of attachment 216334 [details] [review]: Sure.
(In reply to comment #47)
> So conceptually this function is no longer delivering... it's cleaning up.

Yeah... most of the porting was quick-and-dirty, and I was only concerned with "is the API easy to use in this situation" and "do the tests still pass", not so much with "does this function name still make sense" :)

> +    g_task_return_int (task, result);
>
> It's mildly confusing that _int really means gssize.

The idea is that it's the generic "integer" return value type, which happens to be gssize because that's useful to us, but it's also a generic integer. Having "int" in the name works better for blatantly-non-size-like values like

  g_task_return_int (task, G_TLS_INTERACTION_HANDLED);

etc. (Although TBH I was thinking about adding _return/_propagate_enum for other reasons anyway. Still, there are other integer return values that are not "sizes".) Maybe "int64" (in both the function names and the argument types) would be better?

> Previously, we were only attaching a source if we couldn't read data
> immediately. Now it appears we do it every time (well, unless
> _g_socket_read_with_control_messages_ready() fails).
>
> Intentional?

Ugh. The original code is buggy, because it assumes that if the socket is readable, then g_socket_receive_message() MUST succeed, which isn't true. (Well... maybe it is for unix sockets, actually. But it's still a misuse of the GSocket API.) It looks like I started fixing that, but then forgot to add the part where _g_socket_read_with_control_messages_ready() returns TRUE if it gets G_IO_ERROR_WOULD_BLOCK, and so it's still buggy. Given that _ready() always returns FALSE, the old and new versions are identical (just with the sense of the if() reversed, and corresponding code rearrangement for that).

> Previously, we were setting *length to 0 in case of an error; your new code
> isn't anymore. I think we should preserve the old semantics for compatibility.

Yeah, the patch was supposed to be pretty strongly semantics-preserving. But see above re "quick-and-dirty". There are probably bugs in places that gio/tests/ doesn't look... (Actually, I already know there's a bug in the version of GProxyAddressEnumerator in this patch that I only discovered after trying to run glib-networking against the gtasky glib.) I think my theory on this patch was that we wouldn't commit it, but I'd point people to it for when/if they wanted to port the parts of gio that they maintain, and they could use it as a starting point, being sure to sanity-check it first.

> Random aside... why isn't there g_task_new_with_data()?

Originally the task data and its GDestroyNotify (and the IO priority too) were arguments to g_task_new(), but it made g_task_new() calls really complicated, and only about half the GTask users needed task data anyway. So I split it out. I didn't really think about g_task_new_with_data(), but we could add that...

> -  g_simple_async_result_set_handle_cancellation (res, FALSE);
>
> We're losing this bit here; see the comment starting with:
>
>   /* Auto handling of cancelation disabled, and ignore
>      cancellation, since we want to close things anyway...

g_simple_async_result_run_in_thread()'s default behavior is that if the cancellable is cancelled before the thread gets queued up, then it never calls your thread_func. set_handle_cancellation(FALSE) changes that. But g_task_run_in_thread() always behaves the FALSE way, so there's no need for any call corresponding to that here. In other words, it's *every other user of g_simple_async_result_run_in_thread()* that I changed the semantics of... (though generally not very much; once the thread func starts doing whatever it does, something should notice that the cancellable is cancelled and return an error, and then the thread will return). Where necessary, you can get the GSimpleAsyncResult-like behavior by starting the thread_func with:

  if (g_task_return_error_if_cancelled (task))
    return;

Anyway, another thing to add to the "porting from GSimpleAsyncResult" docs...

> return g_list_append (NULL, g_object_ref (addr));
>
> Hm... I thought g_task_propagate_pointer() should ref. Don't we need to drop
> the second g_object_ref() here?

Yes
Current state at http://git.mysterion.org/glib/log/?h=task&showmsg=1, which is mostly just patches on top of the patches here, to make it clear what's changed. (However, it's also rebased to master, so the GIOScheduler patch is now different, to take Colin's changes there into account.)

Pending issues from the above reviews:

* how much porting to commit along with it, and whether/how much to deprecate GSimpleAsyncResult
* API changes/additions for "composition" (multiple simultaneous async ops) -- I argued in comment 44 that we can add this later, though I'd still like to hear Colin's use case, and whether he thinks my suggestion there would work for it
* rename g_task_return_int() to return_size() or return_int64()?
* g_task_new_with_data()? (could be added later)

Oh, also, the new test_run_in_thread_priority() test is occasionally flaky, because it's hard to pin down GThreadPool enough to force it to schedule things how we want, and I can't see any way to more reliably implement "add dummy tasks until the thread pool is full" without introducing race conditions when running on slow/busy machines. It's possible we could fix it just by making it the first threaded test...
Attachment 216326 [details] pushed as a98d26c - GFile: remove some unnecessary code
Attachment 216327 [details] pushed as 538b2f1 - gio: handle GSimpleAsyncResult errors in _finish vmethods
Attachment 216328 [details] pushed as f8532a1 - gio: Add g_async_result_legacy_propagate_error()
Attachment 216329 [details] pushed as 82d914d - gio: add g_async_result_is_tagged()
(Sorry for kinda hijacking this bug, but since you are changing all this...)

What I'd really like is the ability to set the maximum number of worker threads allowed at the same time [1] - it's currently hard-coded to 10 (!); see gioscheduler.c's init_scheduler() function. Can we please get a way to get at the underlying GThreadPool, or at least to change this maximum number of threads allowed at the same time? With GIOScheduler, I guess it would be something like

  GThreadPool *g_io_scheduler_get_thread_pool (void);

with the new stuff, I'm not so sure. Dan?

[1]: Here's why I want it: I use GDBus' HANDLE_METHOD_INVOCATIONS_IN_THREAD facility extensively in udisks (makes life very good) and some operations can take a very long time to complete; for example, ATA Secure Erase can easily take 6 hours for a 2TB hard disk. So with the current limit, I can only run 10ish operations at the same time (it's not impossible you want to erase all 20 disks in an enclosure at the same time), and once that limit is reached no other GIO operation (not even GFile methods) can run for another six hours...

Obviously, in udisks I just want to set the max number of worker threads to 0 (e.g. unlimited). I think that may even make sense for GIO itself; there's really no point in trying to protect the application from itself... and if the app needs protection (nautilus?), it's probably better if the app itself sets the limit... Alex?
(In reply to comment #52)
> Current state at http://git.mysterion.org/glib/log/?h=task&showmsg=1, which is
> mostly just patches on top of the patches here, to make it clear what's
> changed. (However, it's also rebased to master, so the GIOScheduler patch is
> now different, to take Colin's changes there into account.)
>
> Pending issues from the above reviews:
>
> * how much porting to commit along with it, whether/how much to
>   deprecate GSimpleAsyncResult

Note we don't need to make the porting all or nothing - clearly the GResolver port should go in. Maybe split up the porting patch?

> * API changes/additions for "composition" (multiple simultaneous
>   async ops) -- I argued in comment 44 that we can add this later,
>   though I'd still like to hear Colin's use case, and whether he
>   thinks my suggestion there would work for it.

We can add this later (if at all); I thought about it a bit and couldn't figure out a way to improve things at the GTask level. However, changing GTask to first-error-wins would be very convenient, particularly if the application cancels composed tasks on the first error.
(In reply to comment #54)
> I'd really like the ability to set the maximum number of worker threads
> allowed at the same time [1] - it's currently hard-coded to 10 (!)

Yeah, that's definitely a little weird. Also, this:

  /* It's kinda weird that this is a global setting
   * instead of per threadpool. However, we really
   * want to cache some threads, but not keep around
   * those threads forever. */
  g_thread_pool_set_max_idle_time (15 * 1000);
  g_thread_pool_set_max_unused_threads (2);

> Obviously, in udisks I just want to set the max number of worker threads to 0
> (e.g. unlimited). I think that even may make sense for GIO itself, there's
> really no point in trying to protect the application from itself... and if the
> app needs protection (nautilus?), it's probably better if the app itself sets
> the limit... Alex?

But maybe the app needs protection from a library? Or one library from another? But I guess at the moment we sometimes have anti-protection, since in cases like you describe, one part of the program can monopolize all available threads. (Another example: webkit attempts to pre-resolve all hostnames seen in links on a page, which can easily be more than 10. If your DNS server is slow / dropping packets / etc., you could clog up the GTask thread pool with GResolver tasks waiting to time out.)

I'd like to hear what Alex has to say about all this too...
(In reply to comment #56)
> But maybe the app needs protection from a library? Or one library from another?
> But I guess at the moment we sometimes have anti-protection, since in cases
> like you describe, one part of the program can monopolize all available
> threads.

These are all good points, especially the webkit one... Totally thinking out loud, maybe we want the API to be something like:

  static GTaskPool *my_pool = NULL;

  g_task_pool_get_or_init (&my_pool, max_threads, max_idle_time, max_unused_threads);
  g_task_pool_push_thread_default (my_pool);
  /* do some work */
  g_task_pool_pop_thread_default (my_pool);

i.e. the concept of a thread-default task pool. Maybe I'm overthinking it...
The hard-coded max of 10 is a problem, I agree. The intention probably was to avoid spawning too many threads when a lot of tasks appear at the same time ? I don't think our threadpool currently offers a good way to say "try to stick with 10 threads, but if all 10 are occupied for too long, you're ok to allocate another 10".
(In reply to comment #58) > The hard-coded max of 10 is a problem, I agree. > The intention probably was to avoid spawning too many threads when a lot of > tasks appear at the same time ? I don't think our threadpool currently offers a > good way to say "try to stick with 10 threads, but if all 10 are occupied for > too long, you're ok to allocate another 10". It doesn't. To do that we'd have to add a GThreadPool-monitoring thread to notice when that was happening. (I guess that could happen in the glib worker thread.) How about we just bump GTask's thread pool max_size up to 100 instead of 10, and think more about GThreadPool later? (Also, we'd talked before about getting Alex's input, but he's on vacation until a week before freeze, so I don't think we can wait for that if we want this to get in for 3.6.)
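To make the clogging concern from comment 56 concrete: here is a toy Python sketch, using concurrent.futures.ThreadPoolExecutor as a stand-in for GThreadPool, with the limit shrunk from 10 to 3 for brevity. All the names (slow_dns_lookup etc.) are invented for illustration; this is not GLib code. Once one client's slow tasks occupy every slot, an unrelated task cannot even start until one of them finishes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 3              # stand-in for the hard-coded limit
pool = ThreadPoolExecutor(max_workers=POOL_SIZE)
release = threading.Event()

def slow_dns_lookup(host):
    release.wait()         # simulates a lookup stuck waiting to time out
    return host

# One client (webkit-style prefetching) fills every slot...
slow = [pool.submit(slow_dns_lookup, "host%d" % i) for i in range(POOL_SIZE)]

# ...so an unrelated task is queued behind them and cannot even start.
quick = pool.submit(lambda: "done")
starved = not quick.done() # True: no worker is free to run it

release.set()              # let the "lookups" finish
result = quick.result()    # only now does the quick task run
```

Bumping the limit from 10 to 100 moves the cliff further out but doesn't remove it, which is why the per-pool-configuration discussion above matters.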
Created attachment 219650 [details] [review] gthreadpool: set default max_unused_threads and max_idle_time values GThreadPool defaulted to 0 for max_unused_threads (meaning thread-pool threads would exit immediately if there was not already another task waiting for them), and 0 for max_idle_time (meaning unused threads would linger forever, though this is only relevant if you changed max_unused_threads). However, GIOScheduler changed the global defaults to 2 and 15*1000, respectively, arguing that these were more useful defaults. And they are, so let's use them. ========== I think this makes sense... this would get rebased into the beginning of the patchset.
Created attachment 219652 [details] [review] Add, use g_main_current_context ==== another fix; g_source_get_context(g_main_current_source()) doesn't work if the source has been destroyed already, so add g_main_current_context()
Comment on attachment 219652 [details] [review] Add, use g_main_current_context >+ /* Note that we (or someone else) can't just call >+ * g_source_get_context() here, because that will fail if the source >+ * is already destroyed (since it's possible in that case that the >+ * context has already been destroyed). But if there's a *current* >+ * source, we know its context must be live. >+ */ although the same logic applies to g_source_get_time(), which doesn't currently have a g_return_val_if_fail(!SOURCE_DESTROYED(source)), but maybe it should... so maybe I'd need g_main_context_get_time() too... (or just have some other way to do the hack I'm doing here, to figure out if we're still in the same iteration of the main loop that the task was created in)
Created attachment 219884 [details] [review] glib/tests/mainloop: test g_source_get_time() Verify that - g_source_get_time() does not change within a single callback (even if the real time does) - g_source_get_time() does not change between different callbacks in the same mainloop iteration - g_source_get_time() does change between iterations if the real time did.
Created attachment 219885 [details] [review] gmain: allow g_source_get_context() on destroyed sources g_source_get_context() was checking that the source wasn't destroyed (since a source doesn't hold a ref on its context, and so source->context might point to garbage in that case). However, it's useful to be allowed to call g_source_get_context() on a source that is destroyed-but-currently-running. So instead, let g_source_get_context() return the context whenever it's non-NULL, and clear the source->context of any sources that are still in a context's sources list when the context is freed. Since sources are only removed from the list when the source is freed (not when it is destroyed), this means that now whenever a source has a non-NULL context pointer, then that pointer is valid. This also means that g_source_get_time() will now return-if-fail rather than crashing if it is called on a source whose context has been destroyed. Add tests to glib/tests/mainloop to verify that g_source_get_context() and g_source_get_time() work on destroyed sources.
Review of attachment 219650 [details] [review]: Do we know other users of threadpools that might be affected by this ? I'm generally ok with the idea.
Review of attachment 219884 [details] [review]: moar tests, great
Review of attachment 219885 [details] [review]: Looks harmless enough to me
(In reply to comment #67) > Do we know other users of threadpools that might be affected by this ? > I'm generally ok with the idea. Theoretically, the only observable difference should be a slight speedup (because of fewer g_thread_new() calls) and some extra memory usage (because of idle threads). A program would have to be somewhat pathological to be able to behave differently as a result of this. Of course, any app using gio is probably already seeing these default values anyway, so all this only applies to people not using gio.
Review of attachment 219650 [details] [review]: Alright then
This is now in the wip/task branch on git.gnome.org, with the porting split up into multiple patches. I'd suggest including the tests, GAsyncInitable, and networking ports in the original merge (after I sanity-check them again), and leaving the others for later.
Sounds good to me
(In reply to comment #43)
> So maybe this is expanding the scope of this bug too much, but I wonder if we
> could improve the experience with asynchronous APIs even more.
>
> In particular there are these two cases:
>
> * Composition: Implementing an asynchronous function that runs N async
> functions to do its job, and completes when they all complete
> * Chaining: Implementing an asynchronous function that calls A, which on
> completion, needs to call B

I didn't read the entire bug, sorry if somebody mentioned this, but it's probably good to read prior art for this. The other thing that I think should be included is some definition of "progress", so you can have intermediate states between "started" and "finished".

The standard thing that everybody seems to rely on is Deferred/Promise, which was originally mostly stolen from the Twisted networking framework for Python. Of course, we don't have all the flexibility of Python in C, but I think we can still approach composition and chaining.

The point of a Deferred is just a handle for a potential operation that you can pass around. You stick "completed" handlers directly on the Deferred object. In Python:

  def pageFetched(data):
      print data
      return data

  d = getPage("http://example.com")
  d.addCallback(pageFetched)

One interesting thing is that you can attach multiple callbacks, and the data returned from a callback will be passed along:

  def getTitle(data):
      start = data.find("<title>") + len("<title>")
      return data[start:data.find("</title>")]

  def printTitle(title):
      print title

  d = getPage("http://example.com")
  d.addCallback(getTitle)
  d.addCallback(printTitle)

This means that you can compose a complex asynchronous operation out of a large asynchronous operation and multiple smaller ones. But if a callback returns a Deferred, the entire operation waits for that, too, passing the result of that Deferred on to the next callback.
  # library_code.py
  def gotSecretAPIToken(token):
      d = getPage("http://example.com/grabAThing/?token=%s" % token)
      return d

  def grabAThing():
      d = getPage("http://example.com/getSecretAPIToken")
      d.addCallback(gotSecretAPIToken)
      return d

  # user_code.py
  def gotAThing(thing):
      print thing

  d = grabAThing()
  d.addCallback(gotAThing)

So, here we are, chaining: the gotAThing Deferred's callback will only be fired when both pages are fetched. Now, composition can follow the same pattern of polymorphism:

  def gatherResults(deferreds):
      parentD = Deferred()
      results = [None] * len(deferreds)
      nFinished = [0]  # a list, so the closure below can mutate it

      def deferredFinished(data, index):
          results[index] = data
          nFinished[0] += 1
          if nFinished[0] == len(deferreds):
              parentD.callback(results)

      for i, childD in enumerate(deferreds):
          childD.addCallback(deferredFinished, i)
      return parentD

  # ...

  def gotPages(results):
      example, fdo, gnome = results
      print example, fdo, gnome

  d = gatherResults([getPage("http://example.com"),
                     getPage("http://freedesktop.org"),
                     getPage("http://gnome.org")])
  d.addCallback(gotPages)

(Sorry for the long reply, but these problems have been solved before. Just want to make sure you're aware of them)
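For reference, the mechanism described in the last two comments can be sketched in a few dozen lines. This is a hypothetical, minimal Deferred, not Twisted's actual implementation: callbacks form a chain, each one receiving the previous result, and a callback that returns another Deferred pauses the chain until the inner one fires.

```python
class Deferred:
    """Minimal sketch of a Twisted-style Deferred (illustrative only)."""

    def __init__(self):
        self.callbacks = []   # queued (fn, extra_args) pairs
        self.fired = False
        self.result = None

    def addCallback(self, fn, *args):
        self.callbacks.append((fn, args))
        if self.fired:        # already have a result: run immediately
            self._run()

    def callback(self, result):
        self.fired = True
        self.result = result
        self._run()

    def _run(self):
        while self.callbacks:
            fn, args = self.callbacks.pop(0)
            self.result = fn(self.result, *args)
            if isinstance(self.result, Deferred):
                # A callback returned a Deferred: pause this chain
                # until the inner Deferred fires (the chaining case).
                inner, self.result = self.result, None
                self.fired = False
                inner.addCallback(self._resume)
                return

    def _resume(self, result):
        self.fired = True
        self.result = result
        self._run()
        return result

# Result passing, as in the getTitle/printTitle example:
titles = []
d = Deferred()
d.addCallback(lambda data: data[data.find("<title>") + 7:data.find("</title>")])
d.addCallback(lambda title: titles.append(title) or title)
d.callback("<html><title>Example</title></html>")

# Chaining: d2's second callback waits until `inner` fires.
log = []
inner = Deferred()
d2 = Deferred()
d2.addCallback(lambda r: inner)
d2.addCallback(lambda r: log.append(r))
d2.callback("start")      # log is still empty at this point
inner.callback("token")   # now the chain resumes
```

Errbacks, progress notification, and cancellation are omitted; the point is only how chaining and result passing fall out of one small object.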
(In reply to comment #73) > > I didn't read the entire bug, sorry if somebody mentioned this, but it's > probably good to read prior art for this. The other thing that I think should > be included is some definition of "progress", so you can have intermediate > states between "started" and "finished". I think the basic problem with anything more ambitious here at the moment is that we're constrained by the existing async API pattern. If we do settle on something else like promises, perhaps one approach would be to generate C code implementing it derived from the existing async APIs. But we shouldn't block danw's really huge and also really awesome cleanup patch on this.
(In reply to comment #74) > But we shouldn't block danw's really huge and also really awesome cleanup patch > on this. FTR, mclasen and I agreed this morning that since there's nothing currently blocking on this, we'll wait until after 3.6 to land it, so that we then have the whole 3.8 cycle to port things to it and possibly change things.
Not saying we should, and it's a really great cleanup, it's just that I'm afraid that if we want promises we're going to deprecate GTask as well.
*** Bug 683376 has been marked as a duplicate of this bug. ***
Typo alert: http://git.gnome.org/browse/glib/tree/gio/gtask.c?h=wip/task#n1281 s/nonw/none/
Just catching up on this progress and I have a few things I'd like to add. I wrote a library called gtask[1] a few years ago that tried to address some of the same issues. One thing I quite liked about what went into that library was the ability to compose task execution trees, so you can have tasks that are dependent on other tasks completing. For example:

  Task 1
    Task 2
    Task 3
      Task 4

So that when Task 1 completes, 2 and 3 may start. When 3 completes, 4 can start. If any of them fail, the dependent tasks also fail. This is particularly handy for breaking computational tasks into smaller chunks (allowing for parallelism when the CPU allows). For example, Task 1 might be fetching a URL, and 2 and 3 the saving and indexing of that document (in parallel).

I tried a second time to build this in a simpler way, libiris[2], but it also got little traction. And most recently, as an attempt to make it as simple as possible, here[3].

Anyway, I'm not proposing that we use any of my code at all, just that I would like to see task execution trees as an important part of any task abstraction.

[1] https://github.com/chergert/gtask
[2] https://github.com/chergert/iris
[3] https://github.com/chergert/catch-glib/blob/master/catch-glib/catch-task.h
(In reply to comment #79) > Anyway, not proposing that we use any of my code at all, just that I > would like to see task execution trees as an important part of any > task abstraction. Right, so the problem with this is that in gio-style async APIs, the caller doesn't receive any sort of handle on the async task, so there's no way it could then call something like catch_task_add_dependency(). (No, we're not adding a duplicate version of every single gio async API. :-) There's a bit of discussion above (comment 43 and 44) about how we might do task trees in GTask... we can make the parallel-execution case a bit easier, but there's no obvious way to simplify sequential execution, since our APIs provide no way to express an intention to perform an asynchronous operation without actually starting to perform it. (Awful idea number 1: push a new thread-default GMainContext before calling g_file_replace_async(), then pop it and don't start it running again until you're ready to write the file.) (Awful idea number 2: Add g_cancellable_pause() / g_cancellable_unpause().) Non-awful idea: just use sync ops in a thread... (assuming there are no thread-safety issues with the objects involved) (Idea of ambiguous awfulness: Fibers? (bug 565501)) Can you point me to some real-world code that uses CatchTask/IrisTask/(your)GTask non-trivially?
(In reply to comment #80) > Non-awful idea: just use sync ops in a thread... (assuming there are no > thread-safety issues with the objects involved) And things like G_DBUS_INTERFACE_SKELETON_FLAGS_HANDLE_METHOD_INVOCATIONS_IN_THREAD [1] actually promote this style - see udisks2 for an example where it's used in a lot of places. [1] : longest define in GLib? :-)
(In reply to comment #80) > Right, so the problem with this is that in gio-style async APIs, the caller > doesn't receive any sort of handle on the async task, so there's no way it > could then call something like catch_task_add_dependency(). (No, we're not > adding a duplicate version of every single gio async API. :-) Agreed. This was one of the things that I was disappointed to see when the Gio stuff was released. I also agree that we shouldn't make additional functions to rectify this. > There's a bit of discussion above (comment 43 and 44) about how we might do > task trees in GTask... we can make the parallel-execution case a bit easier, > but there's no obvious way to simplify sequential execution, since our APIs > provide no way to express an intention to perform an asynchronous operation > without actually starting to perform it. So in CatchTask, I took the approach that the task was the GAsyncResult. While it doesn't let external systems add dependencies to your tasks easily, it does allow developers to build task trees internally to their API. That might be the safer design anyway.
(In reply to comment #80) > (In reply to comment #79) > > Anyway, not proposing that we use any of my code at all, just that I > > would like to see task execution trees as an important part of any > > task abstraction. > > Right, so the problem with this is that in gio-style async APIs, the caller > doesn't receive any sort of handle on the async task, so there's no way it > could then call something like catch_task_add_dependency(). (No, we're not > adding a duplicate version of every single gio async API. :-) It'd be pretty ugly, but we could add something like GTask *g_async_op_get_last_handle (void); as a thread-local.
(In reply to comment #82)
> (In reply to comment #80)
> > Right, so the problem with this is that in gio-style async APIs, the caller
> > doesn't receive any sort of handle on the async task, so there's no way it
> > could then call something like catch_task_add_dependency(). (No, we're not
> > adding a duplicate version of every single gio async API. :-)
>
> Agreed. This was one of the things that I was disappointed to see when the Gio
> stuff was released. I also agree that we shouldn't make additional functions to
> rectify this.

Well, changing

  void
  foo_bar_async (Foo                 *foo,
                 GCancellable        *cancellable,
                 GAsyncReadyCallback  callback,
                 gpointer             user_data);

to

  GTaskInstance *   /* or whatever */
  foo_bar_async (Foo                 *foo,
                 GCancellable        *cancellable,
                 GAsyncReadyCallback  callback,
                 gpointer             user_data);

isn't an ABI nor an API break - as long as we don't expect the caller to free the instance. You could say, for example, that the GTaskInstance* object is only valid to access until @callback has been called...

My EggDBus work did this by returning a guint, see e.g.

  http://people.freedesktop.org/~david/eggdbus-20091014/EggDBusConnection.html#egg-dbus-connection-send-message-with-reply
  http://people.freedesktop.org/~david/eggdbus-20091014/EggDBusConnection.html#egg-dbus-connection-pending-call-cancel
  http://people.freedesktop.org/~david/eggdbus-20091014/EggDBusConnection.html#egg-dbus-connection-pending-call-block
(In reply to comment #84) > Well, changing > > void > to > GTaskInstance * // or whatever > > isn't an ABI nor an API break IIRC, it will cause gtkmm to have an ABI break, because the wrapped C++ version of the function will suddenly have a different mangled name. (In reply to comment #83) > It'd be pretty ugly, but we could add something like > > GTask *g_async_op_get_last_handle (void); > > as a thread-local. That would be tricky though, since some async ops chain into other async ops, so they'd have to explicitly "g_async_op_set_last_handle()" to get it back to the right thing after doing that. GTask *g_async_op_get_for_callback (GAsyncReadyCallback, gpointer); would probably work better. But it's still ugly. Maybe we could have a less ugly version that just handled the task-chaining case?
(In reply to comment #85) > Maybe we could have a less ugly version that just handled the task-chaining > case? If the tasks are responsible for knowing when they are ready to be scheduled for execution, then it is fairly trivial to build subclasses for cases like "All Of N tasks must be completed", or "Any of N tasks must be completed", etc.
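A hypothetical, GLib-free Python sketch of that scheduling rule (all names here are invented for illustration): a task runs once every dependency has succeeded, and fails automatically, without ever running, if any dependency fails. For brevity ready tasks run synchronously; a real implementation would hand them to a thread pool instead.

```python
class TreeTask:
    """Toy task with dependencies (illustrative only)."""

    def __init__(self, fn, depends_on=()):
        self.fn = fn
        self.result = None
        self.done = False
        self.failed = False
        self.pending = set(depends_on)      # deps not yet finished
        self.dependents = []
        for dep in depends_on:
            dep.dependents.append(self)

    def start(self):
        if not self.pending:
            self._run()

    def _run(self):
        try:
            self.result = self.fn()
        except Exception:
            self.failed = True
        self.done = True
        for t in list(self.dependents):
            t._dep_finished(self)

    def _dep_finished(self, dep):
        if dep.failed:                       # failure cascades down
            self.failed = self.done = True
            for t in list(self.dependents):
                t._dep_finished(self)
            return
        self.pending.discard(dep)
        if not self.pending and not self.done:
            self._run()                      # all deps succeeded: go

# The tree from comment 79: 1 -> (2, 3), 3 -> 4
order = []
task1 = TreeTask(lambda: order.append(1))
task2 = TreeTask(lambda: order.append(2), depends_on=[task1])
task3 = TreeTask(lambda: order.append(3), depends_on=[task1])
task4 = TreeTask(lambda: order.append(4), depends_on=[task3])
task1.start()                               # runs all four

# Failure propagation: a child of a failing task never runs.
bad = TreeTask(lambda: 1 / 0)
child = TreeTask(lambda: order.append(99), depends_on=[bad])
bad.start()
```

The "Any of N" variant mentioned above would just change the readiness test in _dep_finished from "pending is empty" to "at least one dep succeeded".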
Pushed the base code and some of the porting. I'll push more porting later on...