GNOME Bugzilla – Bug 659326
implement g_app_launch_context_get_environment for win32
Last modified: 2018-05-24 13:23:03 UTC
We currently do two things a lot in GLib these days: - make lots of worker threads - call gettext() in them The second happens for seemingly innocent reasons like accessing a type macro for the first time, causing a class_init function to be called which uses _() on its property strings. gettext is not threadsafe. In particular, _() can indirectly result in calls to getenv() which is not threadsafe. We're starting to see some crash traces showing up on this codepath. We need to take a look at what we do in GLib (and probably other libraries) to avoid calling gettext from any of our worker threads.
(In reply to comment #0) > gettext is not threadsafe. In particular, _() can indirectly result in calls > to getenv() which is not threadsafe. We're starting to see some crash traces > showing up on this codepath. got any bug numbers? > We need to take a look at what we do in GLib (and probably other libraries) to > avoid calling gettext from any of our worker threads. That seems like a complete non-starter. Eg, this would basically mean you couldn't throw GErrors in worker threads. And forbidding all uses of getenv() in other threads is even worse. I think we need to treat this as an upstream bug (in getenv, not gettext). And/or, figure out why people are calling setenv, and make them stop, since they're probably creating other race conditions as well.
One thing we _could_ do to mitigate the problem is to make g_[gs]etenv threadsafe, and have the glib_gettext wrappers take the same lock. That would probably largely solve the problem in core glib, but, of course, people can and will use direct libc calls, so the problem remains.
https://launchpadlibrarian.net/76187233/ThreadStacktrace.txt is the stacktrace that started this discussion
We may have to audit all of our code for setenv() calls not early in main(). One example that pops up is: gdk/tests/display.c: g_setenv ("DISPLAY", "poo", TRUE); Annotate shows: fe687e76 (Matthias Clasen 2011-01-18 09:36:24 -0500 42) g_setenv ("DISPLAY", "poo", TRUE); So we know who to blame =)
Right, I think we just need to outlaw setenv(), basically. There is no thread-safety issue with getenv() in the absence of setenv calls. Compare http://austingroupbugs.net/view.php?id=188
Created attachment 197503 [details] [review] Add thread-safety warnings to the g_setenv() and g_unsetenv() docs ====== we could potentially also make g_thread_create() set some global flag indicating that there is at least one other thread now, and have g_setenv()/g_unsetenv() check that and do a g_warning()... (Also, bug 659145 is an example of this problem. Epiphany crashes on startup maybe 1 time in 20 or so.)
Review of attachment 197503 [details] [review]: ::: glib/gutils.c @@ +1338,3 @@ + * the very start of your program, before creating any other threads + * (or creating objects that create worker threads of their own). + * </para></warning> Might be good to also point out that some widely used APIs such as gettext() use getenv() under the covers.
Created attachment 197576 [details] [review] gutils: warn on g_setenv/g_unsetenv after threads are created ===== so, here's that. "G_DEBUG=fatal-warnings make check" passes. The pthread_atfork() is needed so you don't get warnings when setting up the child environment in g_spawn(). I made the warning only happen on UNIX, since on Windows, the g_spawn child setup stuff happens in-process, so it would trigger the warnings otherwise. (And it's probably all thread-safe on Windows anyway, so the g_spawn code isn't a bug in that case.)
Review of attachment 197576 [details] [review]: note: we already have a pthread_atfork() in gmain.c to detect forking after the main context has been created. Not sure if it makes sense to tie-in there or not.
so, trying to fix bug 659145 completely turns out to be tricky, and we might want to think about adding some new API... http://git.gnome.org/browse/epiphany/tree/src/ephy-window.c?id=3.2.0#n2658 g_unsetenv (EPHY_UUID_ENVVAR); command_line = g_strdup_printf ("gvfs-open %s", uri); g_spawn_command_line_async (command_line, &error); to fix this, they'd have to use the much more complicated g_spawn_async() instead, and either pass a complete environment, or use a child_setup callback. And in: http://git.gnome.org/browse/epiphany/tree/src/window-commands.c?id=3.2.0#n508 they do an unsetenv() followed (indirectly) by g_app_info_launch(), which doesn't provide any way to modify the environment at all... maybe there could be some new API in GAppLaunchContext for that? (Actually, that could be used in the previous case as well, using gio functions instead of spawning gvfs-open, which is really kind of lame anyway...)
What would the api look like ? g_app_launch_context_setenv (GAppLaunchContext *c, const gchar *variable, const gchar *value) ?
of course command_line = g_strdup_printf ("env -u %s gvfs-open %s", EPHI_UUID_ENVVAR, uri); will work too, and is not much more awful then launching gvfs-open is in the first place...
Created attachment 197879 [details] [review] Add g_launch_context_set_child_setup_func() Calling setenv() or unsetenv() in a multithreaded process is unsafe, so add a way to set a GSpawnChildSetupFunc when using GAppInfo so that the caller can modify the environment of the child safely. ==== This is what I was thinking of. BTW, it turns out that it's in theory not safe to call setenv()/unsetenv() from a GSpawnChildSetupFunc, since they're not async-signal-safe...
I think the proscription against non-async-signal-safe code from the setup function is pretty arbitrary -- more like a way of saying "try not to cause too much damage here". We could probably be more precise.
(In reply to comment #14) > I think the proscription against non-async-signal-safe code from the setup > function is pretty arbitrary -- more like a way of saying "try not to cause too > much damage here". We could probably be more precise. diff --git a/glib/gspawn.h b/glib/gspawn.h index c9da1a8..70bb3cb 100644 --- a/glib/gspawn.h +++ b/glib/gspawn.h @@ -102,9 +102,13 @@ typedef enum * to perform but before calling exec(). On POSIX actions taken in this * function will thus only affect the child, not the parent. * - * Note that POSIX allows only async-signal-safe functions (see signal(7)) - * to be called in the child between fork() and exec(), which drastically - * limits the usefulness of child setup functions. + * Note that any mutexes that were held by other threads in the parent + * process at the time of the fork() will still be locked in the child + * process, and they will never be unlocked (since the threads that + * held them don't exist in the child). Therefore, it is not safe to + * call functions that may try to acquire mutexes from a + * #GSpawnChildSetupFunc. POSIX only guarantees that async-signal-safe + * functions (see signal(7)) can be called successfully. * * Also note that modifying the environment from the child setup function But setenv() is still unsafe, because it may need to realloc(), and it's possible that some malloc-related lock was held at the time when you forked. It seems that the only really safe way to modify the child environment is to pass in a complete and pre-modified envp[].
I think I would prefer a simple setenv/unsetenv over allowing arbitrary child_setup functions that have hard-to-explain restrictions.
Created attachment 198098 [details] [review] gutils: Add functions for working with environment arrays. When spawning a child process, it is not safe to call setenv() before the fork() (because setenv() isn't thread-safe), but it's also not safe to call it after the fork() (because it's not async-signal-safe). So the only safe way to alter the environment for a child process from a threaded program is to pass a fully-formed envp array to exec*/g_spawn*/etc. So, add g_environ_getenv(), g_environ_setenv(), and g_environ_unsetenv(), which act like their namesakes, but work on arbitrary arrays rather than working directly on the environment.
Created attachment 198099 [details] [review] GAppLaunchContext: add environment-manipulating functions Add functions for manipulating the environment under which a GAppLaunchContext will launch its children, to avoid thread-related bugs with using setenv() directly. FIXME: win32 side isn't implemented yet
Review of attachment 197879 [details] [review]: Do we still want this one if we can't actually use it to call setenv()?
Review of attachment 198098 [details] [review]: ::: glib/gspawn.h @@ +122,3 @@ + * drastically limits the usefulness of child setup functions. + * + * In particular, it is not safe to modify environment variables from I would say "In particular note it is not safe to call any function which may call malloc(), which includes POSIX functions such as setenv()." ::: glib/gutils.c @@ +1701,3 @@ + * @envp to @value. Both the variable's name and value should be in + * the GLib file name encoding. On UNIX, this means that they can be + * any sequence of bytes. On Windows, they should be in UTF-8. Let's use the term "byte string", like g_file_get_basename(), because it *does* have to be NUL terminated. @@ +1735,3 @@ + + memcpy (new_envp, envp, length * sizeof (gchar *)); + g_free (envp); Why not g_realloc()? ::: glib/gutils.h @@ +274,3 @@ +gchar ** g_get_environ (void); +const gchar * g_environ_getenv (gchar **envp, + const gchar *variable); How about g_environ_list_get_value () or something? Some name that makes it a bit more obvious we're operating on a temporary, and doesn't contain a substring "getenv" so you know it's different. (Not a strong feeling, just a possibility) @@ +280,3 @@ + gboolean overwrite) G_GNUC_WARN_UNUSED_RESULT; +gchar ** g_environ_unsetenv (gchar **envp, + const gchar *variable) G_GNUC_WARN_UNUSED_RESULT; We could accept NULL for value and not expose _unsetenv(). ::: glib/tests/cond @@ +2,3 @@ + +# cond - temporary wrapper script for .libs/cond +# Generated by libtool (GNU libtool) 2.4 You seem to have attached the libtool wrappers =)
(In reply to comment #20) > + * @envp to @value. Both the variable's name and value should be in > + * the GLib file name encoding. On UNIX, this means that they can be > + * any sequence of bytes. On Windows, they should be in UTF-8. > > Let's use the term "byte string", like g_file_get_basename(), because it *does* > have to be NUL terminated. ok, I just copied the docs from g_getenv/g_setenv/g_unsetenv, but we can fix it there too. > Why not g_realloc()? huh. no clue. :) > +gchar ** g_get_environ (void); > +const gchar * g_environ_getenv (gchar **envp, > + const gchar *variable); > > How about g_environ_list_get_value () or something? Some name that makes it a > bit more obvious we're operating on a temporary, and doesn't contain a > substring "getenv" so you know it's different. Note that g_get_environ() already exists. I chose the other names to sort of match it. And I felt that including getenv/setenv/unsetenv in the names was a feature, not a bug. > We could accept NULL for value and not expose _unsetenv(). we could, but we could have done that in g_setenv() too...
Comment on attachment 197879 [details] [review] Add g_launch_context_set_child_setup_func() yeah, this is just trading one race condition for another
Review of attachment 198099 [details] [review]: ::: gio/gdesktopappinfo.c @@ +1130,3 @@ + for (i--; i >= 0; i--) + *(data->pid_envvar++) = buf[i]; + *data->pid_envvar = '\0'; This kind of hack makes me think more about implementing some alloca() based functions. I have g_build_filename_alloca() already. We could consider g_vprintf_alloca() too, though that one would be hard. @@ +1270,3 @@ + envp = g_environ_setenv (envp, + "GIO_LAUNCHED_DESKTOP_FILE_PID", + "XXXXXXXXXXXXXXXXXXXX", Could use a comment like /* Filled in in child_setup function */ @@ +1272,3 @@ + "XXXXXXXXXXXXXXXXXXXX", + TRUE); + data.pid_envvar = (char *)g_environ_getenv (envp, "GIO_LAUNCHED_DESKTOP_FILE_PID"); Ug...you just wrote these functions and we're already playing evil tricks with them =/ Can't you call g_environ_getenv() inside the child_setup so we at least don't mysteriously explode if someone adds a g_environ_setenv() call after this?
While thinking about this more, while it's true that the POSIX spec limits what you can call, in *reality* glibc has hooks that on Linux, ensure malloc is at least somewhat usable after fork() time - from my reading of the code, it unlocks the mutex at least. Otherwise really we'd be deadlocking all over the place - Python exposes both threads and child setup functions (and I bet it calls malloc() while interpreting the child setup inside the forked process too). So we definitely have issues with unsynchronized getenv()/setenv() pairs, it's not immediately necessary for GNOME/Linux to clean all of our child_setup functions of calls to malloc() .
(In reply to comment #23) > + data.pid_envvar = (char *)g_environ_getenv (envp, > "GIO_LAUNCHED_DESKTOP_FILE_PID"); > > Ug...you just wrote these functions and we're already playing evil tricks with > them =/ > > Can't you call g_environ_getenv() inside the child_setup Yes, but... > so we at least don't > mysteriously explode if someone adds a g_environ_setenv() call after this? That's backwards actually. If we don't pass pid_envvar, then we have to pass the envp as a whole, but then if someone adds another g_environ_setenv() after that, it will break, because that will (likely) realloc the array, leaving the old data.envp dangling. Whereas g_environ_getenv() documents that the returned pointer only becomes invalid if you change *that variable* (since each element of the envp is allocated separately). (In reply to comment #24) > While thinking about this more, while it's true that the POSIX spec limits what > you can call, in *reality* glibc has hooks that on Linux, ensure malloc is at > least somewhat usable after fork() time - from my reading of the code, it > unlocks the mutex at least. Yes, it appears that it locks the mutexes before fork(), and unlocks them in both processes afterward. Makes sense. We could say that glib only supports platforms that do this, since it's not even that hard to implement... > So we definitely have issues with unsynchronized getenv()/setenv() pairs, it's > not immediately necessary for GNOME/Linux to clean all of our child_setup > functions of calls to malloc() . Yeah, I just figured better safe than sorry. We don't want to be here again in a few years.
Review of attachment 198098 [details] [review]: Committed with some cleanups, taking Colins comments into account.
Review of attachment 198099 [details] [review]: Committed, with some cleanups and fixes.
reopening because: (In reply to comment #18) > FIXME: win32 side isn't implemented yet >@@ -278,6 +278,15 @@ g_win32_app_info_launch (GAppInfo *appinfo, > } > #endif >+ /* FIXME: Need to do something with >+ * g_app_launch_context_get_environment()... ShellExecuteExW() >+ * doesn't have any way to pass an environment though. We need to >+ * either (a) update environment, ShellExecuteExW(), revert >+ * environment; or (b) find an API to figure out what app >+ * ShellExecuteExW() would launch, and then use g_spawn_async() >+ * instead. >+ */ (Windows presumably doesn't have the same thread-safety issues, but we at least need to make the API work the same way.)
retitle for clarity
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/451.