After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 510664 - gspawn without helper process.
gspawn without helper process.
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: win32
unspecified
Other Windows
: Normal normal
: ---
Assigned To: gtk-win32 maintainers
gtk-win32 maintainers
Depends on:
Blocks:
 
 
Reported: 2008-01-19 19:34 UTC by Lieven van der Heide
Modified: 2008-02-24 22:54 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
changed the win32 spawning functions to use CreateProcess, instead of the helper process. (42.48 KB, patch)
2008-01-19 19:40 UTC, Lieven van der Heide
none Details | Review

Description Lieven van der Heide 2008-01-19 19:34:30 UTC
Please describe the problem:
On windows, the spawning functions don't work very well, mostly because of the helper process.

For example, the code I provided can get into an infinite loop, because sometimes the child process is shut down, even before the helper process sends it's handles back to the parent process. When the parent process receives the handles, it calls DuplicateHandle on them, but if the child process already closed, this will result in an invalid handle, even though the child process was successfully executed. These invalid handles will result in a hang, in some GMainLoop code.

I looked at the reasons that were specified why the helper process was used, and even though all those restrictions do apply to the _wspawnv function, none of them apply to the CreateProcess function, so it would be much better, and more stable, to just use CreateProcess, without the helper process instead.

Steps to reproduce:
run the following code:

#include <glib.h>
#include <assert.h>

GMainLoop *main_loop = 0;

gboolean child_watch(GPid pid, gint child_status, void* data)
{
  g_main_loop_quit(main_loop);
  return TRUE;
}

int main()
{
  GMainContext *main_context = 0;
  GSource *source = 0;

  while(1)
  {
    gchar *working_dir = g_get_current_dir();

    gchar *args[] = 
    {
      "c:/Perl/bin/perl.exe",
      "c:/Script.pl",
      0,
    };

    int flags = G_SPAWN_DO_NOT_REAP_CHILD;
    GPid pid;
    GError *error;

    gboolean res = g_spawn_async_with_pipes(working_dir,
      args,0,(GSpawnFlags)flags,0,0,&pid,
      0,0,0,
      &error);
    if(!res)
    {
      assert(!"spawn failed");
    }

    main_context = g_main_context_new();
    main_loop = g_main_loop_new(main_context,FALSE);

    source = g_child_watch_source_new(pid);
    g_source_set_callback(source,(GSourceFunc)child_watch,0,NULL);
    g_source_attach(source,main_context);
    g_source_unref(source);

    g_main_loop_run(main_loop);

    g_spawn_close_pid(pid);

    g_main_context_unref(main_context);
    g_main_loop_unref(main_loop);

    g_free(working_dir);
  }

  g_main_loop_unref(main_loop);

  return 0;
}

Actual results:
sometimes it works, but sometimes, it gets in an infinite loop, and outputs errors related to WaitForMultipleObjects.

Expected results:
It should never hang.

Does this happen every time?
no, but the example code just retries the same thing all the time, and it's often just a few runs before it starts hanging.

Other information:
Comment 1 Lieven van der Heide 2008-01-19 19:40:48 UTC
Created attachment 103216 [details] [review]
changed the win32 spawning functions to use CreateProcess, instead of the helper process.
Comment 2 Lieven van der Heide 2008-01-19 19:43:40 UTC
The only thing that doesn't work yet with that patch is G_SPAWN_LEAVE_DESCRIPTORS_OPEN, but as far as I could tell, that didn't work in the old code either. 

To be honest, I doubt it's possible on windows at all. Inheriting handles is possible, but I couldn't find anything about inheriting file descriptors.
Comment 3 Tor Lillqvist 2008-01-20 10:08:23 UTC
G_SPAWN_LEAVE_DESCRIPTORS_OPEN does work. After all, the very "G" in GTK+, GIMP, uses that feature in its plug-in communication. (GIMP creates two pipes with _pipe(), passes the file descriptor numbers of the plug-in's read and write ends of the pipes as command-line argument to the plug-in. The file descriptors are inherited by the plug-in child process.)

> To be honest, I doubt it's possible on windows at all. Inheriting handles 
> is possible, 

Firstly, in case people who don't realize it are reading, I should point out that "file descriptors" (the small numbers as returned and handled by open(), dup(), read() etc) are on Windows relevant to the C library (whichever one an executable is using--yes, there are several alternatives) only.

The Windows operating system itself doesn't know anything about file descriptors. (It only handles HANDLEs.) File descriptors are indexes into tables local to the C library instance in a process. (Or, in the nowadays fortunately nd arare occasion that static C libraries are used, to each copy of the C library linked into the EXE and any DLLs.) (It is also possible that the EXE uses one C library and some DLL it uses another C library. But let's not talk about the problems that can cause now.)

> but I couldn't find anything about inheriting file descriptors.

Indeed. Unfortunately the exact mechanism how C file descriptors are inherited in the Microsoft C library is not publicly (as in "no strings attached") documented as far as I know.

Download the (freely available) Platform SDK (might be called "Windows SDK" now I think). It includes source code for the Microsoft C library, and you can see the gory details in the file dospawn.c. It involves using the fields cbReserved2 and lpReserved2 in the STARTUPINFO struct, which are publicly documented only as "Reserved for the C Run-time; must be NULL".

As the only place to find out this information is the C library sources, I have avoided using this information in GLib, as I am not sure what the Platform SDK license says about making use of information learned from the C library sources. If the C file descriptor inheritance method could be duplicated in GLib, then GLib could use CreateProcess() directly instead of the *spawn*() functions, and avoid the helper process in even more cases.
Comment 4 Lieven van der Heide 2008-01-20 12:55:32 UTC
Thanks for your reply.

I had a look at the dospawn.c file, and it does indeed handle the file descriptors. I do think it's feasible to implement this handle inheritence with CreateProcess ourselves, but that does mean that we will be calling undocumented API functions. If that's not a problem, I will fix my patch to handle file descriptors correctly.
Comment 5 Lieven van der Heide 2008-01-22 09:39:11 UTC
The data that's passed through lpReserved2, is the underlying HANDLE, and a char that contains the oflags, passed to the open function.

I can't find any good way to figure these flags out, that will work with any runtime (ie. without relying on implementation details). So I guess that in case that G_SPAWN_LEAVE_DESCRIPTORS_OPEN flag is specified, the helper process will still be required.

I will have to change the helper process, though, to use the CreateProcess function (and acquire the lpReserved2 data through GetStartupInfo() of the helper proc), instead of wspawnv. If we use CreateProcess, we can spawn the process in paused mode, then pass the handle to the parent proc, and then resume the child proc. This way it's not possible that the parent process gets an invalid handle.
Comment 6 Lieven van der Heide 2008-01-22 09:55:08 UTC
or maybe it's just that the helper proc is already closed, when Duplicate is called. 

Anyway, it looks like it's fixable:)
Comment 7 Tor Lillqvist 2008-02-24 22:54:45 UTC
> When the parent process receives the
> handles, it calls DuplicateHandle on them, but if the child process
> already closed, this will result in an invalid handle,

I think there are two problem cases here: 1) the helper process has exited before the parent process calls DuplicateHandle() but the actual user process started is still running, and 2) the actual user process started by the helper process has exited, possibly also the helper process.

I accidentally came across case 1 yesterday while debugging some other enhancements I did to the gspawn-win32.c code. Stepping the code in do_spawn_with_pipes() in the parent process meant that it got trivially reproduced, as the helper process had ample time to exit before the do_spawn_with_pipes() code tried the DuplicateHandle(). I added a fix for this: the helper process now doesn't exit until the parent process allows it. I use another pipe for this: The parent process writes a byte to the pipe when it is OK for the helper process to exit.

Now when I read this bug report more closely, and try some debugging on your test program to trigger case 2, I notice that that is no longer a problem after the fix, as the helper process is still alive when DuplicateHandle() is called. 

Even if the handle the helper process told the parent process is to a nonexisting process, it is still a valid handle, and can be duplicated as long as the helper process itself is alive. Windows does have something similar to the zombie concept in Unix. If you have a handle to a dead process, you can wait on it (and the wait will trivially return immediately).

As GLib 2.16.0 should be just weeks away, I think I won't commit the fixes to the stable branch. Can you check out the current trunk from SVN and verify that it fixes your problems? (Please note that you need to use both a freshly build libglib DLL and freshly built helper processes, as the parameters to the helper processes was changed. Maybe I should have changed the names of the helper process because of this.)

Resolving this bug as for now as fixed. Reopen if necessary. Feel free to add more comments.

ChangeLog entry:

2008-02-24  Tor Lillqvist  <tml@novell.com>

	* glib/gutils.c (_glib_get_installation_directory): New internal function.

	* glib/gspawn-win32.c: When spawning the helper process, use an
	explicit full path. (#518292)

	* glib/gspawn-win32.c
	* glib/gspawn-win32-helper.c: Fix race condition when using the
	helper process. This seems to fix #510664.

	When the helper process writes the handle of the actual started
	user process to the parent process, it must be duplicated in the
	parent process with DuplicateHandle() so that it is a valid handle
	in that process. However, if the helper process has happened to
	exit before the DuplicateHandle() call, the duplication will
	fail. Thus we must synchronise the helper process's exit. Use
	another pipe for this.

	Take care not to inherit the writing end of this pipe to the
	helper process. Also, in the helper process, take care not to
	inherit either of the pipes used for communication with the parent
	process to the started user process.