After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 123472 - cancelled opens() can leak handles
cancelled opens() can leak handles
Status: RESOLVED WONTFIX
Product: gnome-vfs
Classification: Deprecated
Component: Async operations
cvs (head)
Other All
: High normal
: ---
Assigned To: gnome-vfs maintainers
gnome-vfs maintainers
Depends on:
Blocks:
 
 
Reported: 2003-09-29 14:26 UTC by Alexander Larsson
Modified: 2008-09-06 19:17 UTC
See Also:
GNOME target: ---
GNOME version: 2.11/2.12


Attachments
gnome-vfs-123472-file-method-cancel-while-open.diff (1.28 KB, patch)
2005-08-01 23:14 UTC, Federico Mena Quintero
rejected Details | Review
Updated patch (5.67 KB, patch)
2005-08-05 21:57 UTC, Federico Mena Quintero
none Details | Review
New patch (87.23 KB, patch)
2005-08-10 04:10 UTC, Federico Mena Quintero
none Details | Review
gnome-vfs2-94400-cancelation-rewrite.diff (86.43 KB, patch)
2005-08-24 23:09 UTC, Federico Mena Quintero
none Details | Review
gnome-vfs-123472-cancelation-rewrite.diff (86.54 KB, patch)
2005-08-27 22:56 UTC, Federico Mena Quintero
needs-work Details | Review

Description Alexander Larsson 2003-09-29 14:26:45 UTC
<alex> consider the case of an open() that has been cancelled mid-way, but
the module doesn't really handle cancels (or it raced with cancellation)
<campd> ok
 campout campd 
<alex> campd: execute_open() will finish, and call job_oneway_notify()
<alex> job_oneway_notify() calls _gnome_vfs_async_job_add_callback() to add
a callback, however, since the job is cancelled that will just return FALSE
<alex> So, job_oneway_notify will just call
_gnome_vfs_job_destroy_notify_result, and then return
<campd> so it leaves an open handle
<alex> eventually it'll reach thread_routine() which calls
_gnome_vfs_job_complete() to figure out the operation is completed
(cancelled) and kills the job
<alex> and the handle leaks
<alex> if however the cancel hasn't run
<alex> it will:   g_idle_add (dispatch_job_callback, notify_result);
<alex> and if the job is canceled before dispatch_job_callback, note what
happens:
<alex>  if (cancelled) {
<alex>   /* cancel the job in progress */
<alex>   JOB_DEBUG (("cancelling job %u %u",
<alex>        GPOINTER_TO_UINT (notify_result->job_handle),
<alex>        notify_result->callback_id));
<alex> Which for open-type-ops leads to some locking + 
<alex>      handle_cancelled_open (job);
<alex> which schedules a close of the handle
<alex> It seems to me that we should do something similar in the
already-cancelled case
<alex> However, all the locking, threads and crap makes changing this code
super-complex
<alex> i.e. the job_oneway_notify is run in the vfs thread, with the job
lock held.
<alex> Maybe we still can schedule something from it though?
<campd> hrm
<alex> No, we can't
<alex> returning from the vfs thread with a cancelled op means the job will
be killed
<alex> we probably need to piggyback on the thread to execute the close
<alex> And that seems hard since the job has cancelled==TRUE
<campd> so a brute force way would be to let cancelled opens on to the job
callback queue 
<campd> so that they hit the dispatch_job_callback cancelled clause
<alex> campd: But the job will be killed before the idle runs
<campd> hrm, yeah
<campd> isn't that a race anyway then?
<alex> no, its only killed if the job is completed
<alex> and open() calls that are not cancelled are not completed
<alex> see _gnome_vfs_job_complete()
<campd> ah
<campd> ok
<alex> so, basically the job sticks around until the handle is closed
<alex> and is reused for each op
<campd> nod
<alex> maybe thread_routine() could look for cancelled opens?
Comment 1 Federico Mena Quintero 2005-08-01 23:14:45 UTC
Created attachment 50100 [details] [review]
gnome-vfs-123472-file-method-cancel-while-open.diff

This patch fixes the bug for file-method.c.  After open() returns, we check the
context for cancellation and close() the file if appropriate.
Comment 2 Federico Mena Quintero 2005-08-01 23:16:21 UTC
We'll have the patch in comment #1 in the gnome-vfs package in Novell Linux
Desktop --- the bug exposed itself when mounting netware filesystems.  I guess
this is useful for NFS et al as well.
Comment 3 Federico Mena Quintero 2005-08-04 16:29:07 UTC
Actually, the patch in comment #1 doesn't quite fix the problem.  A cancel can
still happen after open(2) has returned.  That's exactly what alex and campd
were talking about, but I didn't quite understand it.
Comment 4 Federico Mena Quintero 2005-08-04 18:22:20 UTC
About comment #3 - I can still duplicate the bug (fds get leaked), as my patch
is not fully correct.

I think I have the correct analysis now:

The only place where open() cancellations can be processed without any race
conditions is in the dispatch_job_callback() in the main thread.  This is the
idle handler that gets queued from the worker thread in job_oneway_notify().

So, jobs and their callbacks must stay in the job queue and callback map,
respectively, until dispatch_job_callback() gets run.  This means two things:

- job_oneway_notify() must always install the idle handler for the callback, and
the notify_result must always be put in the callback map.  This means changing
_gnome_vfs_async_job_add_callback() to do this unconditionally, rather than
based on whether the job is canceled.

- thread_routine() must not free the job if it is canceled.  It decides whether
to do this based on gnome_vfs_job_complete().

The changes I'm making are as follows:

1. Make _gnome_vfs_async_job_add_callback() return void rather than a boolean. 
It used to return a boolean based on the job being canceled.

2. Make job_oneway_notify() unconditionally add the idle handler for the benefit
of the main thread.

3. Make gnome_vfs_job_complete() not return true if the job is canceled - this
only gets called from thread_routine(), and by the definition above, a job is
not complete until the main thread executes dispatch_job_callback() and takes
care of cancellation.

4. Let dispatch_job_callback() process cancellations as usual.  It will now pick
up cancellations of open() calls.
Comment 5 Federico Mena Quintero 2005-08-05 21:57:49 UTC
Created attachment 50297 [details] [review]
Updated patch

Now I'm using this patch.  It doesn't leave descriptors open, but sometimes
Nautilus crashes.  I'm trying to get meaningful info out of the crashes.
Comment 6 Federico Mena Quintero 2005-08-05 22:07:43 UTC


Thread 1 (Thread 1091111008 (LWP 7120))

  • #0 ??
  • #1 ??
  • #2 ??
  • #3 ??
  • #4 raise
    from /lib/tls/libc.so.6
  • #5 abort
    from /lib/tls/libc.so.6
  • #6 g_logv
    from /opt/gnome/lib/libglib-2.0.so.0
  • #7 g_log
    from /opt/gnome/lib/libglib-2.0.so.0
  • #8 read_file_open_callback
    at eel-vfs-extensions.c line 256
  • #9 dispatch_open_callback
    at gnome-vfs-job.c line 220
  • #10 dispatch_job_callback
    at gnome-vfs-job.c line 636
  • #11 g_idle_dispatch
    from /opt/gnome/lib/libglib-2.0.so.0
  • #12 g_main_context_dispatch
    from /opt/gnome/lib/libglib-2.0.so.0
  • #13 g_main_context_iterate
    from /opt/gnome/lib/libglib-2.0.so.0
  • #14 g_main_loop_run
    from /opt/gnome/lib/libglib-2.0.so.0
  • #15 gtk_main
    from /opt/gnome/lib/libgtk-x11-2.0.so.0
  • #16 main
    at nautilus-main.c line 335

That's the assertion I hit - somehow the handle in the notify_result is screwed up.
Comment 7 Federico Mena Quintero 2005-08-10 04:07:25 UTC
OK, so I've changed a lot of code and it no longer seems to leak FDs for me.

The ChangeLog is this:

2005-08-09  Federico Mena Quintero  <federico@ximian.com>

	A GnomeVFSJob represents an open handle --- as before, it is
	identified by a GnomeVFSAsyncHandle.  Each job has a current
	operation, which is job->op; this is of type GnomeVFSOp.  Each job
	also has a list of canceled operations, which is
	job->canceled_ops.

	The job queue now stores keys of the form <seq_num, priority> and
	values which are GnomeVFSOp.  Sending a job to the worker thread
	means sending a GnomeVFSOp.  All operations, including
	canceled ones, queue a GnomeVFSNotifyResult which the main thread
	will pick up in dispatch_job_callback().

	The only place where we can handle cancellations without race
	conditions is in dispatch_job_callback(), as that is when the
	worker thread is done and control has gone back to the main
	thread.  This even handles the case where the worker thread is
	done, and the main thread does a cancel() before reaching its main
	loop.

	Within a job, a GnomeVFSOp is either the current operation
	(job->op), or a canceled operation awaiting notification
	(job->canceled_ops).

	It is illegal to call gnome_vfs_async_*() on a handle whose
	corresponding job has a current operation in progress.  The only
	exception is of course gnome_vfs_async_cancel().

	Now that notify_results are tied to the lifetime of the job
	itself, there is no danger of aliasing them.  Thus, we can remove
	the async_job_callback_map and its associated functions.

	* libgnomevfs/gnome-vfs-job.h (GnomeVFSJob): Removed the
	"cancelled" and "failed" fields; the right place for these is in
	GnomeVFSOp.  Added a "canceled_ops" field, and a
	"free_when_canceled_ops_complete" field.
	(GnomeVFSOp): Added a "job" field to identify the job
	corresponding to this operation.  Added a "canceled" field.
	(GnomeVFSNotifyResult): Added an "op" field to identify the
	operation to which this notification refers.  Removed the
	"cancelled" and "type" fields; the right place for those is in
	GnomeVFSOp.
	(GnomeVFSOpType): Removed GNOME_VFS_OP_READ_WRITE_DONE; it's
	useless now.
	(GnomeVFSOpenOpResult): Removed the "callback" and "callback_data"
	fields, since they now come on the notify_result->op.
	(GnomeVFSOpenAsChannelOpResult): Likewise.
	(GnomeVFSCreateOpResult): Likewise.
	(GnomeVFSCreateAsChannelOpResult): Likewise.
	(GnomeVFSCloseOpResult): Likewise.
	(GnomeVFSReadOpResult): Likewise.
	(GnomeVFSWriteOpResult): Likewise.
	(GnomeVFSSeekOpResult): Likewise.
	(GnomeVFSGetFileInfoOpResult): Likewise.
	(GnomeVFSSetFileInfoOpResult): Likewise.
	(GnomeVFSFindDirectoryOpResult): Likewise.
	(GnomeVFSLoadDirectoryOpResult): Likewise.
	(GnomeVFSXferOpResult): Likewise.
	(GnomeVFSFileControlOpResult): Likewise.
	(GnomeVFSNotifyResult): Removed the "callback_id" field.

	* libgnomevfs/gnome-vfs-job.c (_gnome_vfs_job_set): Don't allow
	changing the op of a job whose current op is still in progress.
	Set the new fields of the op structure.
	(_gnome_vfs_job_complete): Removed; it's no longer used.  Removed
	really old code inside "#ifdef OLD_CONTEXT_DEPRECATED".
	(_gnome_vfs_job_module_cancel): Lock the job mutex around getting
	its job->op->context.
	(_gnome_vfs_job_execute): Take a GnomeVFSOp rather than a
	GnomeVFSJob.  Use the op rather than the old fields in
	GnomeVFSNotifyResult.  Pass the op to the specific execute_*()
	functions.
	(job_oneway_notify): Take just a GnomeVFSNotifyResult as argument.
	(execute_open): Adjust for the new GnomeVFSNotifyResult and
	GnomeVFSOp.
	(execute_open_as_channel): Likewise.
	(execute_create): Likewise.
	(execute_create_symbolic_link): Likewise.
	(execute_create_as_channel): Likewise.
	(execute_close): Likewise.
	(execute_read): Likewise.
	(execute_write): Likewise.
	(execute_seek): Likewise.
	(execute_get_file_info): Likewise.
	(execute_set_file_info): Likewise.
	(execute_find_directory): Likewise.
	(execute_load_directory): Likewise.
	(xfer_callback): Likewise.
	(execute_xfer): Likewise.
	(execute_file_control): Likewise.
	(dispatch_open_callback): Likewise, and get the callback and
	callback_data from the op, not the notify_result.
	(dispatch_create_callback): Likewise.
	(dispatch_open_as_channel_callback): Likewise.
	(dispatch_create_as_channel_callback): Likewise.
	(dispatch_close_callback): Likewise.
	(dispatch_read_callback): Likewise.
	(dispatch_write_callback): Likewise.
	(dispatch_seek_callback): Likewise.
	(dispatch_load_directory_callback): Likewise.
	(dispatch_get_file_info_callback): Likewise.
	(dispatch_find_directory_callback): Likewise.
	(dispatch_set_file_info_callback): Likewise.
	(dispatch_xfer_callback): Likewise.
	(dispatch_file_control_callback): Likewise.
	(job_oneway_notify): Don't call _gnome_vfs_async_job_add_callback().
	(job_notify): Likewise.
	(dispatch_sync_job_callback): Don't call _gnome_vfs_async_job_remove_callback().
	(dispatch_job_callback): Likewise.

	* libgnomevfs/gnome-vfs-async-ops.c (gnome_vfs_async_cancel):
	Check that the job exists, as its lifetime is now protected by its
	pending notifications.  Also, don't allow canceling a job which
	doesn't have a running operation.
	Removed really old code inside "#ifdef OLD_CONTEXT_DEPRECATED".
	(gnome_vfs_async_close): There is no need to re-try the close()
	anymore if a read/write is pending.

	* libgnomevfs/gnome-vfs-async-job-map.c:
	(_gnome_vfs_async_job_cancel_job_and_callbacks): Just take the job
	as an argument; it will always exist when this is called.
	(_gnome_vfs_async_job_map_shutdown): Don't call
	async_job_callback_map_destroy.
	(_gnome_vfs_async_job_callback_valid): Removed.
	(_gnome_vfs_async_job_add_callback): Removed.
	(_gnome_vfs_async_job_remove_callback): Removed.
	(_gnome_vfs_async_job_cancel_job): Renamed from
	_gnome_vfs_async_job_cancel_job_and_callbacks().  Don't frob the
	nonexistent async_job_callback_map.
	(async_job_callback_map_destroy): Removed.

	* libgnomevfs/gnome-vfs-job-queue.c (value_destroy): Removed.
	(job_queue_new): Use plain g_tree_new(), since the destroy funcs
	were never used anyway.
	(job_queue_add): Insert a GnomeVFSOp in the tree, not a
	GnomeVFSJob.
	(find_first_value): The tree now stores GnomeVFSOp as values.
	(job_queue_get_first): Likewise.
	(_gnome_vfs_job_queue_run): Likewise.  Pass the op to
	_gnome_vfs_job_create_slave().
	(_gnome_vfs_job_schedule): Pass the op to
	_gnome_vfs_job_create_slave().

	* libgnomevfs/gnome-vfs-job-slave.c (_gnome_vfs_job_create_slave):
	Take a GnomeVFSOp as an argument instead of a GnomeVFSJob.
	(thread_routine): Now we get passed a GnomeVFSOp instead of a
	GnomeVFSJob.  Also, since we don't access the async_job_map
	anymore, we can avoid acquiring its lock.
Comment 8 Federico Mena Quintero 2005-08-10 04:10:23 UTC
Created attachment 50495 [details] [review]
New patch

This patch seems to fix the problem for me (I'm awaiting confirmation from the
Novell guys who had the actual problem).  gnome-vfs/tests/test-async-cancel
passes all the tests, and Nautilus doesn't leak open descriptors for me when
using it to browse slow file systems.

I'd really appreciate if someone could test this, especially on a
multi-processor machine.
Comment 9 Federico Mena Quintero 2005-08-10 15:29:45 UTC
OK, and the Novell guys say that it works for them as well!  This makes me
really happy :)
Comment 10 Christian Neumair 2005-08-10 16:06:23 UTC
Milestoning/raising priority since its *really* important to not leak,
especially if we have a fix, and Federico and "the Novell guys" are happy.
Comment 11 Christian Kellner 2005-08-11 04:52:05 UTC
Federico, great work! What is your opinion on getting this into 2.12. It's a
huge change and not as tested that well, is it? On the other hand I like the new
scheme and we must not leak stuff. I could look more at the code and give it a
spin on the weekend. What do you think?
Comment 12 Christian Kellner 2005-08-11 04:53:56 UTC
Ohh and btw, we should get this picture you draw of the scheme into the
(developer) docs.
Comment 13 Alexander Larsson 2005-08-24 16:11:56 UTC
I had an initial look at this, and it doesn't look right. Apart from obvious
leftover debugging stuff like the fmq_log calls and things like +#if 0 out
things in the header I found some serious problems.

_gnome_vfs_dispatch_module_callback:
This is totally broken. This will be called when we're already executing a job,
for instance when doing an authentication. We can't assign a new type to the
already existing, executing job. So the removal of notify_result.type is a
problem here. There is no way callbacks could work with this patch.

cancellation vs the job_lock:
The job_lock is held during the execution of a operation (it is taken in
thread_routine before calling _gnome_vfs_job_execute). However, this patch adds
takings of job_lock in several places. Worst of all, it takes job_lock in
gnome_vfs_async_cancel(), both directly and in both _gnome_vfs_job_module_cancel
 and gnome_vfs_async_job_cancel_job. This means cancellation and execution of an
operation is mutually exclusive, meaning you can't cancel an executing blocking
operation!!! I don't see how cancellation is working with this patch.

In no way is this an exhaustive look at this patch, but its clear that there are
problems with it, and I'd be *very* scared of getting this into gnome 2.12 at
this late stage.
Comment 14 Federico Mena Quintero 2005-08-24 16:45:28 UTC
Alex is right about _gnome_vfs_dispatch_module_callback() - I more or less
forgot to handle that case.  I didn't know what module callbacks were for (I do
now, thanks to the inimitable Michael Meeks) :)

Cancellation vs. the job_lock: I'll take a second look, but so far cancellation
has worked fine for me.  The execute() functions should only pay attention to
the op, not the job.  The job->op will get moved to the job->canceled_ops list
if the operation is canceled mid-way, but that's the idea anyway.

I'll post an updated patch soon, hopefully today.
Comment 15 Luis Villa 2005-08-24 19:52:40 UTC
Question about this patch: how long have we been leaking this stuff? Since at
least late 2003, right? If so, I'm not going to keep this a 2.12 showstopper-
users have been living with this a long time and I won't stop the release for it.

That said, if Alex is comfortable with the patch, and we get it in for at least
one or two tarball releases before final, I won't try to stop the patch either.
Comment 16 Federico Mena Quintero 2005-08-24 23:09:11 UTC
Created attachment 51298 [details] [review]
gnome-vfs2-94400-cancelation-rewrite.diff

This updated patch handles module callbacks.  I just added an
is_module_callback flag to GnomeVFSNotifyResult:  since the op's type has to
stay the same, we need a new field.

2005-08-24  Federico Mena Quintero  <federico@ximian.com>

	A GnomeVFSJob represents an open handle --- as before, it is
	identified by a GnomeVFSAsyncHandle.  Each job has a current
	operation, which is job->op; this is of type GnomeVFSOp.  Each job
	also has a list of canceled operations, which is
	job->canceled_ops.

	The job queue now stores keys of the form <seq_num, priority> and
	values which are GnomeVFSOp.  Sending a job to the worker thread
	means sending a GnomeVFSOp.  All operations, including
	canceled ones, queue a GnomeVFSNotifyResult which the main thread
	will pick up in dispatch_job_callback().

	The only place where we can handle cancellations without race
	conditions is in dispatch_job_callback(), as that is when the
	worker thread is done and control has gone back to the main
	thread.  This even handles the case where the worker thread is
	done, and the main thread does a cancel() before reaching its main
	loop.

	Within a job, a GnomeVFSOp is either the current operation
	(job->op), or a canceled operation awaiting notification
	(job->canceled_ops).

	It is illegal to call gnome_vfs_async_*() on a handle whose
	corresponding job has a current operation in progress.	The only
	exception is of course gnome_vfs_async_cancel().

	Now that notify_results are tied to the lifetime of the job
	itself, there is no danger of aliasing them.  Thus, we can remove
	the async_job_callback_map and its associated functions.

	* libgnomevfs/gnome-vfs-job.h (GnomeVFSJob): Removed the
	"cancelled" and "failed" fields; the right place for these is in
	GnomeVFSOp.  Added a "canceled_ops" field, and a
	"free_when_canceled_ops_complete" field.
	(GnomeVFSOp): Added a "job" field to identify the job
	corresponding to this operation.  Added a "canceled" field.
	(GnomeVFSNotifyResult): Added an "op" field to identify the
	operation to which this notification refers.  Added an
	"is_module_callback" flag.  This indicates whether the result in
	question corresponds to a module callback, rather than a normal
	op's notification.  Removed the "cancelled" and "type" fields; the
	right place for those is in GnomeVFSOp.  Removed the "callback_id"
	field.
	(GnomeVFSOpType): Removed GNOME_VFS_OP_READ_WRITE_DONE; it's
	useless now.  Removed GNOME_VFS_OP_MODULE_CALLBACK; that
	information is stored in GnomeVFSNotifyResult.is_module_callback now.
	(GnomeVFSOpenOpResult): Removed the "callback" and "callback_data"
	fields, since they now come on the notify_result->op.
	(GnomeVFSOpenAsChannelOpResult): Likewise.
	(GnomeVFSCreateOpResult): Likewise.
	(GnomeVFSCreateAsChannelOpResult): Likewise.
	(GnomeVFSCloseOpResult): Likewise.
	(GnomeVFSReadOpResult): Likewise.
	(GnomeVFSWriteOpResult): Likewise.
	(GnomeVFSSeekOpResult): Likewise.
	(GnomeVFSGetFileInfoOpResult): Likewise.
	(GnomeVFSSetFileInfoOpResult): Likewise.
	(GnomeVFSFindDirectoryOpResult): Likewise.
	(GnomeVFSLoadDirectoryOpResult): Likewise.
	(GnomeVFSXferOpResult): Likewise.
	(GnomeVFSFileControlOpResult): Likewise.

	* libgnomevfs/gnome-vfs-job.c (_gnome_vfs_job_set): Don't allow
	changing the op of a job whose current op is still in progress.
	Set the new fields of the op structure.
	(_gnome_vfs_job_complete): Removed; it's no longer used.  Removed
	really old code inside "#ifdef OLD_CONTEXT_DEPRECATED".
	(_gnome_vfs_job_module_cancel): Lock the job mutex around getting
	its job->op->context.
	(_gnome_vfs_job_execute): Take a GnomeVFSOp rather than a
	GnomeVFSJob.  Use the op rather than the old fields in
	GnomeVFSNotifyResult.  Pass the op to the specific execute_*()
	functions.
	(job_oneway_notify): Take just a GnomeVFSNotifyResult as argument.
	(execute_open): Adjust for the new GnomeVFSNotifyResult and
	GnomeVFSOp.
	(execute_open_as_channel): Likewise.
	(execute_create): Likewise.
	(execute_create_symbolic_link): Likewise.
	(execute_create_as_channel): Likewise.
	(execute_close): Likewise.
	(execute_read): Likewise.
	(execute_write): Likewise.
	(execute_seek): Likewise.
	(execute_get_file_info): Likewise.
	(execute_set_file_info): Likewise.
	(execute_find_directory): Likewise.
	(execute_load_directory): Likewise.
	(xfer_callback): Likewise.
	(execute_xfer): Likewise.
	(execute_file_control): Likewise.
	(dispatch_open_callback): Likewise, and get the callback and
	callback_data from the op, not the notify_result.
	(dispatch_create_callback): Likewise.
	(dispatch_open_as_channel_callback): Likewise.
	(dispatch_create_as_channel_callback): Likewise.
	(dispatch_close_callback): Likewise.
	(dispatch_read_callback): Likewise.
	(dispatch_write_callback): Likewise.
	(dispatch_seek_callback): Likewise.
	(dispatch_load_directory_callback): Likewise.
	(dispatch_get_file_info_callback): Likewise.
	(dispatch_find_directory_callback): Likewise.
	(dispatch_set_file_info_callback): Likewise.
	(dispatch_xfer_callback): Likewise.
	(dispatch_file_control_callback): Likewise.
	(job_oneway_notify): Don't call _gnome_vfs_async_job_add_callback().
	(job_notify): Likewise.
	(dispatch_sync_job_callback): Don't call
_gnome_vfs_async_job_remove_callback().
	(dispatch_job_callback): Likewise.
	(_gnome_vfs_dispatch_module_callback): Set
notify_result->is_module_callback 
	to TRUE.
	(_gnome_vfs_job_destroy_notify_result): There is no
	GNOME_VFS_OP_MODULE_CALLBACK anymore; removed that case.
	(dispatch_sync_job_callback): If the notify_result refers to a
	module callback, handle it outside the switch(), as there is no
	GNOME_VFS_OP_MODULE_CALLBACK anymore.

	* libgnomevfs/gnome-vfs-async-ops.c (gnome_vfs_async_cancel):
	Check that the job exists, as its lifetime is now protected by its
	pending notifications.	Also, don't allow canceling a job which
	doesn't have a running operation.
	Removed really old code inside "#ifdef OLD_CONTEXT_DEPRECATED".
	(gnome_vfs_async_close): There is no need to re-try the close()
	anymore if a read/write is pending.

	* libgnomevfs/gnome-vfs-async-job-map.c:
	(_gnome_vfs_async_job_cancel_job_and_callbacks): Just take the job
	as an argument; it will always exist when this is called.
	(_gnome_vfs_async_job_map_shutdown): Don't call
	async_job_callback_map_destroy.
	(_gnome_vfs_async_job_callback_valid): Removed.
	(_gnome_vfs_async_job_add_callback): Removed.
	(_gnome_vfs_async_job_remove_callback): Removed.
	(_gnome_vfs_async_job_cancel_job): Renamed from
	_gnome_vfs_async_job_cancel_job_and_callbacks().  Don't frob the
	nonexistent async_job_callback_map.
	(async_job_callback_map_destroy): Removed.

	* libgnomevfs/gnome-vfs-job-queue.c (value_destroy): Removed.
	(job_queue_new): Use plain g_tree_new(), since the destroy funcs
	were never used anyway.
	(job_queue_add): Insert a GnomeVFSOp in the tree, not a
	GnomeVFSJob.
	(find_first_value): The tree now stores GnomeVFSOp as values.
	(job_queue_get_first): Likewise.
	(_gnome_vfs_job_queue_run): Likewise.  Pass the op to
	_gnome_vfs_job_create_slave().
	(_gnome_vfs_job_schedule): Pass the op to
	_gnome_vfs_job_create_slave().

	* libgnomevfs/gnome-vfs-job-slave.c (_gnome_vfs_job_create_slave):
	Take a GnomeVFSOp as an argument instead of a GnomeVFSJob.
	(thread_routine): Now we get passed a GnomeVFSOp instead of a
	GnomeVFSJob.  Also, since we don't access the async_job_map
	anymore, we can avoid acquiring its lock.
Comment 17 Alexander Larsson 2005-08-25 07:32:16 UTC
federico: Maybe cancellation works for short operations that return quickly from
executing, but I don't see how it can work for an operation that is actually
blocking for a long time. The cancel call will take the lock and wait until the
blocking job is finished.
Comment 18 Alexander Larsson 2005-08-25 07:33:01 UTC
luis: I think we have leaked this for the entire lifetime of gnome-vfs basically.
Comment 19 Luis Villa 2005-08-25 16:30:48 UTC
Then this is not urgent nor a 2.12 stopper. Alex, you and Christian are still
welcome to take it for 2.12 if you think it is a good idea, but it isn't
something I'll push for.
Comment 20 Federico Mena Quintero 2005-08-27 22:56:58 UTC
Created attachment 51440 [details] [review]
gnome-vfs-123472-cancelation-rewrite.diff

I'm an idiot; I had left an uninitialized struct field in GnomeVFSNotifyResult
which broke sync callbacks.  This patch fixes it.

The only leftovers are:

- Make gnome_vfs_async_cancel() not block if the job_lock is being held by the
worker thread.

- Actually close(2) the files until the job structure is freed.  There's a race
condition in this situation:

  thread 1: open() - finishes
  thread 1: read() - gets queued
  thread 1: cancel()
  thread 1: close() - gets queued
  thread 2: close() executes
  thread 3: read() executes

That is, the thread that would process the read() gets preempted, and the
close() takes place first.  In the best case, we end up reading from an invalid
fd.  In the worst case, we read from a fd we didn't intend, as the fd got
recycled.

The code before the patch had a READ_WRITE_DONE state to handle this; the new
code simply needs to delay the close() until the job is actually freed.
Comment 21 Alexander Larsson 2005-10-20 13:44:52 UTC
federico: I think your locking scheme is fundamentally incorrect. We can't share
a lock between the actual i/o path and the cancellation path. 

To show that this breaks i've added the test/test-long-cancel.c testcase to cvs.
You'll see that with the current code it works, and with your patch applied the
cancel call hangs.
Comment 22 André Klapper 2008-09-06 19:17:37 UTC
gnome-vfs has been deprecated and superseded by gio/gvfs since GNOME 2.22, hence mass-closing many of the gnome-vfs requests/bug reports. This means that gnome-vfs is NOT actively maintained anymore, however patches are still welcome.

If your reported issue is still valid for gio/gvfs, please feel free to file a bug report against glib/gio or gvfs.

@Bugzilla mail recipients: query for gnome-vfs-mass-close to get rid of these notification emails all together.


General further information: http://en.wikipedia.org/wiki/GVFS 
Reasons behind this decision are listed at http://www.mail-archive.com/gnome-vfs-list@gnome.org/msg00899.html