After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 536172 - optionally follow symlinks when monitoring files.
optionally follow symlinks when monitoring files.
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: gio
unspecified
Other Linux
: Normal normal
: ---
Assigned To: Tomas Bzatek
gtkdev
: 669320 (view as bug list)
Depends on:
Blocks: 546954
 
 
Reported: 2008-06-02 07:32 UTC by Behdad Esfahbod
Modified: 2018-05-24 11:27 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
test (2.56 KB, text/x-csrc)
2008-06-02 07:38 UTC, Behdad Esfahbod
  Details
gio-inotify-symlink-support.patch (16.31 KB, patch)
2008-07-29 15:58 UTC, Tomas Bzatek
reviewed Details | Review
gio-inotify-symlink-support.patch (16.08 KB, patch)
2008-08-12 13:20 UTC, Tomas Bzatek
none Details | Review
glib-symlink-inotify-support.patch (17.19 KB, patch)
2008-08-28 16:21 UTC, Tomas Bzatek
needs-work Details | Review
glib-monitoring-tests.patch (25.03 KB, patch)
2008-08-28 16:25 UTC, Tomas Bzatek
none Details | Review

Description Behdad Esfahbod 2008-06-02 07:32:47 UTC
I wrote some code to monitor all font configuration files and detect changes.
It works for some of the files but not the others.  Seems like when multiple files under the same directory are monitored, changes to those are not detected.  Attaching test code.
Comment 1 Behdad Esfahbod 2008-06-02 07:38:06 UTC
Created attachment 111925 [details]
test

Run the test, it lists files being monitored.  Modify one of the bunch under the same dir and notice it go undetected.  Now modify some other ones and see they are detected.
Comment 2 Matthias Clasen 2008-07-02 05:26:15 UTC
I believe the problem here is inconsistent handling of symlinks. It works for the config files which are files, it fails for the ones which are symlinks to
../config.avail/foo. Calling realpath() on the filename in
g_inotify_file_monitor_constructor appears to fix the issue.
Comment 3 Behdad Esfahbod 2008-07-02 16:36:25 UTC
Bug 532815 is about same thing for hardlinks.
Comment 4 Tomas Bzatek 2008-07-29 15:58:28 UTC
Created attachment 115502 [details] [review]
gio-inotify-symlink-support.patch

Okay, the patch attached can solve most of the symlink inotify issues, please test.

It can:
  - correctly detect changes in the file/directory the symlink is pointing to
  - watch is correctly recreated when watched file changes type from regular file/directory to symlink and back
  - in case of invalid symlink a missing watch is created and automatically changed to standard watch when symlink becomes valid (and vice versa)
  - changes made to watched symlinked file/directory are masked to symlink itself when emitting the changed signal

Few notes:
  - for directory monitor we always create two inotify watches; one to target directory and one on the directory itself (to detect symlink target changes)
  - for file monitor the second (symlink) inotify watch is created only when watched file is symlink (or turns to a symlink)
  - hopefully thread safe, recreate events are sent to g_idle queue, not blocking inotify mutexes. Would be good to test if it works fine not causing any lockups.
  - automatic symlink following is tricky, we need to always call realpath() to retrieve target path. If realpath() fails, we resolve path string through GIO functions; this won't however do recursive resolve
  - automatic directory symlink following is tricky; removing watch on symlinked directory will actually remove target directory watch, even when having other active watches

Limitations:
  - if symlink is pointing to a file and changes to a directory, you'll need to change from g_file_monitor_file() to g_file_monitor_directory(). I think this is correct by design, user should choose if he wants to watch directory or file (needs to be fixed in Nautilus)

Documentation:
  - the second GFile argument ("other_file") in GFileMonitor::changed signal and g_file_monitor_emit_event() function is not really used at the moment. I've noted this to the docs, leaving the argument there and when we decide to break the API, it can be removed. There are fragments of code indicating it was intended for passing parent directory to the signal, however not really used. In case of symlinks, we can pass symlink target as the second argument, might come handy for somebody. Opinions?


Nautilus patches and GFileMonitor glib tests will come soon...

Behdad: your test code always recreates watches; if you delete file and create it again, it won't get included in the list. Some text editors remove files and then create new ones when saving.
Comment 5 John McCutchan 2008-07-29 16:13:44 UTC
Aside:

Tomas is correct: The second argument to GFileMonitor::changed signal is not currently used. This is due to a change that Alex made, unfortunately, without any review. The change also makes the distinction between GFileMonitor and GDirectoryMonitor meaningless. Too bad. My original code had a different changed signal for GFileMonitor and GDirectoryMonitor. 
Comment 6 John McCutchan 2008-07-29 16:15:40 UTC
I believe the patch is racy: between g_file_test (filename, G_FILE_TEST_IS_SYMLINK) and resulting branches the file could have been deleted and recreated.
Comment 7 Matthias Clasen 2008-07-29 18:03:24 UTC
I agree that the g_file_test is racy and to be avoided. Of course, it'll always be racy in some ways unless there's a way to resolve the symlinks and add the watch atomically in one operation...

I think much of your detailed explanation of how symlinks are handled needs to 
end up in the documentation.
Comment 8 Matthias Clasen 2008-08-01 19:07:33 UTC
Tomas, can you update your patch ?

Also, having testcases would important, to verify that the other file monitor implementations behave similar to what we are implementing here.
Comment 9 Behdad Esfahbod 2008-08-05 07:57:28 UTC
(In reply to comment #4)

> Limitations:
>   - if symlink is pointing to a file and changes to a directory, you'll need to
> change from g_file_monitor_file() to g_file_monitor_directory(). I think this
> is correct by design, user should choose if he wants to watch directory or file
> (needs to be fixed in Nautilus)

We recently added g_file_monitor().  User may not really care.  My fontconfig example is one such case.  It should automatically do the right thing IMO.
 
> Documentation:
>   - the second GFile argument ("other_file") in GFileMonitor::changed signal
> and g_file_monitor_emit_event() function is not really used at the moment. I've
> noted this to the docs, leaving the argument there and when we decide to break
> the API, it can be removed. There are fragments of code indicating it was
> intended for passing parent directory to the signal, however not really used.
> In case of symlinks, we can pass symlink target as the second argument, might
> come handy for somebody. Opinions?

Makes sense.  Also for directory monitors, we can pass the directory as other_file?  Something like that.

> Nautilus patches and GFileMonitor glib tests will come soon...
> 
> Behdad: your test code always recreates watches; if you delete file and create
> it again, it won't get included in the list. Some text editors remove files and
> then create new ones when saving.

Not sure I understand what you mean.  In my case I ask fontconfig for list of files again, so it's not an issue.  Or do you mean something else?
Comment 10 Behdad Esfahbod 2008-08-05 07:58:25 UTC
(In reply to comment #5)
> Aside:
> 
> Tomas is correct: The second argument to GFileMonitor::changed signal is not
> currently used. This is due to a change that Alex made, unfortunately, without
> any review. The change also makes the distinction between GFileMonitor and
> GDirectoryMonitor meaningless. Too bad. My original code had a different
> changed signal for GFileMonitor and GDirectoryMonitor. 

Actually this was partially fixed.  There is no GDirectoryMonitor anymore.
Comment 11 Tomas Bzatek 2008-08-12 13:20:08 UTC
Created attachment 116421 [details] [review]
gio-inotify-symlink-support.patch

Updated patch which removes racy g_file_test() test.
Comment 12 Tomas Bzatek 2008-08-12 13:37:34 UTC
(In reply to comment #9)
> We recently added g_file_monitor().  User may not really care.  My fontconfig
> example is one such case.  It should automatically do the right thing IMO.

The g_file_monitor() queries info for the argument passed in and then calls either g_file_monitor_file() or g_file_monitor_directory() accordingly. For any later changes the file/directory monitor stays at its original type.

> > Behdad: your test code always recreates watches; if you delete file and create
> > it again, it won't get included in the list. Some text editors remove files and
> > then create new ones when saving.
> 
> Not sure I understand what you mean.  In my case I ask fontconfig for list of
> files again, so it's not an issue.  Or do you mean something else?

Not relevant anymore, but imagine the following repro steps:
 1. start the test program, let it create watches for all config files
 2. edit one of the symlinks, change the target to an invalid path
 3. "changed" signal is emitted, all watches are recreated against fresh config list. The invalid config file is excluded by FcConfigGetConfigFiles(). 
 4. further changes to previously modified symlink goes undetected since it's no longer monitored

All I wanted to say last time is that I was unable to properly reproduce your "some files go detected, some not" behaviour, this might be the reason. However this scenario should be covered by the patch.
Comment 13 Matthias Clasen 2008-08-12 14:49:38 UTC
We really need testcases here, to verify how different the behaviour of the various monitor implementations is.
Comment 14 Christian Neumair 2008-08-12 16:10:56 UTC
Through realpath(), this patch seems to introduce sync. I/O in the main thread, even if it's executed in an idle callback (i.e. idle_cb).

Another issue might be the number of automatic inotify subscriptions:

Assuming a directory hierarchy with a directory
foo/
containing N=100 files.

When we now have M=3 symbolic links
foolink -> foo
barlink -> foo
foobarlink -> foo

showing all of them, and registering a file monitor for each of the files in the symlinked directories creates N*M=300 additional inotify file monitors (M=3 for each destination file in foo/). Depending on the concrete inotify implementation, this may be totally fine, but we have to verify that before automatically enforcing it for all GIO-driven applications.

At least for Nautilus, I think it would be enough to proxy the events directly in Nautilus, since we already keep around a hash table for all known symlinks.
In contrast to Tomas' approach, this allows to quickly look up all affected files using hash tables in memory and also works for remote file systems like SFTP.

However, I do also understand that this is a generic problem, so many applications might profit from a shared implementation. But this automatic monitoring should at least be made optional, for instance by allowing to toggle it with GFileMonitorFlags.
Comment 15 Matthias Clasen 2008-08-13 05:57:10 UTC
> Through realpath(), this patch seems to introduce sync. I/O in the main thread,
> even if it's executed in an idle callback (i.e. idle_cb).

I don't see any way around it, though. Do you ? Except for punting the problem on the users of the API, that is...


> creates N*M=300 additional inotify file monitors

I don't think that is true. Of course, I may have misunderstood Tomas' description of what the patch does. It will make a nice testcase though


> At least for Nautilus, I think it would be enough to proxy the events directly
> in Nautilus, since we already keep around a hash table for all known symlinks.
> In contrast to Tomas' approach, this allows to quickly look up all affected
> files using hash tables in memory and also works for remote file systems like
> SFTP.

Not sure I understand what you are proposing here.
Comment 16 Christian Neumair 2008-08-13 09:32:19 UTC
> > Through realpath(), this patch seems to introduce sync. I/O in the main thread,
> > even if it's executed in an idle callback (i.e. idle_cb).

> I don't see any way around it, though. Do you ? Except for punting the problem
> on the users of the API, that is...

I can only speak for Nautilus, which uses Alex' asynchronous implementation. It uses GFileInfo (async.) and g_file_resolve_relative_path():

The symlink's GFileInfo provides the G_FILE_ATTRIBUTE_STANDARD_SYMLINK_TARGET attribute. It is also queried using readlink() [cf. gio/glocalfileinfo.c:read_link()], but in a worker thread, and the target file is composed I/O-less, using
  g_file_resolve_relative_path (symlink_parent, file->details->symlink_name);
which internally uses g_build_filename() instead of realpath().

> > creates N*M=300 additional inotify file monitors
> I don't think that is true.

Yes you are right, I skimmed to roughly over it. I thought it would call realpath() for all files, and add a monitor if the target file does not match. This is only done for symlinks, though.

> > At least for Nautilus, I think it would be enough to proxy the events directly in Nautilus
> Not sure I understand what you are proposing here.

I'm just pointing out that Nautilus keeps around a hash table containing all resolved symlinks, which could be used to listen for events on the resolved symlinks ourselves (i.e. in Nautilus). It would avoid further I/O, but - as you pointed out - “punt(s) the problem on the users of the API”.

We've received numeruous bug reports about all kinds of issues with synchronous I/O. It seems to be a huge problem with NFS setups when a server is down.
Comment 17 Matthias Clasen 2008-08-14 03:44:38 UTC
> I can only speak for Nautilus, which uses Alex' asynchronous implementation. It
> uses GFileInfo (async.) and g_file_resolve_relative_path():

Ok, thats worth checking out. Tomas ?
Comment 18 Tomas Bzatek 2008-08-14 12:52:21 UTC
(In reply to comment #14)
> Through realpath(), this patch seems to introduce sync. I/O in the main thread,
> even if it's executed in an idle callback (i.e. idle_cb).

That's right, it does additional I/O when not using g_file_test() anymore.
However g_file_resolve_relative_path (..., file->details->symlink_name) does not do recursive symlink expanding, which is crucial for inotify which is sometimes non-detereministic on symlinks. I'm using g_file_resolve_relative_path() anyway as a fallback when realpath() fails, e.g. on invalid symlink targets.

Perhaps we can use GIOScheduler and do some mutex magic to make all I/O asynchronous.


> Assuming a directory hierarchy with a directory
> foo/
> containing N=100 files.
> 
> When we now have M=3 symbolic links
> foolink -> foo
> barlink -> foo
> foobarlink -> foo
The real number of inotify slots used is decreased by the checks for duplicates before inotify watch is added (useful in case more symlinks are pointing to the same destination). Also, we always watch directories and not single files, then we just pick up changes for requested file after inotify callback is received.
This way we simply move expenses to internal lists instead of inotify slots.

So, if there are N=100 files in the same directory all pointing to one symlink, the real number of inotify slots used will be 2 (or even one when "foo" is in the same directory). Every other directory in the testing hierarchy will eat one inotify slot, no matter how many files it contains. However every symlink pointing outside this hierarchy will eat its own separate slot for the directory the target lies in. The only expenses here are records in internal lists and necessity to go through these lists to resolve all watches pointing to the particular inotify watch.

> But this automatic monitoring should at least be made optional, for instance
> by allowing to toggle it with GFileMonitorFlags.
Good idea, this can be done even without breaking the API.

> We've received numeruous bug reports about all kinds of issues with synchronous
> I/O. It seems to be a huge problem with NFS setups when a server is down.
Agree, this can be a real problem.


Comment 19 Matthias Clasen 2008-08-25 20:38:54 UTC
Still waiting for tests here. Tomas ?
Comment 20 Tomas Bzatek 2008-08-28 16:21:26 UTC
Created attachment 117530 [details] [review]
glib-symlink-inotify-support.patch

Updated version of the patch. Fixes three bugs found during the tests.

This version still uses blocking calls to expand symlinks. Async version is under development, it would need more testing and changes to the tests.

Some more info:
- Monitoring files inside of symlinked directory is tricky; since we always monitor parent directory for file monitors, we need to resolve their real path. However to be complete, we would need to monitor each path fragment, which can be symlinks too. This is almost impossible and extremely expensive. The other way would be to check that on watch remove, but that's even more expensive due to excessive I/O.

So we only resolve paths on monitor creation and don't care about other changes (to-be-checked).

The trick here is that kernel inotify probably doesn't distinguish between real directory and symlink to it, while the duplication code in gio does. So this way we would cancel both watches.
Comment 21 Tomas Bzatek 2008-08-28 16:25:44 UTC
Created attachment 117531 [details] [review]
glib-monitoring-tests.patch

Monitoring tests for GIO test suite. Covers basic file and directory operations as well as several symlink scenarios.

Please note these are live tests - creating testing directory where the tests are performed. Tests can be also run independently (check --help).
Comment 22 Behdad Esfahbod 2008-08-28 16:36:10 UTC
Maybe there's some inotify changes to request to make this easier?
Comment 23 John McCutchan 2008-08-28 16:59:54 UTC
Hey All,

inotify already supports the IN_ISDIR flag which will return an error if the inode that it opened is not a directory. A similar IN_ISNOTSYMLINK should be easy to add. This avoids the race. The problem of course is that you can't rely on this behaviour on older kernels. But people should be upgrading and in 2 years the older kernels will be rare. 

Hope that helps,
John
Comment 24 Alexander Larsson 2008-09-01 10:47:16 UTC
The other_file argument is meant to be used when we introduce a "file was moved" event (not currently implemented, but could be done w/ inotify). It would be the destination of the move.

For the current events its not used and we could use it for something else if needed.
Comment 25 Alexander Larsson 2008-09-03 19:53:22 UTC
Fully automatically handling of symlinks is very complicated and expensive. For instance if i'm monitoring /foo/bar/gazonk with the following symlinks

/foo -> /tmp
/tmp/bar -> /usr
/usr/gazonk -> /a/b
/a/b -> /c/d (a regular file)

Then to automatically handle all sort of changes then you really need to monitor all the 4 files (3 symlinks and one real file) for changes.

I don't think its really meaningful to do all this, and a partial implementation would just be a source of bug reports. Instead I think we should just follow what most other file apis does, i.e. have a flag for "follow symlinks", and if not set, we monitor only /usr/gazonk (i.e. the file (a symlink) specified by the filename and the filesystem symlinks) and if it is set we monitor only /c/d (i.e. the target file at monitor construction time).

Most people probably want the follow link behaviour, so the flag should be of the NOFOLLOW_SYMLINK form.
Comment 26 Behdad Esfahbod 2008-09-03 20:05:04 UTC
(In reply to comment #25)
> Fully automatically handling of symlinks is very complicated and expensive. For
> instance if i'm monitoring /foo/bar/gazonk with the following symlinks
> 
> /foo -> /tmp
> /tmp/bar -> /usr
> /usr/gazonk -> /a/b
> /a/b -> /c/d (a regular file)
> 
> Then to automatically handle all sort of changes then you really need to
> monitor all the 4 files (3 symlinks and one real file) for changes.
> 
> I don't think its really meaningful to do all this, and a partial
> implementation would just be a source of bug reports. Instead I think we should
> just follow what most other file apis does, i.e. have a flag for "follow
> symlinks", and if not set, we monitor only /usr/gazonk (i.e. the file (a
> symlink) specified by the filename and the filesystem symlinks) and if it is
> set we monitor only /c/d (i.e. the target file at monitor construction time).
> 
> Most people probably want the follow link behaviour, so the flag should be of
> the NOFOLLOW_SYMLINK form.

Fair enough.  Although in the follow-case, deleting the symlink should still trigger an event, right?

(I personally don't think the cost of monitoring all the necessary dirs is prohibitively expensive.  Most of the time there is no symlinks involved at all)
Comment 27 Matthias Clasen 2013-02-05 03:50:46 UTC
*** Bug 669320 has been marked as a duplicate of this bug. ***
Comment 28 GNOME Infrastructure Team 2018-05-24 11:27:01 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/144.