GNOME Bugzilla – Bug 728375
GVfs caching
Last modified: 2018-09-21 17:41:08 UTC
Several backends have their own caches. File info caches are implemented e.g. in the archive, ftp, gphoto2, and obexftp backends. These caches store the results of query-info/enumeration jobs and are used for various checks in other operations; they reduce traffic and speed up repeated jobs. However, it would be good to have one gvfs-wide cache: we could simplify some backends and have better control over the results.

File data caches are implemented e.g. in the gphoto2, obexftp, and webdav backends. They are used to emulate standard stream operations: the whole file is cached during open and then read from memory, or the file is written to memory and uploaded on close. A gvfs-wide cache could also simplify some backends if we could emulate those operations using push/pull.

I'm working on caching in gvfs as my master's thesis. Some other reasons and ideas for caching in gvfs are listed here: https://thesis-managementsystem.rhcloud.com/topic/show/161/gvfs-caching-subsystem
There is a proposal for a file info cache and an enumeration cache: https://github.com/ondrejholy/gvfs/compare/gvfsinfocache

It is a simple cache (similar to those in some backends). Each backend has its own cache instance, so we can set different policies for different backends. We only have to enable caching when initializing the backend, and that's all. The cache is integrated inside the job objects. It uses a hash table to map an absolute file path to a GFileInfo (info cache) or a GList of GFileInfos (enumeration cache). It supports invalidation by time, and its size can be limited; LRU is used to find a victim when the cache is full. We don't need to call the backend's methods if we have valid cached data. Cached infos are invalidated by writing operations, and the cache is also disabled during those operations. Time invalidation should be configured carefully: read-only backends or backends with exclusive access (e.g. obexftp, gphoto2) don't necessarily need time invalidation, but shared backends should be strictly limited by time.
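To make the proposed policy concrete, here is a minimal sketch (in Python, not the actual GVfs C code) of a per-backend info cache combining a path-keyed table, time-based invalidation, LRU eviction, and explicit invalidation by writing operations. All names (`InfoCache`, `lookup`, `insert`, `invalidate`) are illustrative assumptions, not the API from the branch above:

```python
import time
from collections import OrderedDict

class InfoCache:
    """Sketch of a per-backend file-info cache: maps absolute paths
    to info objects, with optional TTL invalidation and LRU eviction.
    Hypothetical names; not the real GVfs implementation."""

    def __init__(self, max_size=100, ttl=None):
        self.max_size = max_size
        self.ttl = ttl                 # None disables time invalidation
                                       # (e.g. for read-only/exclusive backends)
        self._entries = OrderedDict()  # path -> (info, timestamp)

    def lookup(self, path):
        entry = self._entries.get(path)
        if entry is None:
            return None
        info, stamp = entry
        if self.ttl is not None and time.monotonic() - stamp > self.ttl:
            del self._entries[path]        # entry expired
            return None
        self._entries.move_to_end(path)    # mark as recently used
        return info

    def insert(self, path, info):
        self._entries[path] = (info, time.monotonic())
        self._entries.move_to_end(path)
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the LRU victim

    def invalidate(self, path):
        # Called by writing operations (set attribute, delete, move, ...)
        self._entries.pop(path, None)
```

A backend job would call `lookup()` before contacting the remote side, `insert()` after a successful query-info, and `invalidate()` from any writing operation. An enumeration cache would follow the same shape with a list of infos as the value.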
There is a proposal for a file content cache: https://github.com/ondrejholy/gvfs/compare/gvfsfilecache

It is also a simple cache (again similar to those in some backends). It emulates standard read operations using pull and is integrated inside the job objects. The content is downloaded using pull in the open job; a stream is opened on the temp file and the temp file is unlinked. Subsequent read and seek requests are served from the opened stream, which is closed (and the data removed) in the close job. It has the same disadvantages as the current implementations of these caches in backends: the open job can take a long time without progress notification, and it can consume a lot of space, though not necessarily in memory.
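The open/read/seek/close flow described above can be sketched as follows (again in Python for brevity; `pull_func` and `CachedReadStream` are hypothetical stand-ins for the backend's pull operation and the job-side stream object). Note the unlink-after-open trick, which is POSIX-specific: the open file descriptor keeps the data alive, so the temp file is cleaned up automatically once the stream closes:

```python
import os
import tempfile

class CachedReadStream:
    """Sketch of emulating read/seek on a backend that only supports
    whole-file pull (download). Hypothetical names, not the GVfs API."""

    def __init__(self, pull_func, remote_path):
        fd, tmp_path = tempfile.mkstemp()
        os.close(fd)
        # Open job: download the whole file first. This is the step
        # that can take a long time without progress notification.
        pull_func(remote_path, tmp_path)
        self._fp = open(tmp_path, "rb")
        # Unlink now (POSIX): the open fd keeps the data readable,
        # and the space is reclaimed when the stream is closed.
        os.unlink(tmp_path)

    def read(self, count):
        # Read requests are served from the local copy, not the network.
        return self._fp.read(count)

    def seek(self, offset, whence=os.SEEK_SET):
        # Seek works even if the backend itself has no seek support.
        return self._fp.seek(offset, whence)

    def close(self):
        # Close job: releasing the fd frees the cached content.
        self._fp.close()
```

The write path would be the mirror image: buffer writes into a temp file and push the whole file to the backend in the close job.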
Nice thesis topic :-) I haven't yet had time to look through everything, but some brief thoughts that I've had on the topic for a while:
- Some backends are expected to be local and don't need/benefit from caching, e.g. smb. In this situation, having persistent per-mount or per-backend options (even if they're not exposed in the GUI) would be useful.
- Caching annoys people unexpectedly when they change the file outside of gvfs, unless the invalidation method is smart.
- For it to work smoothly, you need to be smart about invalidation (whether it's timeouts or checking mtime, etc).
Anyway, nice to see you working on this. I think it could solve quite a few problems while reducing the complexity of some of the backends.
*** Bug 556749 has been marked as a duplicate of this bug. ***
*** Bug 727852 has been marked as a duplicate of this bug. ***
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gvfs/issues/230.