GNOME Bugzilla – Bug 748452
registry: System registry cache file
Last modified: 2018-11-03 12:27:25 UTC
Right now, the first time a user runs a GStreamer program, startup is slowed down by generating the registry cache file. GStreamer should be able to fall back to a system registry cache file if the user's cache file doesn't exist.

See previous discussion here: http://lists.freedesktop.org/archives/gstreamer-devel/2015-March/052181.html
Created attachment 302335
registry: Don't store hash of environment variables

The expanded environment variables for plugin dependencies are hashed and stored in the registry. The intended purpose is that if the environment variables haven't changed, then scanning of the directories they point to can be skipped. However, this actually isn't the case as the scanning for plugin dependencies happens unconditionally.

Drop usage of this hash as it can only cause additional registry rebuilds to occur.

If the expanded environment variables are different, then this will already cause the dependency scanning to search different locations. If there were no additional dependencies in changed paths, then the registry should not be rebuilt.

The hash field is kept in the registry structure to allow existing registry files to be read correctly, but updates will always simply store a 0 there.
Created attachment 302336
registry: Allow usage of system plugin registry

If the user's registry cache file in ~/.cache doesn't exist, initialize the registry from a system cache file at $prefix/share/gstreamer-$GST_API_VERSION. This allows a system registry file to be maintained, which significantly speeds up first time gstreamer initialization in the common case that only system plugins are being used.

https://github.com/endlessm/eos-shell/issues/4986
Created attachment 302337
registry: Don't store hash of environment variables

The expanded environment variables for plugin dependencies are hashed and stored in the registry. The intended purpose is that if the environment variables haven't changed, then scanning of the directories they point to can be skipped. However, this actually isn't the case as the scanning for plugin dependencies happens unconditionally.

Drop usage of this hash as it can only cause additional registry rebuilds to occur.

If the expanded environment variables are different, then this will already cause the dependency scanning to search different locations. If there were no additional dependencies in changed paths, then the registry should not be rebuilt.

The hash field is kept in the registry structure to allow existing registry files to be read correctly, but updates will always simply store a 0 there.
Created attachment 302338
registry: Allow usage of system plugin registry

If the user's registry cache file in ~/.cache doesn't exist, initialize the registry from a system cache file at $prefix/share/gstreamer-$GST_API_VERSION. This allows a system registry file to be maintained, which significantly speeds up first time gstreamer initialization in the common case that only system plugins are being used.
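The fallback order described in the patch can be sketched roughly as follows. This is only an illustration of the idea, not the code from the attachment: the function names `file_exists` and `pick_registry_file` are hypothetical, and the caller is assumed to have already built both paths.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Returns 1 if something exists at path, 0 otherwise. */
static int
file_exists (const char *path)
{
  struct stat st;

  return stat (path, &st) == 0;
}

/* Hypothetical helper: prefer the user's cache file and only fall back
 * to the system cache file when the user's one is missing. Returns NULL
 * when neither exists, in which case a full plugin scan is needed. */
const char *
pick_registry_file (const char *user_cache, const char *system_cache)
{
  assert (user_cache != NULL && system_cache != NULL);

  if (file_exists (user_cache))
    return user_cache;
  if (file_exists (system_cache))
    return system_cache;
  return NULL;
}
```

Once the registry has been initialized from the system file, it would presumably still be written out to the user's cache location as usual, so the fallback only matters on the first run.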
Comment on attachment 302337
registry: Don't store hash of environment variables

> The intended purpose is that if the
> environment variables haven't changed,
> then scanning of the directories
> they point to can be skipped. However,
> this actually isn't the case as
> the scanning for plugin dependencies
> happens unconditionally.
>
> Drop usage of this hash as it can only
> cause additional registry
> rebuilds to occur.
>
> If the expanded environment variables
> are different, then this will already
> cause the dependency scanning to search
> different locations.

I didn't re-read the code in detail just now, but I'm not sure if the (my) comment in gstplugin.c why we store the environment variable hash is correct.

One purpose of the hash is to trigger rescanning of a plugin when certain environment variables change. Example: we have a wrapper plugin around a library that itself has plugins and the FOO_PLUGINS_PATH env variable determines where that lib should look for additional plugins. Now if someone sets or changes FOO_PLUGINS_PATH that might cause the foo gstreamer wrapper plugin to expose a different set of elements, so this plugin should then be reloaded in this case.
(In reply to Tim-Philipp Müller from comment #5)
>
> I didn't re-read the code in detail just now, but I'm not sure if the (my)
> comment in gstplugin.c why we store the environment variable hash is
> correct. One purpose of the hash is to trigger rescanning of a plugin when
> certain environment variables change. Example: we have a wrapper plugin
> around a library that itself has plugins and the FOO_PLUGINS_PATH env
> variable determines where that lib should look for additional plugins. Now
> if someone sets or changes FOO_PLUGINS_PATH that might cause the foo
> gstreamer wrapper plugin to expose a different set of elements, so this
> plugin should then be reloaded in this case.

Yes, it took me a little while to understand, but I agree that's the purpose. The thing is, you always have to run the dependency scanner regardless because you need to find out if anything in the paths changed. And the dependency scanner has to resolve the environment variables itself (it actually does a much more thorough job than the hashing). So, the environment variable hash seems to me to be completely useless.

Case 1: FOO_PLUGINS_PATH is unchanged from before. You don't know what the actual value of the variable is from the hashed value, so you have to resolve it again anyway. Now you actually have to go look in the path to stat all the directories and files.

Case 2: FOO_PLUGINS_PATH did change from before. You resolve the values and go stat all the directories and files just like case 1.

So, I feel pretty confident that the variable hash is not adding any value. If you drop it, you just immediately jump to resolving the variables and scanning the paths they point to, just like always happens now.
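A toy model of the argument above, not the actual gstplugin.c logic: the struct `DepState` and both function names are hypothetical, and `stat_hash` simply stands in for whatever the dependency scanner computes after resolving the variables and stat()ing the paths. It shows that the extra hash comparison can only add rebuilds, never remove one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-plugin state as cached in the registry. */
typedef struct
{
  uint32_t env_hash;   /* hash of the expanded environment variables */
  uint32_t stat_hash;  /* result of scanning the dependency paths */
} DepState;

/* Staleness check that also compares the environment variable hash. */
bool
stale_with_env_hash (const DepState * cached, const DepState * now)
{
  assert (cached != NULL && now != NULL);
  return cached->env_hash != now->env_hash
      || cached->stat_hash != now->stat_hash;
}

/* Staleness check that relies on the dependency scan alone. */
bool
stale_without_env_hash (const DepState * cached, const DepState * now)
{
  assert (cached != NULL && now != NULL);
  return cached->stat_hash != now->stat_hash;
}
```

In this model the two checks differ only when a variable changed but the scan result did not, which is exactly the situation where the patch argues no rebuild should happen and where the earlier comment argues one might still be wanted.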
For some background, what we're doing is deploying a system with OSTree (https://wiki.gnome.org/Projects/OSTree). This is essentially a read only system, so unless the user has added their own plugins to their home directory, a registry file created and shipped with the OS should be sufficient for all users.

One issue which prevents this is the hashing of the environment variables discussed above. Another issue is the use of st_ctime when hashing plugin dependencies - http://cgit.freedesktop.org/gstreamer/gstreamer/tree/gst/gstplugin.c#n1485. We can control the modification times (st_mtime) so that they stay persistent between composing and deploying the OS, but there's no way to control the ctime.

However, I don't think that the ctime is significantly beneficial here. From stat(2) (http://man7.org/linux/man-pages/man2/stat.2.html):

  The field st_mtime is changed by file modifications, for example, by mknod(2), truncate(2), utime(2), and write(2) (of more than zero bytes). Moreover, st_mtime of a directory is changed by the creation or deletion of files in that directory. The st_mtime field is not changed for changes in owner, group, hard link count, or mode.

  The field st_ctime is changed by writing or by setting inode information (i.e., owner, group, link count, mode, etc.).

Everything you really care about is covered by mtime updates. I suppose the ctime is a convenient way to see if the owner/group/mode changed, but I would argue that hard link count should not affect the registry. It also doesn't capture if the owner/group/mode actually changed from before; it could have been changed and changed back.

Assuming owner/group/mode are important (I'm not sure they are), I suggest using st_uid/st_gid/st_mode directly instead of using st_ctime.

Thoughts?
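As a rough sketch of that suggestion, a per-file hash could mix in st_uid/st_gid/st_mode and leave st_ctime out entirely. This is not the existing gstplugin.c code: the names `mix` and `dep_stat_hash` are hypothetical, the FNV-style mixing is just one convenient choice, and including st_size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>

/* One FNV-1a style step over a whole 64-bit value. */
static uint64_t
mix (uint64_t h, uint64_t v)
{
  return (h ^ v) * UINT64_C (1099511628211);
}

/* Hypothetical replacement hash: depends on mtime, size, owner, group
 * and mode, and deliberately ignores st_ctime. */
uint64_t
dep_stat_hash (const struct stat *st)
{
  uint64_t h = UINT64_C (14695981039346656037);

  assert (st != NULL);

  h = mix (h, (uint64_t) st->st_mtime);
  h = mix (h, (uint64_t) st->st_size);
  h = mix (h, (uint64_t) st->st_uid);
  h = mix (h, (uint64_t) st->st_gid);
  h = mix (h, (uint64_t) st->st_mode);
  return h;
}
```

With a hash like this, two files that differ only in ctime (for example the same tree composed on a build server and deployed on a device) hash identically, while a mode change that is later reverted no longer leaves a trace.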
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gstreamer/issues/110.