GNOME Bugzilla – Bug 312910
support running builds in parallel
Last modified: 2021-05-17 15:48:53 UTC
So, given that SMP machines are getting bigger, and we've now had two offers to do tinderbox hosting on 8-way boxes, it would be sweet if jhbuild could parallelize and launch builds for multiple modules at once. For example, there are several modules that formally depend only on gtk (including the beast known as mozilla); it would be nice if they all (or up to -j # of them) launched their builds when gtk finishes building. Ditto for all of gtk's deps that don't formally have gtk deps, etc. [Am I making sense?]
This is somewhat related to bug 133567.

The easiest way to do this would be by refactoring the system to use threads and async queues:

* 1 manager thread, a number of worker threads
* an async queue of "packages to build", one for "built packages"

The manager thread would do the following:

1. from the set of all packages to build, push each package with no dependencies onto the "packages to build" async queue.
2. pop an item off the "built packages" queue and add it to the "built packages" set. Any packages to be built that now have their dependencies met are pushed onto the "packages to build" async queue.
3. go to step 2.

The worker threads would do the following:

1. pop a package off the "packages to build" async queue.
2. build the package
3. if the package build succeeded, push it to the "built packages" async queue
4. go to step 1.

There are a few things that would need to be worked out, including:

* make everything terminate when everything has been built
* handle packages that fail to build
* not obvious how to handle output in the normal "jhbuild build" interactive mode.

To handle async download/compile as requested in bug 133567, another thread would be introduced, which would go through the full package list, doing the downloads/updates, and then feeding packages to the manager thread.
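The manager/worker scheme above could be sketched roughly as follows. This is only an illustration, not jhbuild code: `build_all`, the `deps` mapping and the `build` callback are hypothetical names. It uses a single status queue that reports failures as well as successes, and a `None` sentinel to stop the workers, as one possible answer to the termination and failure questions.

```python
import queue
import threading


def build_all(deps, build, num_workers=4):
    """Build every package in `deps`, running independent builds in parallel.

    deps:  dict mapping package name -> set of package names it depends on
    build: callable(package) -> bool, True on success (hypothetical hook)
    Returns (built, failed, skipped) as sets of package names.
    """
    to_build = queue.Queue()   # the "packages to build" async queue
    status = queue.Queue()     # build results as (package, ok) tuples

    def worker():
        while True:
            pkg = to_build.get()
            if pkg is None:            # sentinel: no more work
                return
            try:
                ok = bool(build(pkg))
            except Exception:
                ok = False
            status.put((pkg, ok))      # report failures too

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in workers:
        t.start()

    built, failed, queued = set(), set(), set()

    def schedule_ready():
        """Queue packages whose dependencies are all built; return the count."""
        ready = [p for p, d in deps.items() if p not in queued and d <= built]
        for pkg in ready:
            queued.add(pkg)
            to_build.put(pkg)
        return len(ready)

    in_flight = schedule_ready()
    while in_flight:                   # stops once nothing is queued or running
        pkg, ok = status.get()
        in_flight -= 1
        if ok:
            built.add(pkg)
            in_flight += schedule_ready()
        else:
            failed.add(pkg)

    for _ in workers:                  # one sentinel per worker
        to_build.put(None)
    for t in workers:
        t.join()
    skipped = set(deps) - built - failed   # never started: a dependency failed
    return built, failed, skipped
```

In this sketch a package whose dependency failed is never queued, so it ends up in `skipped` rather than blocking the manager loop.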
Some further thoughts on how to handle parallel builds in the normal interactive "jhbuild build" mode:

1. Run all build commands such that their output is piped through jhbuild (already done by tinderbox, and for CVS checkouts).
2. Require worker threads to hold a mutex when printing the output to the screen.
3. Hold the mutex when showing the "error occurred" menu so other worker threads block.

This would result in mixed output, but shouldn't be much more confusing than parallel make.

As for the index page in tinderbox mode, the manager thread could be responsible for updating it when it reads a package from the "built packages" async queue (which should probably be a "build status" async queue instead).
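A minimal sketch of points 1 and 2, assuming each worker thread runs its command through a helper like the hypothetical `run_piped` below (this is not jhbuild's actual command runner). Output is read line by line and written while holding a shared lock, so lines from different modules interleave but are never torn.

```python
import subprocess
import sys
import threading

# Shared by all worker threads; an interactive error menu could hold the
# same lock so that other workers block while it is displayed.
output_lock = threading.Lock()


def run_piped(module, argv, out=sys.stdout):
    """Run argv, tagging each output line with the module name."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    for line in proc.stdout:
        with output_lock:              # one whole line at a time
            out.write('[%s] %s' % (module, line))
    return proc.wait()
```

Prefixing each line with the module name is an addition of this sketch; it makes mixed output easier to follow than bare interleaving.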
Notes added here: http://live.gnome.org/JhbuildParallelBuild
I've been updating the jhbuild code to get rid of dependence on the chdir() call, since it can cause problems with multiple threads running at once. This should make it easier to implement some kind of concurrent build system like this. There is still a lot of work to do before something like this would be possible though.
While jhbuild supporting this natively by default would be awesome, we have a lot to do to go from here to there.

However, people are currently "manually" running multiple instances of jhbuild in parallel, but this will currently corrupt the packagedb.xml file, which in turn will break 'jhbuild uninstall' and unlinking stale files on upgrade (bug 654872).

We should at least:

1) Lock the packagedb.xml file
2) Check the timestamp on it after locking, and if it's newer than what we have in memory, reread it
*** Bug 655114 has been marked as a duplicate of this bug. ***
Created attachment 192962 [details] [review] packagedb: Make "entries" into private member, add public get() method This will make it easier to change the internals later.
Created attachment 192963 [details] [review] packagedb: Lazily load cache Only load the packagedb the first time someone calls get(). Refactor the internals so that check() and installdate() both call get() internally to avoid duplication. This is a speed optimization (we currently instantiate the packagedb even in cases where we're not going to read from it), as well as preparatory work for locking.
Created attachment 192964 [details] [review] symlinklock.py: New file This provides a lock file based on a symbolic link. The approach is stolen from Emacs. The basic idea is that built-in Unix locking is often broken, and the symlink approach lets us easily parse out things like *who* is holding the lock.
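For readers unfamiliar with the technique, a symlink-based lock can be sketched as below. This is an illustration of the general idea, not the contents of the attached symlinklock.py: the class name, the `host.pid` target format and the method names are assumptions. It relies on os.symlink() failing atomically if the link already exists, and on os.readlink() to see who holds the lock.

```python
import os
import socket


class SymlinkLock:
    """Lock file implemented as a dangling symlink naming its holder."""

    def __init__(self, path):
        self.path = path
        # The link target records who holds the lock (format is an assumption).
        self.token = '%s.%d' % (socket.gethostname(), os.getpid())

    def try_lock(self):
        """Take the lock if it is free; return True on success."""
        try:
            os.symlink(self.token, self.path)   # atomic: fails if path exists
            return True
        except FileExistsError:
            return False

    def holder(self):
        """Return the holder string, or None if the lock is free."""
        try:
            return os.readlink(self.path)
        except FileNotFoundError:
            return None

    def unlock(self):
        """Release the lock, but only if this object's token holds it."""
        if self.holder() == self.token:
            os.unlink(self.path)
```

A real implementation would also need to deal with stale locks left by crashed processes, which is where parsing the holder out of the link target becomes useful.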
Created attachment 192965 [details] [review] packagedb: Use a lock file around modifications, reread after external changes To somewhat support concurrent operation like "jhbuild buildone foo", "jhbuild buildone bar" in separate terminals for speed, we need to at least avoid corrupting the packagedb.xml file. Note this does NOT provide safety against building/installing the same module twice.
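The "reread after external changes" part can be illustrated with a small cache that compares the file's stat information against what it saw at the last read. This is a sketch under assumptions: `RereadingCache` is a hypothetical class, it stores raw text rather than parsed XML, and in the real design the check would happen while holding the lock. A missing file is treated as a valid empty state.

```python
import os


class RereadingCache:
    """Cache a file's contents and reread them when the file changes on disk."""

    def __init__(self, path):
        self.path = path
        self._loaded = False
        self._stamp = None     # (mtime_ns, size) at last read; None if missing
        self._data = ''

    def _current_stamp(self):
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return None        # no file yet: valid, empty state
        return (st.st_mtime_ns, st.st_size)

    def get(self):
        """Return the file contents, rereading if another process changed it."""
        stamp = self._current_stamp()
        if not self._loaded or stamp != self._stamp:
            if stamp is None:
                self._data = ''
            else:
                with open(self.path) as f:
                    self._data = f.read()
            self._stamp = stamp
            self._loaded = True
        return self._data
```

Comparing the size as well as the modification time is a choice made for this sketch; it catches rewrites that land within the timestamp granularity of the filesystem, as long as the length changes.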
Thank you for the patches Colin. I'm currently testing them.
Comment on attachment 192965 [details] [review] packagedb: Use a lock file around modifications, reread after external changes

I just updated bootstrap.modules and changed prefix in ~/.jhbuildrc.

# jhbuild build
Traceback (most recent call last):
    jhbuild.main.main(sys.argv[1:])
    rc = jhbuild.commands.run(command, config, args, help=lambda: print_help(parser))
    return cmd.execute(config, args, help)
    return self.run(config, options, args, help)
    check_bootstrap_updateness(config)
    for module in module_set.modules.values()])
    entry = self.get(package)
    self._ensure_cache()
    assert self._entries_stat is not None
AssertionError
Created attachment 197813 [details] [review] packagedb: Make "entries" into private member, add public get() method Rebased to master.
Created attachment 197814 [details] [review] packagedb: Lazily load cache Reattached for ordering.
Created attachment 197815 [details] [review] symlinklock.py: New file Reattached for ordering.
Created attachment 197816 [details] [review] packagedb: Use a lock file around modifications, reread after external changes Fixed exception thrown when no packagedb exists pointed out by Craig
Craig, can you try now? Frederic, this is the first patch I'd like to get in now that the Debian sysdeps stuff landed.
In bug 655417 comment 5, you wrote:
> I *strongly* prefer writing data files like this atomically. That means create
> a temporary file, write it out, call os.fdatasync(fd.fileno()) on it.
> fd.close(), then os.rename() the temporary file over the real one.

I was reminded of this by the comment pointing out that fdatasync() is not available on OS X, and noted there was no other caller. Wouldn't it be appropriate in places such as this one? (Or is fsync() OK, and the other place could use it as well?)

@@ -169,6 +189,8 @@ class PackageDB:
 os.fsync(tmp_dbfile.fileno())
 tmp_dbfile.close()
 os.rename(tmp_dbfile_path, self.dbfile)
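The atomic-write pattern quoted above can be sketched as follows. This is an illustration rather than the packagedb code: `write_atomically` is a hypothetical helper, it uses os.fsync() because that is available wherever Python runs, and it uses os.replace() in place of os.rename() (on POSIX both overwrite the destination atomically).

```python
import os
import tempfile


def write_atomically(path, data):
    """Write text to path so readers see either the old or the new file."""
    dirname = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the destination.
    fd, tmp_path = tempfile.mkstemp(dir=dirname, prefix='.tmp-')
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.write(data)
            tmp.flush()                 # push Python's buffer to the OS
            os.fsync(tmp.fileno())      # ask the OS to put it on disk
        os.replace(tmp_path, path)      # atomic rename over the real file
    except BaseException:
        try:
            os.unlink(tmp_path)         # do not leave temp files behind
        except FileNotFoundError:
            pass
        raise
```

Whether fdatasync() would be measurably cheaper than fsync() for a file this small is not something this sketch tries to answer.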
Review of attachment 197816 [details] [review]:

::: jhbuild/utils/packagedb.py
@@ +25,3 @@
 import xml.dom.minidom as DOM

+from . import symlinklock

Please use "import symlinklock", and move it after the system imports.

@@ +236,3 @@
+        finally:
+            self._lock.unlock()
+

Wouldn't a @lock decorator be nicer, instead of duplicating those methods to wrap their code with lock/unlock?
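For reference, a decorator of the kind suggested here might look like the sketch below. The names `locked`, `RecordingLock` and `Store` are hypothetical and only there to make the example self-contained; the only thing taken from the patch is that the object has a `_lock` attribute with lock() and unlock() methods.

```python
import functools


def locked(method):
    """Run the wrapped method while holding self._lock."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self._lock.lock()
        try:
            return method(self, *args, **kwargs)
        finally:
            self._lock.unlock()        # released even if the method raises
    return wrapper


class RecordingLock:
    """Stand-in lock that records calls (for illustration only)."""

    def __init__(self):
        self.events = []

    def lock(self):
        self.events.append('lock')

    def unlock(self):
        self.events.append('unlock')


class Store:
    """Toy class with two modifying methods, both guarded by the lock."""

    def __init__(self):
        self._lock = RecordingLock()
        self.items = {}

    @locked
    def add(self, key, value):
        self.items[key] = value

    @locked
    def remove(self, key):
        del self.items[key]
```

With only two call sites the saving is small, but the try/finally lives in one place instead of being repeated in every wrapped method.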
Review of attachment 197815 [details] [review]:

::: jhbuild/utils/symlinklock.py
@@ +40,3 @@
+    def _existing_process_matches(self, pid, uid):
+        if os.uname()[0] != 'Linux':
+            return os.path.exists('/proc/%d' % (pid, ))

Maybe it would work on BSD, but it will fail on Windows.

But then most probably the whole symlink locking approach won't work on Windows…
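As an aside on portability, a check that does not depend on /proc can be sketched with os.kill() and signal 0, which on POSIX tests for a process without sending anything. This is a different technique from the one in the patch, and `process_exists` is a hypothetical name. It deliberately avoids os.kill() on non-POSIX platforms, where that call would terminate the target process instead of probing it.

```python
import os


def process_exists(pid):
    """Best-effort check whether a process with this pid is running."""
    if os.name != 'posix':
        # No safe probe here; assume the lock holder is alive (conservative).
        return True
    try:
        os.kill(pid, 0)                # signal 0: existence check only
    except ProcessLookupError:
        return False                   # no such process
    except PermissionError:
        return True                    # exists, but owned by another user
    return True
```

Unlike the method in the patch, this sketch does not compare the owning uid, and pid reuse can still produce false positives.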
Review of attachment 197814 [details] [review]:

::: jhbuild/utils/packagedb.py
@@ +196,3 @@
     def get(self, package):
         '''Return entry if package is installed, otherwise return None.'''
+        self._ensure_cache()

What about an @ensure_cache decorator?
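A sketch of what such an @ensure_cache decorator could look like, combined with the lazy loading described in the patch. `LazyDB` and its `loader` argument are hypothetical stand-ins for the real PackageDB; only the `_ensure_cache()` and `get()` names come from the diff.

```python
import functools


def ensure_cache(method):
    """Make sure the cache is loaded before the wrapped method runs."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self._ensure_cache()
        return method(self, *args, **kwargs)
    return wrapper


class LazyDB:
    """Toy database that loads its entries on first access only."""

    def __init__(self, loader):
        self._loader = loader          # callable returning a dict (assumption)
        self._entries = None
        self.loads = 0                 # counts real loads, for illustration

    def _ensure_cache(self):
        if self._entries is None:
            self._entries = self._loader()
            self.loads += 1

    @ensure_cache
    def get(self, package):
        '''Return entry if package is installed, otherwise return None.'''
        return self._entries.get(package)

    def check(self, package):
        # Goes through get(), so it needs no decorator of its own.
        return self.get(package) is not None
```

Constructing the object costs nothing until the first lookup, which is the speed optimization the patch describes.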
Review of attachment 197813 [details] [review]: Ok.
(In reply to comment #19)
> Review of attachment 197816 [details] [review]:
>
> ::: jhbuild/utils/packagedb.py
> @@ +25,3 @@
> import xml.dom.minidom as DOM
>
> +from . import symlinklock
>
> Please use "import symlinklock", and move it after the system imports.

The rationale for the . syntax is that it ensures we get the local one - if Python ever happens to ship a "symlinklock" jhbuild will blow up (and actually I just renamed it to lockfile.py).

> Wouldn't a @lock decorator be nicer, instead of duplicating those methods to
> wrap their code with lock/unlock?

I guess...there are no other uses of decorators in the jhbuild codebase; this would be the first. There would only be two uses of it, so I'm not finding the idea compelling. If you say "Yes use a decorator" I'll do it though.
(In reply to comment #20)
> Review of attachment 197815 [details] [review]:
>
> ::: jhbuild/utils/symlinklock.py
> @@ +40,3 @@
> + def _existing_process_matches(self, pid, uid):
> + if os.uname()[0] != 'Linux':
> + return os.path.exists('/proc/%d' % (pid, ))
>
> Maybe it would work on BSD, but it will fail on Windows.
>
> But then most probably the whole symlink locking approach won't work on
> Windows…

I've refactored the code such that we use a dummy implementation on Windows; this at least avoids regressing. Python doesn't seem to have any wrappers for Windows locking stuff, just a wrapper for fcntl. I think I'd like to leave the implementation to someone actually using jhbuild on Windows who cares.
Committed with changes to use a decorator.

Attachment 197813 [details] pushed as 8fb9e60 - packagedb: Make "entries" into private member, add public get() method
Attachment 197814 [details] pushed as 8baaac9 - packagedb: Lazily load cache
Attachment 197816 [details] pushed as 08c2f07 - packagedb: Use a lock file around modifications, reread after external changes
Thank you for your work on this Colin. Mark bug resolved fixed?
Created attachment 204260 [details] [review] Parallel jhbuilder

Hi,

I've been working on something similar to what is discussed here. I use twisted to do the threading, and semaphores for synchronising the messages. I also use a semaphore (execSemaphore) to limit the number of executing processes instead of queues. This allows me to use the unmodified (blocking) build logic. It may not be the most elegant approach, but it allowed me to ignore the actual build logic. It consumes some threads but most are idle, so that is OK (for me). It works (more or less).

I hope the patch is OK (I'm not used to git).

Wouter
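The semaphore idea can be illustrated without twisted, using only the standard library's threading module (a deliberate substitution for this sketch; the attached patch is not reproduced here). Every job gets its own thread running unmodified blocking code, and a semaphore caps how many run at once. `run_limited` is a hypothetical name.

```python
import threading


def run_limited(jobs, max_running):
    """Run each zero-argument callable in its own thread, with at most
    max_running of them executing at the same time."""
    exec_semaphore = threading.BoundedSemaphore(max_running)

    def guarded(job):
        with exec_semaphore:           # waits while max_running jobs are active
            job()                      # blocking build logic, unchanged

    threads = [threading.Thread(target=guarded, args=(job,)) for job in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

As the comment above notes, most of the threads are idle at any moment; the cost is one waiting thread per pending job.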
Review of attachment 204260 [details] [review]: I like the idea of using twisted, but twisted can't be a hard dependency of JHBuild. It should work like: if you have twisted installed, you get parallel builds. If you don't have twisted installed, you get serial builds (like JHBuild is now).
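The optional-dependency behaviour asked for here is commonly done with a guarded import. A minimal sketch, in which `HAVE_TWISTED` and `effective_jobs` are hypothetical names and not part of JHBuild:

```python
try:
    import twisted  # noqa: F401  (imported only to test availability)
    HAVE_TWISTED = True
except ImportError:
    HAVE_TWISTED = False


def effective_jobs(requested_jobs):
    """Return how many modules to build at once.

    Falls back to 1 (serial builds) when twisted is not installed.
    """
    if HAVE_TWISTED and requested_jobs > 1:
        return requested_jobs
    return 1
```

The rest of the build code would then branch on the returned value, keeping the serial path free of any twisted imports.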
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/jhbuild/-/issues/92.