After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 312910 - support running builds in parallel
support running builds in parallel
Status: RESOLVED OBSOLETE
Product: jhbuild
Classification: Infrastructure
Component: general
unspecified
Other Linux
: Normal enhancement
: ---
Assigned To: James Henstridge
Jhbuild QA
: 655114 (view as bug list)
Depends on:
Blocks: 654872
 
 
Reported: 2005-08-08 16:41 UTC by Luis Villa
Modified: 2021-05-17 15:48 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
packagedb: Make "entries" into private member, add public get() method (4.98 KB, patch)
2011-07-31 21:39 UTC, Colin Walters
none Details | Review
packagedb: Lazily load cache (2.84 KB, patch)
2011-07-31 21:39 UTC, Colin Walters
none Details | Review
symlinklock.py: New file (5.91 KB, patch)
2011-07-31 21:39 UTC, Colin Walters
none Details | Review
packagedb: Use a lock file around modifications, reread after external changes (4.57 KB, patch)
2011-07-31 21:39 UTC, Colin Walters
needs-work Details | Review
packagedb: Make "entries" into private member, add public get() method (5.00 KB, patch)
2011-09-29 20:07 UTC, Colin Walters
committed Details | Review
packagedb: Lazily load cache (2.84 KB, patch)
2011-09-29 20:07 UTC, Colin Walters
committed Details | Review
symlinklock.py: New file (5.91 KB, patch)
2011-09-29 20:07 UTC, Colin Walters
committed Details | Review
packagedb: Use a lock file around modifications, reread after external changes (4.84 KB, patch)
2011-09-29 20:07 UTC, Colin Walters
committed Details | Review
Parallel jhbuilder (20.83 KB, patch)
2011-12-27 12:02 UTC, Wouter De Borger
needs-work Details | Review

Description Luis Villa 2005-08-08 16:41:48 UTC
So, given that SMP is getting bigger, and we've actually now had two offers to
do tinderbox hosting on 8-way boxes, it would be sweet if jhbuild could
parallelize, and launch builds for multiple modules at once. i.e., there are
several modules that formally depend only on gtk (including the beast known as
mozilla); it would be nice if they would all (or up to -j # of them) would
launch their builds when gtk finishes building. Ditto for all of gtk's deps that
don't formally have gtk deps, etc. 

[Am I making sense?]
Comment 1 James Henstridge 2005-08-09 06:48:24 UTC
This is somewhat related to bug 133567.

The easiest way to do this would be by refactoring the system to use threads and
async queues:

 * 1 manager thread, a number of worker threads
 * an async queue of "packages to build", one for "built packages"

The manager thread would do the following:
 1. from the set of all packages to build, push each package with no
    dependencies onto the "packages to build" async queue.
 2. pop an item off the "built packages" queue and add it to the "built
    packages" set.  Any packages to be built that now have their dependencies
    met are pushed onto the "packages to build" async queue.
 3. go to step 2.

The worker threads would do the following:
 1. pop a package off the "packages to build" async queue.
 2. build the package
 3. if the package build succeeded, push it to the "built packages" async queue
 4. go to step 1.

There are a few things that would need to be worked out, including:
 * make everything terminate when everything has been built
 * handle packages that fail to build
 * not obvious how to handle output in the normal "jhbuild build" interactive
   mode.

To handle async download/compile as requested in bug 133567, another thread
would be introduced, which would go through the full package list, doing the
downloads/updates, and then feeding packages to the manager thread.
Comment 2 James Henstridge 2005-08-10 02:27:26 UTC
Some further thoughts on how to handle parallel builds in the normal interactive
"jhbuild build" mode:

 1. Run all build commands such that their output is piped through jhbuild
    (already done by tinderbox, and for CVS checkouts).
 2. Require worker threads to hold a mutex when printing the output to the
    screen.
 3. Hold the mutex when showing the "error occurred" menu so other worker
    threads block.

This would result in mixed output, but shouldn't be much more confusing than
parallel make.

Handling the index page for the tinderbox mode, the manager thread could be
responsible for this when it reads a package from the "built packages" async
queue (which should probably be a "build status" async queue instead).
Comment 3 James Henstridge 2005-08-10 06:46:46 UTC
Notes added here:
  http://live.gnome.org/JhbuildParallelBuild
Comment 4 James Henstridge 2006-05-18 02:14:48 UTC
I've been updating the jhbuild code to get rid of dependence on the chdir() call, since it can cause problems with multiple threads running at once.

This should make it easier to implement some kind of concurrent build system like this.  There is still a lot of work to do before something like this would be possible though.
Comment 5 Colin Walters 2011-07-25 15:47:35 UTC
While jhbuild supporting this natively by default would be awesome, we have a lot to do to go from here to there.

However, people are currently "manually" running multiple instances of jhbuild in parallel, but this will currently corrupt the packagedb.xml file, which in turn will break 'jhbuild uninstall' and unlinking stale files on upgrade ( bug 654872 ).

We should at least

1) lock the packagedb.xml file
2) Check the timestamp on it after locking, and if it's newer than what we have in memory, reread it
Comment 6 Colin Walters 2011-07-30 19:25:34 UTC
*** Bug 655114 has been marked as a duplicate of this bug. ***
Comment 7 Colin Walters 2011-07-31 21:39:37 UTC
Created attachment 192962 [details] [review]
packagedb: Make "entries" into private member, add public get() method

This will make it easier to change the internals later.
Comment 8 Colin Walters 2011-07-31 21:39:42 UTC
Created attachment 192963 [details] [review]
packagedb: Lazily load cache

Only load the packagedb the first time someone calls get().  Refactor
the internals so that check() and installdate() both call get()
internally to avoid duplication.

This is a speed optimization (we instatiate the packagedb in cases
even if we're not going to read from it), as well as preparatory work
for locking.
Comment 9 Colin Walters 2011-07-31 21:39:45 UTC
Created attachment 192964 [details] [review]
symlinklock.py: New file

This provides a lock file based on a symbolic link.  The approach is
stolen from Emacs.  The basic idea is that built-in Unix locking is
often broken, and the symlink approach lets us easily parse out things
like *who* is holding the lock.
Comment 10 Colin Walters 2011-07-31 21:39:48 UTC
Created attachment 192965 [details] [review]
packagedb: Use a lock file around modifications, reread after external changes

To somewhat support concurrent operation like "jhbuild buildone foo",
"jhbuild buildone bar" in separate terminals for speed, we need to at
least avoid corrupting the packagedb.xml file.

Note this does NOT provide safety against building/installing the same
module twice.
Comment 11 Craig Keogh 2011-08-13 11:28:59 UTC
Thank you for the patches Colin. I'm currently testing them.
Comment 12 Craig Keogh 2011-08-18 11:41:21 UTC
Comment on attachment 192965 [details] [review]
packagedb: Use a lock file around modifications, reread after external changes

I just updated bootstrap.modules and changed prefix in ~/.jhbuildrc.

# jhbuild build
Traceback (most recent call last):
  • File "/home/oxyde/.local/bin/jhbuild", line 31 in <module>
    jhbuild.main.main(sys.argv[1:])
  • File "/home/oxyde/gnome/jhbuild/jhbuild/main.py", line 153 in main
    rc = jhbuild.commands.run(command, config, args, help=lambda: print_help(parser))
  • File "/home/oxyde/gnome/jhbuild/jhbuild/commands/__init__.py", line 123 in run
    return cmd.execute(config, args, help)
  • File "/home/oxyde/gnome/jhbuild/jhbuild/commands/__init__.py", line 52 in execute
    return self.run(config, options, args, help)
  • File "/home/oxyde/gnome/jhbuild/jhbuild/commands/base.py", line 272 in run
    check_bootstrap_updateness(config)
  • File "/home/oxyde/gnome/jhbuild/jhbuild/commands/base.py", line 167 in check_bootstrap_updateness
    for module in module_set.modules.values()])
  • File "/home/oxyde/gnome/jhbuild/jhbuild/utils/packagedb.py", line 244 in installdate
    entry = self.get(package)
  • File "/home/oxyde/gnome/jhbuild/jhbuild/utils/packagedb.py", line 214 in get
    self._ensure_cache()
  • File "/home/oxyde/gnome/jhbuild/jhbuild/utils/packagedb.py", line 133 in _ensure_cache
    assert self._entries_stat is not None AssertionError

Comment 13 Colin Walters 2011-09-29 20:07:02 UTC
Created attachment 197813 [details] [review]
packagedb: Make "entries" into private member, add public get() method

Rebased to master.
Comment 14 Colin Walters 2011-09-29 20:07:20 UTC
Created attachment 197814 [details] [review]
packagedb: Lazily load cache

Reattached for ordering.
Comment 15 Colin Walters 2011-09-29 20:07:29 UTC
Created attachment 197815 [details] [review]
symlinklock.py: New file

Reattached for ordering.
Comment 16 Colin Walters 2011-09-29 20:07:53 UTC
Created attachment 197816 [details] [review]
packagedb: Use a lock file around modifications, reread after external changes

Fixed exception thrown when no packagedb exists pointed out by Craig
Comment 17 Colin Walters 2011-09-29 20:09:03 UTC
Craig, can you try now?

Frederic, this is the first patch I'd like to get in now that the Debian sysdeps stuff landed.
Comment 18 Frederic Peters 2011-09-29 21:57:36 UTC
In bug 655417 comment 5, you wrote:

> I *strongly* prefer writing data files like this atomically.  That means create
> a temporary file, write it out, call os.fdatasync(fd.fileno()) on it.
> fd.close(), then os.rename() the temporary file over the real one.

I was reminded of this by the comment pointing fdatasync() is not available on os x, and noted there was no other caller, wouldn't it be appropriate in such places: (or is fsync() ok and the other place could use it as well?)

@@ -169,6 +189,8 @@ class PackageDB:
         os.fsync(tmp_dbfile.fileno())
         tmp_dbfile.close()
         os.rename(tmp_dbfile_path, self.dbfile)
Comment 19 Frederic Peters 2011-09-29 22:07:21 UTC
Review of attachment 197816 [details] [review]:

::: jhbuild/utils/packagedb.py
@@ +25,3 @@
 import xml.dom.minidom as DOM
 
+from . import symlinklock

Please use "import symlinklock", and move it after the system imports.

@@ +236,3 @@
+        finally:
+            self._lock.unlock()
+

Wouldn't a @lock decorator be nicer, instead of duplicating those methods to wrap their code with lock/unlock?
Comment 20 Frederic Peters 2011-09-29 22:07:29 UTC
Review of attachment 197815 [details] [review]:

::: jhbuild/utils/symlinklock.py
@@ +40,3 @@
+    def _existing_process_matches(self, pid, uid):
+        if os.uname()[0] != 'Linux':
+            return os.path.exists('/proc/%d' % (pid, ))

Maybe it would work on BSD, but it will fail on Windows.

But then most probably the whole symlink locking approach won't work on Windows…
Comment 21 Frederic Peters 2011-09-29 22:08:20 UTC
Review of attachment 197814 [details] [review]:

::: jhbuild/utils/packagedb.py
@@ +196,3 @@
     def get(self, package):
         '''Return entry if package is installed, otherwise return None.'''
+        self._ensure_cache()

What about an @ensure_cache decorator?
Comment 22 Frederic Peters 2011-09-29 22:11:55 UTC
Review of attachment 197813 [details] [review]:

Ok.
Comment 23 Colin Walters 2011-09-29 23:13:31 UTC
(In reply to comment #19)
> Review of attachment 197816 [details] [review]:
> 
> ::: jhbuild/utils/packagedb.py
> @@ +25,3 @@
>  import xml.dom.minidom as DOM
> 
> +from . import symlinklock
> 
> Please use "import symlinklock", and move it after the system imports.

The rationale for the . syntax is that it ensures we get the local one - if Python ever happens to ship a "symlinklock" jhbuild will blow up (and actually I just renamed it to lockfile.py).

> Wouldn't a @lock decorator be nicer, instead of duplicating those methods to
> wrap their code with lock/unlock?

I guess...there are no other uses of decorators in the jhbuild codebase; this would be the first.  There would only be two uses of it, so I'm not finding the idea compelling.

If you say "Yes use a decorator" I'll do it though.
Comment 24 Colin Walters 2011-09-29 23:15:31 UTC
(In reply to comment #20)
> Review of attachment 197815 [details] [review]:
> 
> ::: jhbuild/utils/symlinklock.py
> @@ +40,3 @@
> +    def _existing_process_matches(self, pid, uid):
> +        if os.uname()[0] != 'Linux':
> +            return os.path.exists('/proc/%d' % (pid, ))
> 
> Maybe it would work on BSD, but it will fail on Windows.
> 
> But then most probably the whole symlink locking approach won't work on
> Windows…

I've refactored the code such that we use a dummy implementation on Windows; this at least avoids regressing.  Python doesn't seem to have any wrappers for Windows locking stuff, just a wrapper for fcntl.

I think I'd like to leave the implementation to someone actually using jhbuild on Windows who cares.
Comment 25 Colin Walters 2011-09-30 18:12:13 UTC
Committed with changes to use a decorator.

Attachment 197813 [details] pushed as 8fb9e60 - packagedb: Make "entries" into private member, add public get() method
Attachment 197814 [details] pushed as 8baaac9 - packagedb: Lazily load cache
Attachment 197816 [details] pushed as 08c2f07 - packagedb: Use a lock file around modifications, reread after external changes
Comment 26 Craig Keogh 2011-10-09 22:37:04 UTC
Thank you for your work on this Colin. Mark bug resolved fixed?
Comment 27 Wouter De Borger 2011-12-27 12:02:39 UTC
Created attachment 204260 [details] [review]
Parallel jhbuilder

Hi,

I've been working on something similar to what is discussed here. 

I use twisted to do the threading and semaphores for synchronising the messages.
I also use a semaphore (execSemaphore) to limit the number of executing processes instead of queues. This allows me to use unmodified (blocking) build logic. It may not be the most elegant approach, but it allowed me to ignore the actual build logic. It consumes some threads but most are idle, so that is OK (for me).

It works (less or more). I hope the patch is OK (I'm not used to git).

Wouter
Comment 28 Craig Keogh 2011-12-28 04:31:30 UTC
Review of attachment 204260 [details] [review]:

I like the idea of using twisted, but twisted can't be a hard dependency of JHBuild. It should work like: if you have twisted installed, you get parallel builds. If you don't have twisted installed, you get serial builds (like JHBuild is now).
Comment 29 GNOME Infrastructure Team 2021-05-17 15:48:53 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/jhbuild/-/issues/92.