After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 614930 - add g_get_num_processors (), return the max concurrent threads available to the process
add g_get_num_processors (), return the max concurrent threads available to t...
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gthread
unspecified
Other Linux
: Normal enhancement
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks: 687223
 
 
Reported: 2010-04-06 08:13 UTC by John Cupitt
Modified: 2012-12-18 18:15 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
patch to add g_get_num_processors() to gthread (7.50 KB, patch)
2010-04-06 08:13 UTC, John Cupitt
none Details | Review
patch to add g_thread_get_num_processors() (7.95 KB, patch)
2011-08-09 08:53 UTC, John Cupitt
none Details | Review
test result against lscpu -p (1.23 KB, patch)
2011-08-09 08:55 UTC, John Cupitt
none Details | Review
Updated patch (4.15 KB, patch)
2012-12-17 15:51 UTC, Colin Walters
reviewed Details | Review
updated patch (4.23 KB, patch)
2012-12-17 22:24 UTC, Colin Walters
accepted-commit_now Details | Review

Description John Cupitt 2010-04-06 08:13:43 UTC
Created attachment 158023 [details] [review]
patch to add g_get_num_processors() to gthread

This patch against 2.22.5 (though it should apply cleanly to trunk) adds:

  int g_get_num_processors (void)

a function to return the number of concurrent threads of execution available to this process. It supports POSIX, BSD and Windows, though I've only tested on linux. It includes docs and an extra "make check" test.

Rationale: glib has a threadpool system, but no way to pick a size for a threadpool which matches the available hardware. Most glib applications that support threading seem to have an equivalent function in their codebase, for example:

http://git.gnome.org/browse/gimp/tree/app/base/base-utils.c#n54

This patch moves that mess of ifdefs into the core library.

Original mailing list discussion for background:

http://mail.gnome.org/archives/gtk-devel-list/2010-March/msg00065.html
Comment 1 John Boncek 2010-04-06 14:50:31 UTC
Based on the bug title, "add g_get_num_processors (), return the max concurrent threads...", wouldn't a clearer function name be g_get_max_concurrent_threads?
Comment 2 Sven Herzberg 2010-04-06 17:14:43 UTC
If this is for gthread, shouldn't this be called g_thread_get_num_processors()?
Comment 3 John Cupitt 2010-04-06 17:35:25 UTC
The name was supposed to be familiar to people coming from other platforms. For example, libgomp has omp_get_num_procs(), gimp has get_number_of_processors(), the linux kernel has _SC_NPROCESSORS_ONLN and so on.

It could be in the gthread namespace. I was thinking that, although it's useful as part of gthread, it's really telling you something about the host machine, so it's more like the other g_get_*() functions.

Anyway, I'm very happy to update the patch to a new name if a consensus emerges.
Comment 4 Christian Dywan 2010-04-06 19:05:09 UTC
Reading patches without proper formatting including function names is tedious. git format-patch and git diff do a good job by default, use them, please.

I second Sven. If this function is in GThread, it should have the according namespace. It's not like g_thread_get_num_processors is any harder to recognize as similar to omp_get_num_procs. And using a search for "processors" in the documentation will yield it either way.

> + test_g_get_num_processors
> [...]
> + g_assert (nprocs >= 1 && nprocs < 10000);

How about comparing the result to the result of 'lscpu -p'? Otherwise the test case doesn't really tell us whether it works or not.
Comment 5 Paul Davis 2010-04-08 13:51:39 UTC
john, is this code (for linux) correct if group scheduling is in use? my impression is that sysconf() will not return the correct value.

christian&sven: this code has potential uses (and users) with or without the use of GThread, and for that reason i think it more appropriate that it lives in the g_* namespace.
Comment 6 Tor Lillqvist 2010-04-08 14:04:16 UTC
What potential use would that be? Surely any tool that wants to describe the machine hardware in an exact way would use platform-specific API anyway to get all the details? Especially considering comment #1 I think this function should be in the g_thread namespace.
Comment 7 Paul Davis 2010-04-08 14:08:35 UTC
the obvious use would be as a portable way to determine the number of available processors (or max concurrent threads), even if one were not using GThread. this is similar to the way that, say, glib_build_filename() is a useful, portable way to construct filenames even if you don't use any glib file i/o APIs for doing file i/o.
Comment 8 John Cupitt 2010-04-08 14:10:40 UTC
Paul, I had a look at libgomp and they have some surprisingly complicated code on linux to determine this value. It seems to be checking pthread_affinity and I guess group scheduling as well (I didn't read it too carefully).

http://repo.or.cz/w/official-gcc.git/blob/HEAD:/libgomp/config/linux/proc.c

We can't simply copy-paste their source due to GPL / LGPL (is that correct?) so short of contacting the author and asking permission, it seemed easiest to stick with sysconf().

I'm not sure the difference is too serious. In my experience, making threadpools too large only hurts performance slightly.

(I've updated the patch to incorporate all suggestions, I'm now trying to teach myself git diff before reposting it)
Comment 9 Tor Lillqvist 2010-04-08 14:13:31 UTC
But why would one bother want to use a portable (GLib) way to determine the number of available processors even if one is not going to use a portable thread API (GThread)?
Comment 10 David Zeuthen (not reading bugmail) 2010-04-08 21:39:51 UTC
(In reply to comment #9)
> But why would one bother want to use a portable (GLib) way to determine the
> number of available processors even if one is not going to use a portable
> thread API (GThread)?

Multi-processing can also mean multiple processes, not just multiple threads. E.g. you might want to use g_get_num_processors() to figure out how many helpers programs to spawn.

Also, this function shouldn't necessarily live in gthread - and even if it does, calling it g_get_num_processors() for name-spacing reasons is just wrong - we don't do that for e.g. libgio, e.g. it's GFile not GIOFile.
Comment 11 David Zeuthen (not reading bugmail) 2010-04-08 21:41:10 UTC
(In reply to comment #10)
> Also, this function shouldn't necessarily live in gthread - and even if it
> does, calling it g_get_num_processors() for name-spacing reasons 

Gah, I obviously meant g_thread_get_num_processors() here.
Comment 12 cee1 2010-04-10 13:08:21 UTC
Apple's GCD(Grand Central Dispatch) also provides functions to detect "various numbers" of CPUs:
1) activecpu
2) logicalcpu_max
3) physicalcpu_max

see http://libdispatch.macosforge.org/trac/browser/trunk/src/queue.c#L444

The meanings of "activecpu", "logicalcpu_max", "physicalcpu_max" are explained in this page: http://www.opensource.apple.com/source/xnu/xnu-792.13.8/libkern/libkern/sysctl.h
Comment 13 cee1 2010-05-27 05:45:03 UTC
The pthread_workqueues(Thread pool implementation in BSD libc) from FreeBSD 9.0:
http://people.freebsd.org/~sson/thrworkq/pthread_workqueue.3.txt
Quote:
"""The size of the thread pool is managed by the kernel based on physical resources and the following tunable sysctl(3) MIBs:"""
Comment 14 cee1 2010-06-15 06:11:35 UTC
(In reply to comment #13)
> The pthread_workqueues(Thread pool implementation in BSD libc) from FreeBSD
> 9.0:
> http://people.freebsd.org/~sson/thrworkq/pthread_workqueue.3.txt
> Quote:
> """The size of the thread pool is managed by the kernel based on physical
> resources and the following tunable sysctl(3) MIBs:"""

Mark have written a pure userspace implementation of the pthread_workqueue API.  It's portable to a wide range of POSIX-compliant platforms and have be tested under Linux, see http://lists.macosforge.org/pipermail/libdispatch-dev/2010-June/000379.html
Comment 15 John Cupitt 2011-08-09 08:53:41 UTC
Created attachment 193470 [details] [review]
patch to add g_thread_get_num_processors()

Patch redone with "git format-patch", renamed to g_thread_, against git master.
Comment 16 John Cupitt 2011-08-09 08:55:05 UTC
Created attachment 193471 [details] [review]
test result against lscpu -p

improve the test to compare against the result of "lscpu -p"
Comment 17 John Cupitt 2011-08-09 08:56:41 UTC
(In reply to comment #16)
> Created an attachment (id=193471) [details] [review]
> test result against lscpu -p
> 
> improve the test to compare against the result of "lscpu -p"

Forgot to say, I've left this as a separate patch since I was unsure of the best way to implement it.
Comment 18 Matthias Clasen 2011-08-13 21:24:56 UTC
I agree with Davids comment #10 here:

I don't think this has any particularly strong reason to live in gthread.
It should just be called g_get_num_processors() and live in glib itself, imo.
Comment 19 Matthias Clasen 2011-10-03 05:47:38 UTC
After our major gthread refactoring, the patch needs to be redone.
Comment 20 Colin Walters 2012-12-17 15:51:00 UTC
Created attachment 231735 [details] [review]
Updated patch

* Rebased against master - merged unix/win32 implementation
* Renamed to just g_get_num_processors()
* Now returns guint, not int (processors are never negative)
* Tweaked docs to have what I think is a better description
* Added a use in a test case
* Added to gtk-doc sections
Comment 21 Dan Winship 2012-12-17 19:35:52 UTC
Comment on attachment 231735 [details] [review]
Updated patch

(Yeah, I know most of these issues come from the original patch...)

>+    /* POSIX style.
>+     */
>+    int x;

Comment should be on a single line, and "x" should have a better name. (If each #if section had its own "return" statement then you could just use nproc for x and then "return nproc > 0 ? nproc : 1" maybe.)

>+    /* BSD style.
>+     */
>+    int x;

likewise

>+    sysctl ((int[2]) {CTL_HW, HW_NCPU}, 2, &x, &len, NULL, 0);

inline array constant???

>-#define N_THREADS (20)
>+  n_threads = (guint)(g_get_num_processors () * 1.5f);

That will be 1 on single-processor systems. (Do they still have those? ARM?) It's going to be a lot less than before either way... maybe "MIN (10, g_get_num_processors () * 1.5)" or "MAX (20, g_get_num_processors () * 1.5)"?
Comment 22 Colin Walters 2012-12-17 22:24:59 UTC
Created attachment 231778 [details] [review]
updated patch

* Drop comments, they weren't adding anything
* Return at bottom is fallback, return inside ifdefs is cleaner
* Use * 2 for test case, but cap at 64
Comment 23 Colin Walters 2012-12-17 22:25:35 UTC
* For the BSD sysctl case, I rewrote the code to look like the example from here:
http://www.freebsd.org/cgi/man.cgi?query=sysctl&sektion=3&apropos=0&manpath=FreeBSD+3.1-RELEASE
Comment 24 Colin Walters 2012-12-17 22:52:01 UTC
(In reply to comment #21)
>
> That will be 1 on single-processor systems. (Do they still have those? ARM?)

It's also the qemu default - I suspect that makes them very common.
Comment 25 Colin Walters 2012-12-18 18:15:04 UTC
2149b29468bb99af3c29d5de61f75aad735082dc