GNOME Bugzilla – Bug 667651
creating a thread = fatal error if under a scheduler other than SCHED_OTHER
Last modified: 2013-04-24 17:27:08 UTC
(This is related to Bug #501237, but more serious, since it's a crash rather than a missing abstraction.) Under at least GLib 2.12 to 2.28, and possibly other versions (probably 2.30 but not 2.31), if an application sets SCHED_RR or SCHED_FIFO (realtime scheduling) and then starts a thread, the call to pthread_attr_setschedparam() fails, and the application aborts via g_error(). (For those with access to the appropriate bug tracker, this is NB#47178.) The same problem was recently reported in Debian, <http://bugs.debian.org/637654>. For instance, this was a problem for telepathy-stream-engine on Maemo, which wants realtime scheduling (for maximum quality of voice/video calls), and uses GStreamer (which starts many threads). Maemo's GLib has had an undocumented patch since 2006 [1] which addressed this; I've only recently been able to dig out the details of why it was applied in the first place. However, that patch is incomplete: it still doesn't work when coming from a SCHED_IDLE or SCHED_BATCH (ultra-low-priority) thread. (For those with access to the appropriate bug tracker, this is NB#296256.) The SCHED_IDLE/SCHED_BATCH thing is arguably a bug in glibc (<http://sourceware.org/bugzilla/show_bug.cgi?id=7013>) but good luck getting that fixed... For comparison, I've been directed to how Qt does this [2], which is considerably more elaborate (hopefully more elaborate than we need...) I'll try to put together some test cases which illustrate the problem, and confirm whether 2.30 and/or 2.31 still have this bug. [1] Maemo patch from Jussi Laako: <https://gitorious.org/gnome-essentials/glib/blobs/maemo/debian/patches/50-gthread.patch> [2] <http://qt.gitorious.org/qt/qt/blobs/4.7/src/corelib/thread/qthread_unix.cpp> [3] <https://qt.gitorious.org/qt/qt/commit/f3405a516ac30fc7dee1021cc6f34ca03dd08d97>
This bug also exists in the 2.30 stable branch. It turns out that the priority-setting in g_thread_create has never worked as designed on POSIX-compliant systems. Instead, new threads always inherit their parent thread's scheduling policy and sched_priority. That actually seems to me to be more desirable than how it was designed, to be honest - as designed, each new thread would have gone back from real-time or idle priority to normal priority, which would basically mean that no GStreamer app could usefully have real-time privileges. Given that this is how GLib has always behaved, I think it's safe to say that every GLib application works fine with threads that inherit their parent's priority. So, I'd be inclined to carry over the removal of thread priorities from the 2.32 branch to 2.30, by deleting the broken code that pretends to make priority-setting happen. ---- Further explanation of why this happens, please check my reasoning: Scheduler priority in POSIX consists of two parts, a *policy* and a *parameter*. The policy can be normal scheduling (SCHED_OTHER), real-time scheduling (SCHED_RR or SCHED_FIFO), or an OS-specific extension. Linux has two OS-specific extensions, SCHED_BATCH and SCHED_IDLE, both of which are lower priority than SCHED_OTHER. The parameter is a struct whose contents can be interpreted differently for each policy. POSIX requires it to contain at least sched_priority, an integer; under Linux+glibc, it contains that integer and nothing else. The contents of the struct, and in particular the allowed values for sched_priority, are allowed to vary from policy to policy. In Linux, SCHED_RR and SCHED_FIFO have priorities ranging from 1 to 99 (where larger numbers are more urgent), while SCHED_OTHER, SCHED_BATCH and SCHED_IDLE always have priority 0. When unpatched GLib <= 2.30 creates a thread, it initializes a fresh pthread_attr_t and leaves its policy set to the default (which is SCHED_OTHER). It then sets the sched_priority according to a lookup table, g_thread_priority_map. Now, g_thread_priority_map is a map from GLib constants like G_THREAD_PRIORITY_LOW to OS-specific sched_priority values. On startup (actually, in g_thread_init()), GLib fills it like this: { G_THREAD_PRIORITY_LOW => (the lowest SCHED_OTHER priority), G_THREAD_PRIORITY_NORMAL => (the priority of the thread that first called g_thread_init()), G_THREAD_PRIORITY_HIGH => (2/3 of the way between NORMAL and URGENT), G_THREAD_PRIORITY_URGENT => (the highest SCHED_OTHER priority), } If the main thread had a real-time policy at the time it called g_thread_init(), this bug happens: the sched_priority corresponding to NORMAL will be more than zero. This isn't appropriate for the scheduling policy that GLib will choose (which was always SCHED_OTHER), while HIGH priority is a mixture, so it isn't necessarily appropriate either. This led to a crash. For instance, if telepathy-stream-engine started up with (say) real-time priority 42, we'd get this nonsensical lookup table: { LOW => 0, NORMAL => 42, HIGH => 14, URGENT => 0 } In practice, everyone uses G_THREAD_PRIORITY_NORMAL, so GLib would usually try to use (SCHED_OTHER, sched_priority = 42) which isn't valid. GLib sees the EINVAL result, isn't prepared to deal with it, and aborts. To fix that bug, and also get all of telepathy-stream-engine's threads running under the real-time policy (it uses GStreamer, so media handling is spread between threads), Jussi's patch changed the logic so the new thread would get the policy from the parent thread, set the sched_priority from the lookup table, *truncate out-of-range values to comply with that policy* (which is what prevented the EINVAL error), and use that (priority, policy) tuple. That patch appeared to work, so it was applied to the Maemo package and everyone got on with their lives. However, there were still three issues: 1) realtime processes' lookup table still makes no sense, with priorities that aren't even in the right order 2) as mentioned in Comment #0, pthread_attr_setschedpolicy wrongly doesn't allow SCHED_IDLE (or SCHED_BATCH for that matter), failing with EINVAL 3) if you read the pthread_create source code in glibc, it turns out that using pthread_attr_setschedpolicy() and pthread_attr_setschedparam() has absolutely no effect (other than potentially failing with EINVAL) unless you also call pthread_attr_setinheritsched(PTHREAD_EXPLICIT_SCHED) to turn off scheduling inheritance. So in fact, we didn't need to go to the effort of implementing explicit policy inheritance - the threads would all have inherited the main thread's policy and sched_priority anyway! ---- (FYI, nice values as set by nice(1) and setpriority(3) are something different - it appears they only apply to whole processes, not to individual threads. GLib never changes the nice value.)
Created attachment 205187 [details] [review] [1/2] Do not attempt to set new POSIX threads' priorities It turns out that GLib has never actually been able to set new threads' priorities on POSIX systems, because POSIX specifies that new threads inherit their parent's priority, ignoring the scheduling policy and parameter stored in the pthread_attr_t, unless you also set PTHREAD_EXPLICIT_SCHED in the pthread_attr_t (which we didn't). The default behaviour is PTHREAD_INHERIT_SCHED. In the absence of actually being able to set a priority, the only thing we could possibly achieve by calling pthread_attr_setschedparam was to crash when we set an invalid sched_priority and got EINVAL back. Unfortunately, in the presence of scheduling policies with different allowed ranges for sched_priority (i.e. at least modern Linux), the g_thread_priority_map lookup table was populated inconsistently: the LOW and URGENT sched_priority were set to something appropriate for SCHED_OTHER (i.e. always 0 on Linux), while the NORMAL sched_priority was set to something appropriate for the main thread's scheduling priority at the time it called g_thread_init() (i.e. always nonzero for real-time processes on Linux), and the HIGH sched_priority was a mixture of the two. For instance, in a real-time process which had SCHED_RR with sched_priority = 42 at the time it called g_thread_init(), the lookup table would be: { LOW => 0, NORMAL => 42, HIGH => 14, URGENT => 0 } which is clearly nonsense. Since in practice GLib on POSIX has always allowed the default policy/priority inheritance to take place, we can safely conclude that every currently-working POSIX application is happy with this, and drop the code that pretends to behave otherwise. Bug: https://bugzilla.gnome.org/show_bug.cgi?id=667651 Bug-Debian: http://bugs.debian.org/637654 Bug-NB: NB#47178 Bug-NB: NB#296256 Signed-off-by: Simon McVittie <simon.mcvittie@collabora.co.uk>
Created attachment 205188 [details] [review] [2/2] Add regression test for GNOME#667651 On current Linux, this is only useful when run as root (strictly speaking: with CAP_SYS_NICE), but running it with insufficient privilege is harmless. Bug: https://bugzilla.gnome.org/show_bug.cgi?id=667651 Signed-off-by: Simon McVittie <simon.mcvittie@collabora.co.uk>
Review of attachment 205188 [details] [review]: - The test should also cover SCHED_BATCH and SCHED_IDLE policies when available. - The test verifies only policy inheritance, not priority inheritance. ::: gthread/tests/realtime.c @@ +55,3 @@ + GThread *thread; + guint i; + int min_prio; Local variables i and min_prio are not used in the function.
Created attachment 205359 [details] [review] [2/2 v2] Add regression test for GNOME#667651 The actually problematic cases (SCHED_RR and SCHED_FIFO) usually need CAP_SYS_NICE on normal Linux, but we can at least run the test for the other cases. Run manually, e.g. with "sudo gthread/tests/sched", to test the situation that would actually crash. If RLIMIT_RTPRIO and RLIMIT_RTTIME are sufficiently high, it may be possible to test SCHED_RR and SCHED_FIFO as non-root. Bug: https://bugzilla.gnome.org/show_bug.cgi?id=667651 Signed-off-by: Simon McVittie <simon.mcvittie@collabora.co.uk> --- (In reply to comment #4) > Review of attachment 205188 [details] [review]: > > - The test should also cover SCHED_BATCH and SCHED_IDLE policies when > available. A fair point - although that doesn't actually crash on Linux anyway, because SCHED_BATCH and SCHED_IDLE have the same range of valid priorities as SCHED_OTHER (0 to 0). > - The test verifies only policy inheritance, not priority inheritance. Not true: this: pthread_getschedparam (pthread_self (), &policy, ¶m) ... g_assert_cmpint (sched_get_priority_min (policy), ==, param.sched_priority) is run in both main and child threads, and asserts that the priority is what the main thread said it should be. In an attempt to make this clearer, I've swapped the left and right sides of the ==, and used the median priority (min+max)/2 instead of the minimum priority. > ::: gthread/tests/realtime.c > @@ +55,3 @@ > + guint i; > + int min_prio; > > Local variables i and min_prio are not used in the function. Dropped. I've also switched the test to GTester, which I'd previously avoided, thinking that g_test_init() called g_thread_init() - it doesn't, so the bug can be reproduced even under GTester.
Review of attachment 205359 [details] [review]: Looks good to me, my comments for [2/2] have been addressed. (NB: Visual review only haven't yet executed the test)
(In reply to comment #5) > > - The test verifies only policy inheritance, not priority inheritance. > > Not true: this: > > pthread_getschedparam (pthread_self (), &policy, ¶m) > ... > g_assert_cmpint (sched_get_priority_min (policy), ==, param.sched_priority) > > is run in both main and child threads, and asserts that the priority is what > the main thread said it should be. > > In an attempt to make this clearer, I've swapped the left and right sides of > the ==, and used the median priority (min+max)/2 instead of the minimum > priority. I agree that the priority inheritance was tested for the minimum priority value. The median is definitively better than the previously used minimum value, but it is still a single value and a problem in inheriting e.g. the maximum value would go unnoticed by the test. What about testing all three min, median and max values for each policy?
Created attachment 205914 [details] [review] [2/2 v3] Add regresssion test for GNOME#667651 The actually problematic cases (SCHED_RR and SCHED_FIFO) usually need CAP_SYS_NICE on normal Linux, but we can at least run the test for the other cases. Run manually, e.g. with "sudo gthread/tests/sched", to test the situation that would actually crash. If RLIMIT_RTPRIO and RLIMIT_RTTIME are sufficiently high, it may be possible to test SCHED_RR and SCHED_FIFO as non-root. Bug: https://bugzilla.gnome.org/show_bug.cgi?id=667651 Signed-off-by: Simon McVittie <simon.mcvittie@collabora.co.uk> --- This is becoming more of a test of the kernel than a test of GLib (we're not really testing that GLib applies inherited priority - because it doesn't - but only that it gets out of the way and allows it to be inherited), but it does demonstrate that we're definitely inheriting policy and priority.
Review of attachment 205914 [details] [review]: Looks good to me.
2.30 is pretty old now, and I doubt anyone is going to apply these patches, so, resolving as obsolete. Fixed in 2.32.