GNOME Bugzilla – Bug 599079
Multithreaded apps launched as SCHED_RR or SCHED_FIFO unable to run
Last modified: 2011-10-03 05:17:02 UTC
I initially reported this bug in Ubuntu here: https://bugs.launchpad.net/ubuntu/+source/glib2.0/+bug/453898 where it was suggested that I forward it upstream. This is with package libglib2.0-0 at version 2.22.2-0ubuntu1, current as of now for Ubuntu karmic.

Steps to reproduce:

1) The user must have permission to run real-time jobs. This can be accomplished by adding a line like the following to /etc/security/limits.conf:

    @audio - rtprio 99

However, executing as root also triggers the problem and requires no setup, so that's what I do below (using sudo; replace by "su -c" where appropriate). Mind you, the problem in practice happens with a normal user, not root.

2) From a terminal, run

    sudo chrt 50 totem

You will get:

---
GThread-ERROR **: file /build/buildd/glib2.0-2.22.2/gthread/gthread-posix.c: line 348 (g_thread_create_posix_impl): error 'Invalid argument' during 'pthread_attr_setschedparam (&attr, &sched)'
aborting...
Aborted (core dumped)
---

The above is the bug.

3) Now run:

    totem & sleep 3 ; chrt -p 50 $!

which will open totem correctly and re-schedule it after launching:

    $ chrt -p $!
    pid 3319's current scheduling policy: SCHED_RR
    pid 3319's current scheduling priority: 50

4) The above can be reproduced, for example, with nautilus and rhythmbox, which I assume are also multithreaded, while it doesn't happen with gcalctool, gnome-terminal or same-gnome, which I assume are not. Multithreaded applications not using glib (e.g., Qt applications such as hydrogen or kdenlive) don't have this problem either.

5) The above did not happen with GNOME 2.26 (i.e. glib 2.20).

6) The bug can be reproduced on both i386 and amd64.

From what I see, I would think that there is a regression in glib 2.22 preventing real-time applications from running, although the problem may turn out to lie elsewhere. Let me know if I can provide more information to solve this.
Thanks to the submitter for this excellent bug report.

ANALYSIS

Enabling any POSIX scheduling policy besides the default SCHED_OTHER exposes this gthread bug in gthread-posix.c:

g_thread_impl_init() captures the current POSIX scheduling priority value in 'priority_normal_value' and uses that value to generate the g_thread_priority_map[] array of priority values. But it fails to capture the current POSIX scheduling *policy*, without which these captured "initial" priority values have no context.

Later, g_thread_create_posix_impl() tries to use a g_thread_priority_map[] priority value as the scheduling priority value for its new thread. The priority value is only relevant to whatever scheduling policy was in effect when the init routine captured it -- but this routine always implicitly creates its new thread with the default scheduling policy SCHED_OTHER. So, if the scheduling policy had been set to anything besides SCHED_OTHER, the create routine tries to set a priority value that is out of range for SCHED_OTHER, resulting in an EINVAL error from pthread_attr_setschedparam() at gthread-posix.c:348.

BOTTOM LINE

In order to support the notion of capturing the "initial" scheduling priority as the "normal" priority, the gthread-posix code must also capture the initial scheduling *policy* and create new threads using that scheduling policy.

The attached patch does this, and allows the reporter's example problem case "sudo chrt 50 totem" to start up properly. A simple test program (also attached) can also be started with "chrt" to verify that new g_thread_create()'ed threads do run with SCHED_RR scheduling after the patch:

    $ sudo chrt 50 ./test_rt_gthread
      PID POL RTPRIO   LWP S TTY          TIME COMMAND
     7596 RR      50  7596 S pts/3    00:00:00 ./test_rt_gthread
     7596 RR      50  7597 S pts/3    00:00:00 ./test_rt_gthread
Created attachment 149938
demonstration program: test_rt_gthread
Created attachment 149940
[PATCH] Fix g_thread_create_posix_impl() crash for sched policy != SCHED_OTHER

Capture the initial scheduling policy along with the initial priority, and create new threads using that policy, since the priority will not be valid otherwise.
ping?
I confirm that the patch works. Thanks!
Dear glib/gthread maintainers, Please review this bug 599079 and attached patch (and its companion bug 604857). Thanks in advance.
I don't believe that this patch is correct. If the policy is not SCHED_OTHER, then it is dangerous and probably misconceived to create other threads with the same policy (and priority) by default. What should happen is that if the policy is not SCHED_OTHER, some other way of establishing a default priority for the new thread should be used.

Applications that need SCHED_RR or SCHED_FIFO generally do not want all their threads running with this policy, and as a general rule, it would be a bad idea to make them behave in this way from the perspective of the overall system. It is clearly a bug that glib's thread support grabs a SCHED_RR/FIFO priority value and uses it for SCHED_OTHER, but the solution is not to make new threads SCHED_RR/FIFO.

In addition, I would note that users who use chrt to try to improve the performance of media-related software are victims of poorly designed software. This is not something that should be done on an application-wide level. The application itself needs to identify those parts of itself that should run with RR or FIFO scheduling. The GUI almost never should, for example. chrt is a useful tool with single-threaded command line applications, but is rarely the correct approach to use to get media scheduling right for multithreaded GUI apps.
Is this Paul Davis the Ardour developer? If so, regarding your last comment, how does one get Ardour to run its relevant thread(s) at a given SCHED_RR priority? I originally triggered this problem with Ardour 2.8.4, for which I used `chrt' to set the round-robin scheduler priority to 69, as advised in some set-up guides I read. Can this be done properly then?
Yes, this is Paul Davis of Ardour & JACK.

Ardour does not use SCHED_RR; it (and all JACK clients) uses SCHED_FIFO. For a while, the jackmp/jack2 implementation of JACK did use SCHED_RR; this has changed.

You should NEVER be messing with application priorities in a multithreaded app like Ardour using chrt. You simply start JACK in realtime mode (these days, this is the default) and this will cause all the correct parts of Ardour to run with SCHED_FIFO. If you can't start JACK in realtime mode, you need to fix that problem rather than using chrt (http://jackaudio.org/faq/).
from IRC:

<las> kamalmostafa: the approach would be something like ... gthread_init() checks to see if the thread in which it's called is SCHED_OTHER; if so, grab that thread priority as a fallback, otherwise pick a number.
<las> kamalmostafa: then, when creating a new thread, if the parent is not SCHED_OTHER, use the one stored away in gthread_init(); if the parent is SCHED_OTHER, then use its priority
Great, thanks. I probably came across a fairly outdated guide. Can the specific SCHED_FIFO priority be set at all in Ardour? In my current setup it's at 69, with jackd running at 75 and hydrogen at 69. Should I be tweaking the priorities of the rtc and of the soundcard's IRQ (I have them at SCHED_RR 98 and 85, respectively)? Sorry about staying off-topic on this, but it seemed like a good time to ask.

Regarding the bug and fix, what is wrong with new threads inheriting the priority of the main process? I see that it may not be wise to run all threads under a real-time scheduler, but:

1) well-designed programs can override this inheritance if they need to (can't they?),
2) using chrt and relying on inheritance may be the only way for a user to get the correct priority for a given thread in an ill-designed program.

In my opinion, inheritance is the most sensible _default_ behaviour. Whether users should or should not run with chrt in the first instance is a different matter.
Kamal, thanks for the patch!

"In addition, I would note that users who use chrt to try to improve the performance of media-related software are victims of poorly designed software. This is not something that should be done on an application-wide level. The application itself needs to identify those parts of itself that should run with RR or FIFO scheduling."

I run the Xorg server, media players and games with real-time priority. The reason is that I have other background processes that impose a high load on CPU, HDD and virtual memory. If I wait until all application developers set real-time priority in their own code, I will wait *forever*.
Thread priorities have been removed in 2.31.