GNOME Bugzilla – Bug 705837
kms: add public API to override the default configuration of outputs
Last modified: 2013-08-23 13:16:14 UTC
Add API to allow complex applications using the KMS backend to go almost straight to direct configuration (which is not possible because Cogl needs to be in charge of buffers and FB objects).
Created attachment 251346 [details] [review] kms: add public API to override the default configuration of outputs
Two comments on this patch, hoping that you guys know better: 1) I get a number of these warnings: (gnome-shell-real:21231): Cogl-WARNING **: Failed to flip: Device or resource busy They don't appear to have any visible effects. What do they mean? Are we dropping frames? Are we not blocking for the swap correctly? 2) If I use the KMS interface for DPMS, after I turn monitors off and on again the DRM device appears to freeze (it doesn't poll readable), and Clutter stops repainting, as if forever waiting for a vblank Is there something I need to do, for example stop flipping when the screens are off? Or reset the CRTCs?
Also, with this patch I can reproducibly cause a hang with triple head. Normal execution is blocked at poll(), and is running the mainloop regularly, except that no drawing occurs (like the DPMS case above). Closing mutter and calling the ioctl for restoring the vt to text mode hangs the process in noninterruptible sleep (kill -9 has no effect), with the following kernel stack: [<ffffffffa00a74ed>] intel_crtc_wait_for_pending_flips+0x8d/0x120 [i915] [<ffffffffa00aa6cb>] ironlake_crtc_disable+0x7b/0x8d0 [i915] [<ffffffffa00abe77>] intel_crtc_disable+0x37/0x130 [i915] [<ffffffffa00ac3c6>] __intel_set_mode+0x456/0xcb0 [i915] [<ffffffffa00b3b16>] intel_set_mode+0x16/0x30 [i915] [<ffffffffa00b4270>] intel_crtc_set_config+0x640/0x960 [i915] [<ffffffffa002cb23>] drm_mode_set_config_internal+0x23/0x50 [drm] [<ffffffffa0076511>] drm_fb_helper_set_par+0x71/0xf0 [drm_kms_helper] [<ffffffff8132fc11>] fb_set_var+0x191/0x430 [<ffffffff8133b511>] fbcon_blank+0x1d1/0x2d0 [<ffffffff813b0384>] do_unblank_screen+0xb4/0x1e0 [<ffffffff813a711e>] vt_ioctl+0x118e/0x1210 [<ffffffff8139af55>] tty_ioctl+0x275/0xbe0 [<ffffffff811a9585>] do_vfs_ioctl+0x305/0x520 [<ffffffff811a9821>] SyS_ioctl+0x81/0xa0 [<ffffffff81647559>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff SysRQ+V gives an oops with a similar stack, ends with intel_crtc_wait_for_pending_flips and warn_slowpath_common. This may be a kernel bug, in that kill -9 and SysRQ+V should keep working no matter how confused the HW is, but Xorg works fine, so this is also definitely a userspace bug.
(In reply to comment #2) > Two comments on this patch, hoping that you guys know better: > > 1) I get a number of these warnings: > (gnome-shell-real:21231): Cogl-WARNING **: Failed to flip: Device or resource > busy > > They don't appear to have any visible effects. What do they mean? Are we > dropping frames? Are we not blocking for the swap correctly? This happens if drmModePageFlip returns EBUSY which from looking at the driver can happen if the driver spuriously finds that the crtc has a NULL fb associated with it. This either implies that we didn't call drmModeSetCrtc first or that a hotplug event has happened that we haven't yet discovered which resulted in the fb being disassociated from the crtc. In this case no page flip has been queued and so we also wont get a page flip event for this. When you see these errors do they coincide with you unplugging an external monitor perhaps? > > 2) If I use the KMS interface for DPMS, after I turn monitors off and on again > the DRM device appears to freeze (it doesn't poll readable), and Clutter stops > repainting, as if forever waiting for a vblank > Is there something I need to do, for example stop flipping when the screens are > off? Or reset the CRTCs? Yeah trying to get my head around the driver code here, I'm not entirely sure what the expected semantics are. It looks like the driver disables the crtc in such a way that it can be re-enabled without needing to explicitly re-set the mode. before disabling the vblank the driver also waits for any outstanding flips so we shouldn't have to worry about flips being dropped on the floor. It doesn't look like the page-flip ioctl will directly check for the crtc being disabled so I don't think it's an error per-se to try and page flip while the display is off, but notably there is a fixed sized buffer for pending flips and you'd eventually receive ENOSPC errors if you continued trying to page flip to a monitor that was off.
(In reply to comment #3) > Also, with this patch I can reproducibly cause a hang with triple head. > > Normal execution is blocked at poll(), and is running the mainloop regularly, > except that no drawing occurs (like the DPMS case above). > Closing mutter and calling the ioctl for restoring the vt to text mode hangs > the process in noninterruptible sleep (kill -9 has no effect), with the > following kernel stack: > [<ffffffffa00a74ed>] intel_crtc_wait_for_pending_flips+0x8d/0x120 [i915] > [<ffffffffa00aa6cb>] ironlake_crtc_disable+0x7b/0x8d0 [i915] > [<ffffffffa00abe77>] intel_crtc_disable+0x37/0x130 [i915] > [<ffffffffa00ac3c6>] __intel_set_mode+0x456/0xcb0 [i915] > [<ffffffffa00b3b16>] intel_set_mode+0x16/0x30 [i915] > [<ffffffffa00b4270>] intel_crtc_set_config+0x640/0x960 [i915] > [<ffffffffa002cb23>] drm_mode_set_config_internal+0x23/0x50 [drm] > [<ffffffffa0076511>] drm_fb_helper_set_par+0x71/0xf0 [drm_kms_helper] > [<ffffffff8132fc11>] fb_set_var+0x191/0x430 > [<ffffffff8133b511>] fbcon_blank+0x1d1/0x2d0 > [<ffffffff813b0384>] do_unblank_screen+0xb4/0x1e0 > [<ffffffff813a711e>] vt_ioctl+0x118e/0x1210 > [<ffffffff8139af55>] tty_ioctl+0x275/0xbe0 > [<ffffffff811a9585>] do_vfs_ioctl+0x305/0x520 > [<ffffffff811a9821>] SyS_ioctl+0x81/0xa0 > [<ffffffff81647559>] system_call_fastpath+0x16/0x1b > [<ffffffffffffffff>] 0xffffffffffffffff > > SysRQ+V gives an oops with a similar stack, ends with > intel_crtc_wait_for_pending_flips and warn_slowpath_common. > > This may be a kernel bug, in that kill -9 and SysRQ+V should keep working no > matter how confused the HW is, but Xorg works fine, so this is also definitely > a userspace bug. Hmm that looks like an intel driver bug on ironlake; waiting for pending flips should complete in a finite time :-/
Btw, this is Ivybridge (Intel Corporation 3rd Gen Core processor Graphics Controller), not Ironlake
OK, so, what is the way forward for this? Are we merging it as is, and maybe debugging the issues later? Do you have other plans for the API? Are you going to investigate more?
Created attachment 252621 [details] [review] kms: add public API to override the default configuration of outputs Add API to allow complex applications using the KMS backend to go almost straight to direct configuration (which is not possible because Cogl needs to be in charge of buffers and FB objects). Found the problem, we were trying to page flip disabled CRTCs...
ok sorry for the delay in replying about the patch itself. I think if we can mark this api as Stability: unstable then it should be ok to land this, with the understanding that it's pretty likely we'll aim to replace this api later in favour of a more complete solution. Just for reference I am working on a wip/output branch with improved kms support though I don't want to block any mutter work on that work at this stage. Another minor tweak would be that instead of taking a GList it would be preferable to take a CoglKmsCrtc **crtcs array and int n_crtcs arguments, since Cogl avoids exposing glib types in the api except for glib specific integration apis, given that cogl supports building standalone without glib. If you are happy with that I don't mind if you either want to make the tweaks yourself or I can tweak your patch and land it myself.
Created attachment 252814 [details] [review] kms: add public API to override the default configuration of outputs Add API to allow complex applications using the KMS backend to go almost straight to direct configuration (which is not possible because Cogl needs to be in charge of buffers and FB objects).
Ok I've pushed your patch to master and cherry picked back to cogl-1.16. I just squashed in a minor cleanup of the style to not use tabs and not align arguments to be consistent with other code in cogl. thanks