After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 705837 - kms: add public API to override the default configuration of outputs
kms: add public API to override the default configuration of outputs
Status: RESOLVED FIXED
Product: cogl
Classification: Platform
Component: general
unspecified
Other All
: Normal normal
: ---
Assigned To: Cogl maintainer(s)
Cogl maintainer(s)
wayland
Depends on:
Blocks:
 
 
Reported: 2013-08-12 13:00 UTC by Giovanni Campagna
Modified: 2013-08-23 13:16 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
kms: add public API to override the default configuration of outputs (11.54 KB, patch)
2013-08-12 13:00 UTC, Giovanni Campagna
none Details | Review
kms: add public API to override the default configuration of outputs (11.54 KB, patch)
2013-08-21 16:00 UTC, Giovanni Campagna
none Details | Review
kms: add public API to override the default configuration of outputs (11.86 KB, patch)
2013-08-23 08:06 UTC, Giovanni Campagna
none Details | Review

Description Giovanni Campagna 2013-08-12 13:00:08 UTC
Add API to allow complex applications using the KMS backend
to go almost straight to direct configuration (which is not possible
because Cogl needs to be in charge of buffers and FB objects).
Comment 1 Giovanni Campagna 2013-08-12 13:00:10 UTC
Created attachment 251346 [details] [review]
kms: add public API to override the default configuration of outputs
Comment 2 Giovanni Campagna 2013-08-15 15:52:18 UTC
Two comments on this patch, hoping that you guys know better:

1) I get a number of these warnings:
(gnome-shell-real:21231): Cogl-WARNING **: Failed to flip: Device or resource busy

They don't appear to have any visible effects. What do they mean? Are we dropping frames? Are we not blocking for the swap correctly?

2) If I use the KMS interface for DPMS, after I turn monitors off and on again the DRM device appears to freeze (it doesn't poll readable), and Clutter stops repainting, as if forever waiting for a vblank
Is there something I need to do, for example stop flipping when the screens are off? Or reset the CRTCs?
Comment 3 Giovanni Campagna 2013-08-16 08:21:37 UTC
Also, with this patch I can reproducibly cause a hang with triple head.

Normal execution is blocked at poll(), and is running the mainloop regularly, except that no drawing occurs (like the DPMS case above).
Closing mutter and calling the ioctl for restoring the vt to text mode hangs the process in noninterruptible sleep (kill -9 has no effect), with the following kernel stack:
[<ffffffffa00a74ed>] intel_crtc_wait_for_pending_flips+0x8d/0x120 [i915]
[<ffffffffa00aa6cb>] ironlake_crtc_disable+0x7b/0x8d0 [i915]
[<ffffffffa00abe77>] intel_crtc_disable+0x37/0x130 [i915]
[<ffffffffa00ac3c6>] __intel_set_mode+0x456/0xcb0 [i915]
[<ffffffffa00b3b16>] intel_set_mode+0x16/0x30 [i915]
[<ffffffffa00b4270>] intel_crtc_set_config+0x640/0x960 [i915]
[<ffffffffa002cb23>] drm_mode_set_config_internal+0x23/0x50 [drm]
[<ffffffffa0076511>] drm_fb_helper_set_par+0x71/0xf0 [drm_kms_helper]
[<ffffffff8132fc11>] fb_set_var+0x191/0x430
[<ffffffff8133b511>] fbcon_blank+0x1d1/0x2d0
[<ffffffff813b0384>] do_unblank_screen+0xb4/0x1e0
[<ffffffff813a711e>] vt_ioctl+0x118e/0x1210
[<ffffffff8139af55>] tty_ioctl+0x275/0xbe0
[<ffffffff811a9585>] do_vfs_ioctl+0x305/0x520
[<ffffffff811a9821>] SyS_ioctl+0x81/0xa0
[<ffffffff81647559>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

SysRQ+V gives an oops with a similar stack, ends with intel_crtc_wait_for_pending_flips and warn_slowpath_common.

This may be a kernel bug, in that kill -9 and SysRQ+V should keep working no matter how confused the HW is, but Xorg works fine, so this is also definitely a userspace bug.
Comment 4 Robert Bragg 2013-08-16 14:14:08 UTC
(In reply to comment #2)
> Two comments on this patch, hoping that you guys know better:
> 
> 1) I get a number of these warnings:
> (gnome-shell-real:21231): Cogl-WARNING **: Failed to flip: Device or resource
> busy
> 
> They don't appear to have any visible effects. What do they mean? Are we
> dropping frames? Are we not blocking for the swap correctly?

This happens if drmModePageFlip returns EBUSY which from looking at the driver can happen if the driver spuriously finds that the crtc has a NULL fb associated with it. This either implies that we didn't call drmModeSetCrtc first or that a hotplug event has happened that we haven't yet discovered which resulted in the fb being disassociated from the crtc. 

In this case no page flip has been queued and so we also wont get a page flip event for this.

When you see these errors do they coincide with you unplugging an external monitor perhaps?

> 
> 2) If I use the KMS interface for DPMS, after I turn monitors off and on again
> the DRM device appears to freeze (it doesn't poll readable), and Clutter stops
> repainting, as if forever waiting for a vblank
> Is there something I need to do, for example stop flipping when the screens are
> off? Or reset the CRTCs?

Yeah trying to get my head around the driver code here, I'm not entirely sure what the expected semantics are. It looks like the driver disables the crtc in such a way that it can be re-enabled without needing to explicitly re-set the mode. before disabling the vblank the driver also waits for any outstanding flips so we shouldn't have to worry about flips being dropped on the floor. It doesn't look like the page-flip ioctl will directly check for the crtc being disabled so I don't think it's an error per-se to try and page flip while the display is off, but notably there is a fixed sized buffer for pending flips and you'd eventually receive ENOSPC errors if you continued trying to page flip to a monitor that was off.
Comment 5 Robert Bragg 2013-08-16 14:28:20 UTC
(In reply to comment #3)
> Also, with this patch I can reproducibly cause a hang with triple head.
> 
> Normal execution is blocked at poll(), and is running the mainloop regularly,
> except that no drawing occurs (like the DPMS case above).
> Closing mutter and calling the ioctl for restoring the vt to text mode hangs
> the process in noninterruptible sleep (kill -9 has no effect), with the
> following kernel stack:
> [<ffffffffa00a74ed>] intel_crtc_wait_for_pending_flips+0x8d/0x120 [i915]
> [<ffffffffa00aa6cb>] ironlake_crtc_disable+0x7b/0x8d0 [i915]
> [<ffffffffa00abe77>] intel_crtc_disable+0x37/0x130 [i915]
> [<ffffffffa00ac3c6>] __intel_set_mode+0x456/0xcb0 [i915]
> [<ffffffffa00b3b16>] intel_set_mode+0x16/0x30 [i915]
> [<ffffffffa00b4270>] intel_crtc_set_config+0x640/0x960 [i915]
> [<ffffffffa002cb23>] drm_mode_set_config_internal+0x23/0x50 [drm]
> [<ffffffffa0076511>] drm_fb_helper_set_par+0x71/0xf0 [drm_kms_helper]
> [<ffffffff8132fc11>] fb_set_var+0x191/0x430
> [<ffffffff8133b511>] fbcon_blank+0x1d1/0x2d0
> [<ffffffff813b0384>] do_unblank_screen+0xb4/0x1e0
> [<ffffffff813a711e>] vt_ioctl+0x118e/0x1210
> [<ffffffff8139af55>] tty_ioctl+0x275/0xbe0
> [<ffffffff811a9585>] do_vfs_ioctl+0x305/0x520
> [<ffffffff811a9821>] SyS_ioctl+0x81/0xa0
> [<ffffffff81647559>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> SysRQ+V gives an oops with a similar stack, ends with
> intel_crtc_wait_for_pending_flips and warn_slowpath_common.
> 
> This may be a kernel bug, in that kill -9 and SysRQ+V should keep working no
> matter how confused the HW is, but Xorg works fine, so this is also definitely
> a userspace bug.

Hmm that looks like an intel driver bug on ironlake; waiting for pending flips should complete in a finite time :-/
Comment 6 Giovanni Campagna 2013-08-16 14:32:07 UTC
Btw, this is Ivybridge (Intel Corporation 3rd Gen Core processor Graphics Controller), not Ironlake
Comment 7 Giovanni Campagna 2013-08-21 07:57:27 UTC
OK, so, what is the way forward for this?

Are we merging it as is, and maybe debugging the issues later?
Do you have other plans for the API?
Are you going to investigate more?
Comment 8 Giovanni Campagna 2013-08-21 16:00:11 UTC
Created attachment 252621 [details] [review]
kms: add public API to override the default configuration of outputs

Add API to allow complex applications using the KMS backend
to go almost straight to direct configuration (which is not possible
because Cogl needs to be in charge of buffers and FB objects).

Found the problem, we were trying to page flip disabled CRTCs...
Comment 9 Robert Bragg 2013-08-22 17:36:34 UTC
ok sorry for the delay in replying about the patch itself.

I think if we can mark this api as Stability: unstable then it should be ok to land this, with the understanding that it's pretty likely we'll aim to replace this api later in favour of a more complete solution. Just for reference I am working on a wip/output branch with improved kms support though I don't want to block any mutter work on that work at this stage.

Another minor tweak would be that instead of taking a GList it would be preferable to take a CoglKmsCrtc **crtcs array and int n_crtcs arguments, since Cogl avoids exposing glib types in the api except for glib specific integration apis, given that cogl supports building standalone without glib.

If you are happy with that I don't mind if you either want to make the tweaks yourself or I can tweak your patch and land it myself.
Comment 10 Giovanni Campagna 2013-08-23 08:06:42 UTC
Created attachment 252814 [details] [review]
kms: add public API to override the default configuration of outputs

Add API to allow complex applications using the KMS backend
to go almost straight to direct configuration (which is not possible
because Cogl needs to be in charge of buffers and FB objects).
Comment 11 Robert Bragg 2013-08-23 13:16:14 UTC
Ok I've pushed your patch to master and cherry picked back to cogl-1.16.

I just squashed in a minor cleanup of the style to not use tabs and not align arguments to be consistent with other code in cogl.

thanks