Bug 788764 - mutter:ERROR:backends/meta-monitor-manager.c:2267:meta_monitor_manager_get_logical_monitor_from_number: assertion failed: ((unsigned int) number < g_list_length (manager->logical_monitors)
Status: RESOLVED FIXED
Product: mutter
Classification: Core
Component: general
Version: 3.26.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: mutter-maint
QA Contact: mutter-maint
Duplicates: 789040 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2017-10-10 10:18 UTC by Georg Wicherski
Modified: 2018-02-15 16:39 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
systemd journal excerpt including backtrace (42.78 KB, text/plain)
2017-10-10 10:18 UTC, Georg Wicherski
  Details
Backtrace (967 bytes, text/plain)
2017-10-11 06:00 UTC, Daniël de Kok
  Details
libmutter.so null pointer dereference backtrace and regs (18.72 KB, text/plain)
2017-10-12 20:55 UTC, Georg Wicherski
  Details
gdb log with JS backtrace (3.11 KB, text/plain)
2017-10-13 10:01 UTC, Georg Wicherski
  Details
Trace of NULL pointer deref in meta_logical_monitor_get_scale (3.82 KB, text/plain)
2017-10-13 10:12 UTC, Daniël de Kok
  Details
window/wayland: Handle resizing when headless (1.72 KB, patch)
2017-10-16 09:08 UTC, Jonas Ådahl
committed Details | Review
ArchLinux PKGBUILD patch to include Jonas' working patch (2.85 KB, patch)
2017-10-16 10:11 UTC, Georg Wicherski
rejected Details | Review
systemd journal of crashing after applying the latest patch (2.77 KB, text/plain)
2017-10-16 15:21 UTC, Georg Wicherski
  Details
backtrace of the last bug from gdb (8.33 KB, text/plain)
2017-10-16 15:24 UTC, Georg Wicherski
  Details
another backtrace of the latest bug (10.60 KB, text/plain)
2017-10-18 12:59 UTC, Georg Wicherski
  Details

Description Georg Wicherski 2017-10-10 10:18:55 UTC
Created attachment 361234 [details]
systemd journal excerpt including backtrace

I have a physical KVM switch that "unplugs" the monitor when switching devices (generally desired behavior for me). After upgrading mutter (3.24.4-1 -> 3.26.1-1) as packaged with ArchLinux, the following assertion is hit when switching screens on a machine that has only the KVM connected:

mutter:ERROR:backends/meta-monitor-manager.c:2267:meta_monitor_manager_get_logical_monitor_from_number: assertion failed: ((unsigned int) number < g_list_length (manager->logical_monitors))

A full excerpt from the systemd journal including a coredump is attached.

As a result, my GNOME session dies every time I switch screens. The assertion is not hit on a laptop (with an internal screen) that is connected to the same KVM. Based on the assertion's text, this seems logical, as it likely only triggers when the only screen is "unplugged".
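
To illustrate why an empty monitor list would trip this check, here is a minimal sketch in C. It is not mutter's code: MonitorNode, list_length and monitor_number_is_valid are hypothetical stand-ins for the GList of logical monitors, g_list_length() and the asserted condition.

```c
#include <stddef.h>

/* Hypothetical stand-in for a GList node holding a logical monitor. */
typedef struct MonitorNode {
    int number;
    struct MonitorNode *next;
} MonitorNode;

/* Counts the nodes in the list, mirroring what g_list_length() reports. */
static unsigned int list_length(const MonitorNode *head) {
    unsigned int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* The asserted condition: a monitor number is valid only if it is
 * smaller than the number of logical monitors. With an empty list
 * the length is 0, so no number (not even 0) can pass. */
static int monitor_number_is_valid(int number, const MonitorNode *monitors) {
    return (unsigned int) number < list_length(monitors);
}
```

With one monitor in the list, number 0 passes the check. With the list empty, as it presumably is once the only screen is unplugged, the same lookup fails.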

Thank you for your time working on open-source software!
Comment 1 Jan Alexander Steffens (heftig) 2017-10-10 10:27:57 UTC
Probably a duplicate of bug 788607.
Comment 2 Daniël de Kok 2017-10-11 06:00:02 UTC
I hit the same assertion when I switch my screen off/on (Dell P2415Q HiDPI, NVIDIA Quadro 2000 using Nouveau, on Wayland). I have applied the diff of commit 6eb7d13 referenced in bug 788607. Unfortunately, this did not resolve the problem.
Comment 3 Daniël de Kok 2017-10-11 06:00:54 UTC
Created attachment 361300 [details]
Backtrace
Comment 4 Georg Wicherski 2017-10-12 20:38:05 UTC
The problem persists with the following ArchLinux packages:

gnome-shell 3.26.1+3+g43ec5280b-1
mutter 3.26.1+7+g41f7a5fdf-1

Based on the git commit in the package versions, I'd assume the fixes from bug 788607 would be included.

However, one overlap I have identified with the comments from bug 788607 is that there is no problem when there are no open windows.
Comment 5 Georg Wicherski 2017-10-12 20:55:22 UTC
Created attachment 361462 [details]
libmutter.so null pointer dereference backtrace and regs

Null pointer dereference that still occurs with the following packages on ArchLinux:

gnome-shell 3.26.1+3+g43ec5280b-1
mutter 3.26.1+7+g41f7a5fdf-1
Comment 6 Jonas Ådahl 2017-10-13 03:58:00 UTC
The trace does look strikingly similar, but where in the JavaScript it originated is harder to determine.

Is there any way you can attach gdb to the process and reproduce it? Then, when you hit the assert, run:

print gjs_dumpstack()

and paste what is printed to stdout/stderr (note that it might end up in the journal).
Comment 7 Georg Wicherski 2017-10-13 10:01:28 UTC
Created attachment 361499 [details]
gdb log with JS backtrace

I've added a SIGSEGV catchpoint to invoke gjs_dumpstack(), but the JS stack trace is empty (also note that there are no JS calls in the GDB stack trace for the null pointer dereference).


Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: == Stack trace for context 0x55c289d52000 ==
Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: (EE)
Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: Fatal server error:
Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: (EE) failed to read Wayland events: Broken pipe
Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: (EE)
Oct 13 11:58:25 hostname org.gnome.Shell.desktop[6236]: glamor: EGL version 1.4 (DRI2):
Comment 8 Daniël de Kok 2017-10-13 10:11:23 UTC
Georg: your second trace (attachment 361462 [details]) looks more similar to mine in bug 788788. I will attach mine here as well, since it contains more symbols than your trace.
Comment 9 Daniël de Kok 2017-10-13 10:12:47 UTC
Created attachment 361502 [details]
Trace of NULL pointer deref in meta_logical_monitor_get_scale
Comment 11 Daniël de Kok 2017-10-15 07:11:42 UTC
The latest changes in the gnome-3-26 branch solve these problems for me. I still have to apply the workaround in bug 788788 though. With mutter from gnome-3-26 and that workaround I can reliably switch my screen off and on without gnome-shell/mutter crashing.
Comment 12 Jonas Ådahl 2017-10-16 09:07:36 UTC
*** Bug 789040 has been marked as a duplicate of this bug. ***
Comment 13 Jonas Ådahl 2017-10-16 09:08:50 UTC
Created attachment 361654 [details] [review]
window/wayland: Handle resizing when headless

We tried to get the geometry scale, which may depend on the main
logical monitor assigned to the window. To avoid dereferencing a NULL
logical monitor when headless, instead assume the geometry scale is 1.
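
The fallback described above can be sketched roughly as follows. This is a simplified illustration, not the committed patch: LogicalMonitor, Window and window_geometry_scale are hypothetical names, and the real mutter structures carry far more state.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a logical monitor. */
typedef struct {
    int scale;
} LogicalMonitor;

/* Hypothetical, simplified stand-in for a window. */
typedef struct {
    LogicalMonitor *main_monitor; /* NULL when running headless */
} Window;

/* Returns the scale of the window's main logical monitor, or 1 when
 * no logical monitor is assigned, instead of dereferencing NULL. */
static int window_geometry_scale(const Window *window) {
    if (window->main_monitor == NULL)
        return 1;
    return window->main_monitor->scale;
}
```

A window attached to a monitor with scale 2 yields 2, while a window with no logical monitor yields 1 rather than crashing.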
Comment 14 Georg Wicherski 2017-10-16 10:11:05 UTC
Created attachment 361657 [details] [review]
ArchLinux PKGBUILD patch to include Jonas' working patch

Thank you, your proposed patch works and solves the issue for me.

I've included the PKGBUILD patch that I used to test this locally for Jan's convenience.
Comment 15 Daniël de Kok 2017-10-16 12:03:45 UTC
Thanks! I will try the patch as well tonight and report back.
Comment 16 Georg Wicherski 2017-10-16 15:21:31 UTC
Created attachment 361681 [details]
systemd journal of crashing after applying the latest patch

I celebrated too soon!

It seems there is another bug, this one triggered when switching the screen back on. The backtrace is not helpful, and the crash does not always reproduce.
Comment 17 Georg Wicherski 2017-10-16 15:24:33 UTC
Created attachment 361682 [details]
backtrace of the last bug from gdb

Some more information to pinpoint the issue.
Comment 18 Jonas Ådahl 2017-10-16 15:43:17 UTC
Could you install debug symbols, get a full backtrace, and attach it? Using gdb, that would be the output of "backtrace full" on the core dump. FWIW, the new trace could also be bug 788627 or bug 788908.
Comment 19 Daniël de Kok 2017-10-16 17:58:09 UTC
For me it works now: I can reliably turn off and on the screen without gnome-shell crashing. This increases usability significantly :).

There are some glitches though. After switching the screen on again, gnome-terminal windows are a quarter of their previous size. Nautilus windows retain the same size but are lo-DPI and zoomed. After switching to another desktop and back, all the windows are normal again.
Comment 20 Anatol Pomozov 2017-10-17 04:17:29 UTC
Jonas, I applied your patch from https://bugzilla.gnome.org/show_bug.cgi?id=788764#c13 and it fixed the crash. Thanks a lot!

But I see another issue, exactly the one described by Daniël. I have a HiDPI monitor with a scaling coefficient of 2. After the monitor resumes, the terminal is scaled down to 1 and Nautilus fonts become blurred. I need to press the "Super" key (show all windows on the current desktop) to return to the correct state.
Comment 21 Georg Wicherski 2017-10-18 12:59:17 UTC
Created attachment 361801 [details]
another backtrace of the latest bug

Same bug, with a different backtrace this time.

I tried building mutter with:
CFLAGS="-flto=no -O0 -ggdb" ./configure --enable-debug …
But I still didn't get any useful debug symbols.
Comment 22 Rui Matos 2017-10-27 13:59:37 UTC
Review of attachment 361654 [details] [review]:

lgtm
Comment 23 Jonas Ådahl 2017-11-10 02:23:25 UTC
Comment on attachment 361654 [details] [review]
window/wayland: Handle resizing when headless

Attachment 361654 [details] pushed as 3572502 - window/wayland: Handle resizing when headless
Comment 24 Jonas Ådahl 2017-11-10 02:27:06 UTC
Pushed to master. Marking the ArchLinux PKGBUILD patch as "rejected", as I can't find any better patch status for it.