After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 787668 - Crash on monitor config migration
Crash on monitor config migration
Status: RESOLVED FIXED
Product: mutter
Classification: Core
Component: general
unspecified
Other Linux
: Normal normal
: ---
Assigned To: mutter-maint
mutter-maint
Depends on:
Blocks:
 
 
Reported: 2017-09-14 11:30 UTC by Carlos Garnacho
Modified: 2017-10-04 14:33 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
backends: Add some wiggle room for refresh rate comparisons (1.46 KB, patch)
2017-09-14 11:31 UTC, Carlos Garnacho
committed Details | Review
backends: Fallback on scale=1 if no matching monitor mode can be found (912 bytes, patch)
2017-09-14 11:31 UTC, Carlos Garnacho
reviewed Details | Review
monitor-config-migration: Discard if configured mode is missing (1.39 KB, patch)
2017-09-25 22:01 UTC, Jonas Ådahl
committed Details | Review
monitor-tests: Test config migration with refresh rate wiggle room (14.21 KB, patch)
2017-09-25 22:01 UTC, Jonas Ådahl
committed Details | Review

Description Carlos Garnacho 2017-09-14 11:30:28 UTC
After upgrading an install to f27, gnome-shell started crashing on startup.

(gdb) bt
  • #0 meta_monitor_mode_get_resolution
    at backends/meta-monitor.c line 1569
  • #1 calculate_scale
    at backends/meta-monitor.c line 1372
  • #2 meta_monitor_calculate_mode_scale
    at backends/meta-monitor.c line 1434
  • #3 meta_finish_monitors_config_migration
    at backends/meta-monitor-config-migration.c line 1178
  • #4 meta_monitor_config_manager_get_stored
    at backends/meta-monitor-config-manager.c line 385
  • #5 meta_monitor_manager_ensure_configured
    at backends/meta-monitor-manager.c line 496
  • #6 meta_monitor_manager_kms_ensure_initial_config
    at backends/native/meta-monitor-manager-kms.c line 1253
  • #7 meta_monitor_manager_ensure_initial_config
    at backends/meta-monitor-manager.c line 424
  • #8 meta_monitor_manager_constructed
    at backends/meta-monitor-manager.c line 723
  • #9 g_object_new_internal
    at /home/carlos/Source/gnome/glib/gobject/gobject.c line 1821
  • #10 g_object_new_with_properties
    at /home/carlos/Source/gnome/glib/gobject/gobject.c line 1949
  • #11 g_object_new
    at /home/carlos/Source/gnome/glib/gobject/gobject.c line 1621
  • #12 meta_backend_native_create_monitor_manager
    at backends/native/meta-backend-native.c line 405
  • #13 create_monitor_manager
    at backends/meta-backend.c line 364
  • #14 meta_backend_real_post_init
    at backends/meta-backend.c line 444
  • #15 meta_backend_native_post_init
    at backends/native/meta-backend-native.c line 385
  • #16 meta_backend_post_init
    at backends/meta-backend.c line 632
  • #17 meta_clutter_init
    at backends/meta-backend.c line 1034
  • #18 meta_init
    at core/main.c line 569
  • #19 main
    at ../../../../Source/gnome/gnome-shell/src/main.c line 435

As the crash happens during monitor config migration and it never finishes, every next login will still try to perform the migration and fail at it.

AFAICT the crash seems mostly down to slightly differing refresh_rate values. I'm attaching a patch to add some threshold to refresh rate comparisons that seems to fix the case here. However I'm also attaching a patch to bail out safely if no MetaMonitorModeSpec matches.
Comment 1 Carlos Garnacho 2017-09-14 11:31:37 UTC
Created attachment 359780 [details] [review]
backends: Add some wiggle room for refresh rate comparisons

We have not enough control over the sources of the refresh rate
float variable to make == comparisons reliable, add some room
when comparing these.
Comment 2 Carlos Garnacho 2017-09-14 11:31:51 UTC
Created attachment 359781 [details] [review]
backends: Fallback on scale=1 if no matching monitor mode can be found

It is possible to receive NULL here, so make it sure we handle it safely.
Comment 3 Jonas Ådahl 2017-09-14 12:34:23 UTC
Review of attachment 359780 [details] [review]:

lgtm.
Comment 4 Jonas Ådahl 2017-09-14 12:37:08 UTC
Review of attachment 359781 [details] [review]:

I don't think the caller should pass NULL to this function. I guess this is the migration code that doesn't have a matching mode, since it might try to migrate a config to different monitor, and that code should handle not finding the mode.
Comment 5 Carlos Garnacho 2017-09-14 13:52:17 UTC
(In reply to Jonas Ådahl from comment #4)
> Review of attachment 359781 [details] [review] [review]:
> 
> I don't think the caller should pass NULL to this function. I guess this is
> the migration code that doesn't have a matching mode, since it might try to
> migrate a config to different monitor, and that code should handle not
> finding the mode.

Right, but I wonder what's the best way to proceed here, remove the affected MetaLogicalMonitorConfig altogether?
Comment 6 Carlos Garnacho 2017-09-14 13:56:22 UTC
Comment on attachment 359780 [details] [review]
backends: Add some wiggle room for refresh rate comparisons

Attachment 359780 [details] pushed as 743e8cc - backends: Add some wiggle room for refresh rate comparisons
Comment 7 Jonas Ådahl 2017-09-15 03:22:31 UTC
(In reply to Carlos Garnacho from comment #5)
> (In reply to Jonas Ådahl from comment #4)
> > Review of attachment 359781 [details] [review] [review] [review]:
> > 
> > I don't think the caller should pass NULL to this function. I guess this is
> > the migration code that doesn't have a matching mode, since it might try to
> > migrate a config to different monitor, and that code should handle not
> > finding the mode.
> 
> Right, but I wonder what's the best way to proceed here, remove the affected
> MetaLogicalMonitorConfig altogether?

Yea, discard it (together with the whole MetaMonitorsConfig) with a warning. The alternative is to keep it unmigrated and maybe try when an "identical" (same connector/vendor/product/serial) monitor that is actually different is connected in the future. Not sure that's better though, to keep a bunch of possibly never applicable half migrated configurations left.
Comment 8 Jonas Ådahl 2017-09-25 22:01:23 UTC
Created attachment 360389 [details] [review]
monitor-config-migration: Discard if configured mode is missing

If a configuration key matched a current system state, but no monitor
mode was found (for example because of an incorrect refresh rate),
discard it while logging a warning.
Comment 9 Jonas Ådahl 2017-09-25 22:01:28 UTC
Created attachment 360390 [details] [review]
monitor-tests: Test config migration with refresh rate wiggle room

Check that we finish configurations within range, and discard the ones
out of range.
Comment 10 Florian Müllner 2017-10-03 22:26:13 UTC
Review of attachment 360389 [details] [review]:

Sure
Comment 11 Florian Müllner 2017-10-03 22:32:13 UTC
Review of attachment 360390 [details] [review]:

LGTM

::: src/tests/monitor-unit-tests.c
@@ +4947,3 @@
+                                         old_config_file,
+                                         &error))
+    g_error ("Failed to migrated config: %s", error->message);

Typo: s|migrated|migrate|

@@ +5091,3 @@
+                                         old_config_file,
+                                         &error))
+    g_error ("Failed to migrated config: %s", error->message);

Dto.
Comment 12 Jonas Ådahl 2017-10-04 14:33:19 UTC
Attachment 360389 [details] pushed as 43eeb00 - monitor-config-migration: Discard if configured mode is missing
Attachment 360390 [details] pushed as 120db06 - monitor-tests: Test config migration with refresh rate wiggle room