GNOME Bugzilla – Bug 724402
memory corruption in mutter keybinding name causes segfault
Last modified: 2014-12-29 09:37:52 UTC
After a change to shellshape (my gnome-shell extension), I started getting segfaults in gnome-shell sometimes when using shellshape keybindings (it adds a bunch of global keyboard shortcuts to control window tiling). I would occasionally see errors like: (gnome-shell:24457): Gjs-WARNING **: JS ERROR: InternalError: buffer too small WindowManager<._filterKeybinding@/usr/share/gnome-shell/js/ui/windowManager.js:930 And: (gnome-shell:9979): Gjs-WARNING **: JS ERROR: TypeError: malformed UTF-8 character sequence at offset 0 WindowManager<._filterKeybinding@/usr/share/gnome-shell/js/ui/windowManager.js:930 After enabling debug messaging for the keybinding mutter topic, I saw the following output when I reproduced the crash: (gnome-shell:9979): Gjs-WARNING **: JS ERROR: TypeError: malformed UTF-8 character sequence at offset 0 WindowManager<._filterKeybinding@/usr/share/gnome-shell/js/ui/windowManager.js:930 KEYBINDINGS: Binding keycode 0x34 mask 0x40 matches event 0x34 state 0x40 KEYBINDINGS: Running handler for �\ (the name above was printed as "<unprintable>/<unprintable>" in my terminal originally, which is why I assume this is a memory issue - perhaps the keybinding structure is being kept around after its name is freed). The change that I made in shellshape to uncover this issue has to do with how the extension is enabled / disabled. Previously, I cleaned up everything nicely on disable() and basically had a fresh start on enable(). But because screen lock disables and re-enables all extensions[0], I've had to keep much of the extension's state around in memory even after a disable(). I'm still removing all keybindings on disable() and re-adding them on enable(), so I can't see why this would cause this bug. But it does, quite reproducably. The code which causes the issue is at [1] - the commit that uncovered the problem is [2]. I'm afraid I was unable to create a smaller test case that reproduces the problem. [0] https://github.com/gfxmonk/shellshape/issues/107 [1] https://github.com/gfxmonk/shellshape/tree/lock-persist [2] https://github.com/gfxmonk/shellshape/commit/232216d55332997e301d8dd05f19687890cfd546
Oh, also: the crash only ever seems to happen after returning from a screen lock. Disabling and re-enabling the extension (with gnome-tweak-tool) continues to work fine.
Can you at least run the shell under valgrind and see where you start getting memory errors?
I'm afraid not - I tried, but it doesn't get so far as drawing the screen once (i.e no windows decorations ever appear) when running under valgrind - it gets an error about an unknown instruction. I've attached the valgrind log in case that's any use.
Created attachment 270329 [details] valgrind log
(In reply to comment #3) > I'm afraid not - I tried, but it doesn't get so far as drawing the screen once > (i.e no windows decorations ever appear) when running under valgrind - it gets > an error about an unknown instruction. I've attached the valgrind log in case > that's any use. Mh, you can probably try to run with GJS_DISABLE_JIT=1, which will disable code generation.
Created attachment 270334 [details] valgrind log
(In reply to comment #5) > Mh, you can probably try to run with GJS_DISABLE_JIT=1, which will disable code > generation. Thanks, that worked. I've updated valgrind.log with a successful run. I couldn't get it to crash, presumably because valgrind handles memory differently. I did get memory errors reported in `change_binding_keygrabs` though, which sounds relevant. Hopefully someone more familiar with the code and/or valgrind can make better sense of this than I can...
Created attachment 270336 [details] [review] keybindings: fix invalid read after a keybinding is removed The handler pointer is dangling in MetaKeyBinding until rebuild_key_binding_table() is run, so we can't dereference it. Because we only need the flags at ungrab time, store a copy in the MetaKeyBinding structure.
This is what I see from valgrind as invalid read. I don't see how that could cause a segfault, though...
Review of attachment 270336 [details] [review]: Indeed, makes sense. I'm re-working keybindings.c a bit for wayland right now, but this is self-contained enough that I can easily rebase.
Attachment 270336 [details] pushed as 682d6f9 - keybindings: fix invalid read after a keybinding is removed
Hi, Thanks for the patch Giovanni. I applied & tested it today, and you're right, it wasn't the cause of the segfault (i.e I am still getting the crash). I tried a number of times to reproduce the issue under valgrind with this patch, but couldn't, so I'm afraid I still can't provide any additional details. I can provide information on how to install the broken branch of shellshape, if that helps others reproduce the issue?
Are you still seeing this? I do remember fixing a memory corruption in keybindings.c over the summer.
Thanks for the reminder Jasper, I've had this code re-enabled in shellshape for ~6 months now[0], and haven't seen any of these crashes. So it definitely sounds like you've fixed it - thanks! [0]: https://github.com/gfxmonk/shellshape/issues/107#issuecomment-46961165