GNOME Bugzilla – Bug 778776
system-monitor 3.23.90 crashes in Ubuntu
Last modified: 2017-03-27 08:20:41 UTC
gnome-system-monitor 3.23.90-0ubuntu1
Ubuntu 17.04 Alpha
We are still using libgtop2 2.34.2-1. Is that the problem?
gnome-system-monitor crashed when I tried to open the Processes tab. Now it crashes on startup.
Originally reported at https://launchpad.net/ubuntubugs/1665209 .
[Attached stack traces: Trace 237148; Thread 1 (LWP 14185); Trace 237149; Trace 237150]
No, libgtop 2.34.2 should be fine, I doubt that is the reason. Based on the stacktrace it has something to do with cgroups. I will investigate and provide a fix as soon as I can.
Jeremy, are you able to run gnome-system-monitor under valgrind?
(In reply to Benoît Dejean from comment #4)
> Jeremy, are you able to run gnome-system-monitor within valgrind ?
Could you point me to step-by-step directions for using valgrind?
I'm trying Ubuntu alpha right now. The Ubuntu button loops forever. I cannot start any application, not even from the file browser. The only way in is Alt-F2. <personal rant/>
Looks like some kind of corruption: the lower 32 bits are zeroed, but not the upper half.
(In reply to Benoît Dejean from comment #6)
> I'm trying Ubuntu alpha right now. The ubuntu button loops forever.
That's Unity. I think you'd be happier with Ubuntu GNOME.
http://cdimage.ubuntu.com/ubuntu-gnome/daily-live/current/
So, I've installed that Ubuntu GNOME daily (SO much better). After running
apt-get build-dep gnome-system-monitor && apt-get source gnome-system-monitor && ./configure && make && ./src/gnome-system-monitor
it doesn't segfault at all. The cgroups are correctly displayed. Is it possible that something is wrong in the build chain?
But the shipped gnome-system-monitor does segfault with the same stack trace.
Adding @Artem Vorotnikov to take a peek, maybe he has an idea.
Ubuntu builds with systemd and wnck, and with -fPIE -fstack-protector-strong. What I can already tell is that disabling wnck bypasses the problem. With wnck, the corruption is visible, and I can reproduce it even on my Debian system.
I don't think the corruption even comes from cgroups.cpp. Replacing get_process_cgroup_info(info) with a simple logging statement for the value of info.cgroup-name shows a lot of invalid pointers. Tested with gcc and clang; reproducible at -O0 with -fstack-protector-strong.
I've bisected the bug; the first bad commit with wnck is the cgroup reform.
https://git.gnome.org/browse/gnome-system-monitor/commit/?id=2c1c564401ad42f24a73a162210bba0e0623fb1f
But it still doesn't make sense.
It seems crazy, but Artem dropped the #include <config.h>. Adding it back at the top of cgroups.cpp seems to fix this. No more segfault and no more valgrind errors. Robert, can you please test?
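One way a missing #include <config.h> could produce this kind of corruption is a class whose layout depends on a macro that config.h defines. The sketch below is a hypothetical illustration, not the real gnome-system-monitor code. The ProcInfoLike types, the icon_handle member, and the assumption that a member is guarded by a feature macro are all invented here to model how two translation units could disagree about where a std::string member lives.

```cpp
#include <cstddef>
#include <string>

// Hypothetical illustration only: these types are NOT the real ProcInfo.
// They model one class as seen by two translation units, one that included
// config.h (so a feature macro is defined) and one that did not.

namespace with_config {
// View of the class when the (assumed) feature macro is defined.
struct ProcInfoLike {
    void* icon_handle = nullptr;  // stands in for a member guarded by #ifdef
    std::string cgroup_name;
};
}  // namespace with_config

namespace without_config {
// View of the same class when config.h was not included.
struct ProcInfoLike {
    std::string cgroup_name;
};
}  // namespace without_config

// Byte offset of cgroup_name inside an object of type T.
template <typename T>
std::ptrdiff_t cgroup_name_offset() {
    T obj;
    const char* base = reinterpret_cast<const char*>(&obj);
    const char* member = reinterpret_cast<const char*>(&obj.cgroup_name);
    return member - base;
}

// True when the two views disagree about where cgroup_name lives.
inline bool layouts_disagree() {
    return cgroup_name_offset<with_config::ProcInfoLike>() !=
           cgroup_name_offset<without_config::ProcInfoLike>();
}
```

If code compiled with one view writes cgroup_name into an object allocated by code compiled with the other, the write lands at the wrong offset. That would be consistent with the invalid pointers reported above, although this thread does not confirm it as the cause.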
(In reply to Benoît Dejean from comment #14)
> It seems crazy but Artem dropped the #include <config.h>. Adding it back at
> the top of cgroups.cpp seems to fix this. No more segfault and no more
> valgrind errors. Robert, can you please test ?
I have tried adding #include <config.h> to cgroups.cpp, but compiling with --enable-wnck --enable-systemd and -O0 -fstack-protector-strong still gives the SIGSEGV. As you have also noticed, this has something to do with the --enable-wnck option, as with --disable-wnck the app runs fine.
(In reply to Robert Roth from comment #15)
> (In reply to Benoît Dejean from comment #14)
> > It seems crazy but Artem dropped the #include <config.h>. Adding it back at
> > the top of cgroups.cpp seems to fix this. No more segfault and no more
> > valgrind errors. Robert, can you please test ?
>
> I have tried adding #include <config.h> to cgroups.cpp, but compiling with
> --enable-wnck --enable-systemd and -O0 -fstack-protector-strong still gives
> the sigsegv. As you have also notice, this has something to do with the
> --enable-wnck option, as with --disable-wnck the app runs fine.
It may be time to re-evaluate the need for WNCK code paths since these seem to be neither tested nor used.
I think we are hit by the dual ABI for std::string. It would match what I see. I've implemented my own canary, and the 4 bytes before string objects are obliterated. I now get segfaults when ProcInfo constructs the first of its string members.
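To check the dual-ABI hypothesis, each translation unit can report which libstdc++ string ABI it was compiled against. A minimal sketch follows, assuming libstdc++, where the _GLIBCXX_USE_CXX11_ABI macro is defined once any standard header is included. The GuardedString type is a hypothetical canary in the spirit of the one described above, not the code actually used in this investigation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Reports which libstdc++ std::string ABI this translation unit uses:
//   1 -> new C++11 ABI, 0 -> old ABI, -1 -> macro absent (not libstdc++).
constexpr int libstdcxx_string_abi() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
    return _GLIBCXX_USE_CXX11_ABI;
#else
    return -1;
#endif
}

// Size of std::string as this translation unit sees it. Objects built with
// different ABI settings would report different sizes here.
constexpr std::size_t string_object_size() {
    return sizeof(std::string);
}

// Hypothetical canary: a known 4-byte pattern placed just before a string,
// so that a stray write in front of the string can be detected.
struct GuardedString {
    static constexpr std::uint32_t kCanary = 0xC0FFEE42u;

    std::uint32_t canary = kCanary;
    std::string value;

    // True while the guard word still holds its original pattern.
    bool intact() const { return canary == kCanary; }
};
```

Logging libstdcxx_string_abi() and string_object_size() from both the application and any suspect object files would show whether they disagree. Calling intact() before and after a suspect call narrows down where the guard word gets overwritten.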
(In reply to Artem Vorotnikov from comment #16)
> It may be time to re-evaluate the need for WNCK code paths since these seem
> to be neither tested nor used.
I'm still puzzled by the fact that #include <config.h> makes the problem disappear on my system.
Because wnck has always been unstable and because we are also able to get icons from other sources, for the time being and until we understand that heap corruption, we could either disable it or just rename --enable-wnck to --enable-broken-wnck?
(In reply to Benoît Dejean from comment #18)
> Because wnck as always been unstable and because we are also able to get
> icons from other sources, for the time being and until we understand that
> heap corruption, we could either disable it or only rename the --enable-wnck
> to --enable-broken-wnck ?
Also worth noting: gtkmm has no bindings for WNCK. Thus, if possible, we should cut this code as unportable C.
I've pushed a commit to rename the configure option. https://git.gnome.org/browse/gnome-system-monitor/commit/?id=ffe597e58402b0dfc59bb33ebd92d2ca0dd359de
I rebuilt gnome-system-monitor for Ubuntu without the --enable-wnck option to get this app working again there. Debian has been using --enable-wnck for 3 years:
https://tracker.debian.org/media/packages/g/gnome-system-monitor/changelog-3.22.2-1
I think we could lower the importance of this bug because we have a workaround.
This works now with --enable-wnck with gnome-system-monitor 3.24.0 on Ubuntu, but I've removed that option from the Ubuntu packaging on your recommendation anyway.