GNOME Bugzilla – Bug 524830
process list uses too much CPU
Last modified: 2018-06-20 12:21:22 UTC
As reported downstream: https://bugs.launchpad.net/ubuntu/+source/gnome-system-monitor/+bug/93847:

"Opening top and gnome-system-monitor simultaneously (in Feisty), I can observe that top takes < 1% CPU, g-s-m takes 10-30% CPU. This isn't a misreporting issue unless it's at a lower level: the two report similar results. The fact that top takes much less is also notable, in that it indicates to me this is probably fixable. The CPU panel applet seems to reflect the same thing, with nearly zero usage except for huge spikes very consistently every three seconds, which stop immediately when I close g-s-m."

This has been a problem for a few versions, and was originally reported on 2.18. Offhand, I wonder aloud if this might be related to GTK+ ListView.
Yup, if we are talking about the process list: tree views are not really designed to be frequently updated. Does the CPU usage drop to a more acceptable level when showing "All processes" instead of "My processes"? With or without dependencies?
I consistently get:

All processes: 8% (8% with dependencies)
My processes: 6% (7-8% with dependencies)
Active processes: 0%

This is on an X2 5000+. The CPU usage takes the form of spiking to 100% of one core for a brief period (100 ms or less) every update period, which I guess is updating the tree view. The figures are according to gsm's own output. It used to take more when I was on an older CPU, more like the 10-30% reported in comment #0. top takes 0-1% according to its own output.
I am having this problem also. I think it is related to the new smooth-scrolling Cairo graphs on the Resources tab.
This is a pretty big issue since it essentially makes g-s-m absolutely pointless for monitoring CPU usage. If you can't rely on it reporting CPU usage information, it shouldn't have the feature. It's most certainly due to very poorly optimized timeout/rendering/clipping/etc with the fancy Cairo graphs.
That's not about cairo, because we're talking about the process list. There is/was already a bug about cairo being awfully slow with some/many graphics cards (while it runs well on some/many low-end/old GPUs). Last time I ran sysprof on gsm, the GTK tree view was eating 75% of the CPU time: it's not meant to be constantly updated, and render+sort takes ages.
It looked as though it might be related to the graphs from comment #3. If there's a more appropriate bug for the graphing, I'll comment there. In all, it's related stuff - see the title of the bug.
Looking at load-graph.cpp, right off the bat one thing I see that would help is to avoid redrawing the entire graph on each timeout. In load_graph_draw, gtk_widget_queue_draw is called which in turn will trigger an expose event on the widget with a single damage rectangle consisting of the entire widget allocation. If damage can be computed ahead of time based on the graph points, it could be beneficial to call gtk_widget_queue_draw_area instead. The expose handler should then be updated to iterate over the damage rectangles, clip the cairo context to those regions, and render only what is damaged. For instance, if there's been no CPU spike above 10%, then there's no reason to render the 100%-10% region of the graph on each iteration. Smartly subdividing the plot into damage regions can help a /lot/.
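The damage computation described above can be sketched as follows. This is a minimal illustration, not the actual load-graph.cpp code; the function name and the normalized-sample representation are assumptions for the sake of the example:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sketch: given the newest column of normalized samples
// (0.0..1.0, one per curve), compute the height in pixels of the band
// that actually changed. Anything above the tallest new value is
// untouched background and need not be repainted.
int damage_height(const std::vector<double>& new_samples, int graph_height)
{
    double peak = 0.0;
    for (double v : new_samples)
        peak = std::max(peak, v);
    // Damaged band spans from y = graph_height * (1 - peak) down to the
    // baseline; round to the nearest pixel.
    return static_cast<int>(peak * graph_height + 0.5);
}
```

The result could then feed a call like gtk_widget_queue_draw_area(widget, x, height - damage, width, damage) instead of queueing the full allocation.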
Curious - does the graph grab CPU data from the process tree model?
No. You're in the wrong bug. See http://bugzilla.gnome.org/show_bug.cgi?id=507797.
Ok, well my comment #7 is still quite relevant to improving performance of the graph, so perhaps I'll reopen bug#507797? If the graph does not depend on the process tree model, why does it crash if I switch to the resources tab after disabling the process tab in code?
Oh, and thanks for your blog entry, that's a very positive and friendly look at my work. Not everyone gets a job at some big Linux company. I've been all alone for years working on system-monitor in my spare time; you can't just wander in, scream "FAIL", and file on the wrong bug report.
(In reply to comment #10) > Ok, well my comment #7 is still quite relevant to improving performance of the > graph, so perhaps I'll reopen bug#507797? > > If the graph does not depend on the process tree model, why does it crash if I > switch to the resources tab after disabling the process tab in code? > Open a new bug and please send a stacktrace.
Hey, I'm not crapping on your work. I really appreciate gnome-system-monitor and the work you've tirelessly put into it. I'm trying to help here. But I do quite like the humor in the fact that the tool to monitor system load can't be trusted because it's consuming all the system load. That is pretty amusing, and I think many others would agree. When I originally commented on this bug report, the title was simply something like, "gnome-system-monitor uses too much CPU" - it was quite relevant. For what it's worth, I'm looking into this on my spare time as well because I like g-s-m, it's important to GNOME so it shouldn't be misrepresenting, and I'm just interested in fixing it.
Filed bug #550708 for rendering improvement ideas.
IMO the solution for this is to use gdk_window_scroll() whenever new data points need to be rendered. This will copy the existing content of the window using a screen-to-screen blit and invalidate only the area that actually needs to be drawn because it has not been drawn before. Then you just need to make sure that the expose event handler does only redraw the invalidated area.
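The scroll-and-invalidate idea can be sketched like this (illustrative only: the `Rect` type and `newly_exposed_strip` function are hypothetical, standing in for the rectangle you would invalidate after the gdk_window_scroll blit):

```cpp
// After shifting the existing pixels left by `dx` with a
// screen-to-screen blit, only a strip of width `dx` at the right edge
// contains content that has never been drawn. That strip is the only
// area the expose handler should have to repaint.
struct Rect { int x, y, width, height; };

Rect newly_exposed_strip(int graph_width, int graph_height, int dx)
{
    Rect r;
    r.x = graph_width - dx;  // right edge that just scrolled into view
    r.y = 0;
    r.width = dx;
    r.height = graph_height;
    return r;
}
```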
Sven: see bug #550708 since Benoît doesn't want graph related tidbits on this bug. Using scroll would only get you part of the damage calculation for free though (along the x-axis). The hard part is computing the regions in chunks along the plot itself to minimize rendering on the y-axis. Am thinking Riemann sums.
I'd agree, just from a guess based on my experience with GTK, that the commenter blaming the TreeView for this is 100% spot on. TreeView is horrendously slow.
The Banshee developers wrote their own tree view widget because of some weaknesses of the GTK one.
Please, everyone, keep tree view bashing on tree view bugs.
Writing a new tree view is not the solution here ;)
If the tree view is eating the CPU, then it should be simple to fix inside gnome-system-monitor, because typically for the system monitor there are only one or two lines in the tree view changing once a second, and that will take no time at all. So if the tree model is eating lots of CPU, then it's because something is being done like setting all the rows from "0%" to "0%" on each iteration.
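The point above — never push an unchanged value into the model — can be sketched as a small guard in front of the model update (names are hypothetical, not gnome-system-monitor's actual code):

```cpp
#include <string>
#include <unordered_map>

// Cache the last text written per row, keyed by pid. A row stuck at
// "0%" then never triggers a model update, so the tree view has no
// re-sort or redraw work to do for it.
class RowCache {
public:
    // Returns true when the model should actually be updated.
    bool needs_update(int pid, const std::string& cpu_text) {
        auto it = last_.find(pid);
        if (it != last_.end() && it->second == cpu_text)
            return false;          // same text as last tick: skip
        last_[pid] = cpu_text;
        return true;
    }
private:
    std::unordered_map<int, std::string> last_;
};
```

In the real update loop the guard would wrap the gtk_tree_store_set (or equivalent) call for each volatile column.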
(In reply to comment #21) > If the tree view is eating the CPU, then it should be simple to fix inside > gnome-system monitor, because typically for the system monitor there are one or > two lines in the tree view changing once a second, and that will take no time > at all. So if the tree model is eating lots of CPU, then it's because something > is being done like setting all the rows from "0%" to "0%" on each iteration. > That's not the case anymore. But you are right, I need to come up with a fresh sysprof profile.
I am guessing that the smoothing of the CPU, network and memory curves may also be contributing significantly to the overall CPU utilization of g-s-m.
(In reply to comment #23) > I am guessing that the smoothing of the CPU, network and memory curves may > also be contributing significantly to the overall CPU utilization of g-s-m. See comment 16 and bug 550708.
Moving component back as per Benoît's comments.
(In reply to comment #7) > Looking at load-graph.cpp, right off the bat one thing I see that would help is > to avoid redrawing the entire graph on each timeout. Originally we did this by caching a surface of the background and one of the graph line, then compositing them together. The problem was that once we did this, a few Intel chips, a few AMD chips and some NVIDIA configurations (which could be fixed in xorg) behaved strangely, suddenly using 100% CPU. See: http://bugzilla.gnome.org/show_bug.cgi?id=507797 IMHO, this is most certainly an xorg/driver bug, and should be fixed there. Somewhere between the driver and pixman would be where I'd start looking if I were that kind of X hacker, but I'm not... I can easily produce test apps which can reproduce this bug; however, the fd.o bug I filed was not very helpful: https://bugs.freedesktop.org/show_bug.cgi?id=15479 On the chipsets which didn't demonstrate that bug we could observe ~2% CPU on a modern system, and I think 10% was reported on older machines. > In load_graph_draw, gtk_widget_queue_draw is called which in turn will trigger > an expose event on the widget with a single damage rectangle consisting of the > entire widget allocation. > > If damage can be computed ahead of time based on the graph points, it could be > beneficial to call gtk_widget_queue_draw_area instead. The expose handler > should then be updated to iterate over the damage rectangles, clip the cairo > context to those regions, and render only what is damaged. > > For instance, if there's been no CPU spike above 10%, then there's no reason to > render the 100%-10% region of the graph on each iteration. Smartly subdividing > the plot into damage regions can help a /lot/. > I'm not sure it'll help that much, but I reckon it would be possible to give this a go. Although, as the entire graph is moving, we need to be aware that if we have near 100% on any of the data points, we'd still have to redraw the entire graph.
So the optimization would only be relevant when the whole graph is sub-100%. I think we could also get some optimizations from pre-rendering the text onto the background and only updating it at relevant intervals (only really required for network load). Pushing the text off to a stored buffer saves on rendering the text; this is yet another minor optimization with near-zero investment for the benefit it provides. I think the only way we can fix the graphs, and all other cairo dodginess, is to fix up all the xorg drivers so they don't have so many problems with certain types of cairo operations. Wrt the treeview performance, chances are it's in the tree model. The treeview itself is quite fast, but the model is dog slow. I've seen bugs like that with it before. There are even performance gains to be had from using a list store rather than a tree store. It's probably time for tree(view|model|store) to get a good overhaul, and maybe be replaced completely...
Sorry to disturb you, but is it normal that even in the System tab (where nothing is updated) it constantly uses 6% of the CPU (measured using "top")? Could there be a bug in another place that eats lots of CPU?
Hello Nicolo, I think what is happening is that you first displayed the Resources tab. Then, even if you switch to another tab, it's still collecting data, just not drawing it. 6% seems a lot: what version are you running?
It seems you are right. If I open it directly in the System tab, it does not eat CPU time. Sorry.
I don't share your experiences, guys - I checked it, and even if I open the Resources tab first, if I then switch to the System tab, top shows me only 1-2% CPU usage. I have a quite fast Core2 Duo P8600, but still - when I switch back to Processes, CPU usage increases to 8-10% (for the Resources tab it's about 13-16%). Switching back to the System tab... still 1-2%. It means that collecting data for the Resources tab is hardly causing any performance issues; it's rather refreshing the list in the Processes tab or drawing charts in the Resources tab that eats CPU... On the other hand, the usual... scrolling (sic!) the Processes list up and down causes CPU usage to boost up even to 30%! Is this a Compiz issue or again something connected to System Monitor? I'd like to thank you people for this app - I really like and appreciate it. :) However, this particular problem seems quite disturbing in a program measuring usage of computer resources.
Created attachment 148041 [details] gsm's processes tab showing gsm taking 47% cpu Screenshot for clarity what this bug is about: The processes tab. The processes tab here is taking 47% cpu on a Core2Duo 1.6GHz machine.
I'm thinking that there may be many people who say "works for me here" who have modern CPUs, such as a Core2 Duo or Core2 Quad or similar/better. Try using a Pentium M or an Atom processor and the pain should be much more obvious, even with Mist (the fastest GTK+ theme engine I know). I'm seeing two things currently: - A 100% CPU spike that lasts exactly 10 seconds after g-s-m's window has started (directly into the Processes tab, no graphs) - Every 30 seconds, like clockwork, a 5-second 100% CPU spike. You can notice it quite easily using the system monitor gnome-panel applet. Let me know if there's any more info I can provide.
*** Bug 599929 has been marked as a duplicate of this bug. ***
*** Bug 653898 has been marked as a duplicate of this bug. ***
[Adding missing "QA Contact" entry so system monitor bug report changes can still be watched via the "Users to watch" list on https://bugzilla.gnome.org/userprefs.cgi?tab=email when the assignee is changed to an individual.]
I have started investigating this. While updating the process list, a lot of CPU time is spent in glibtop_get_proc_map, called from proctable.cpp's get_process_memory_writable method, which get_process_memory_info calls for each process on each update.

get_process_memory_info already contains a plan B for when writable memory information is not available: it uses the process's resident memory size instead of the writable memory size (resident memory is higher in most cases, but doesn't need recalculation on each update based on the process's memory map).

So by using plan B all the time (resident memory size instead of writable memory size for the Memory column), we get a less accurate process memory estimate, but CPU usage on my quad-core Core 2 Duo processor goes down from avg. 55% (ranging from 38% to 67%) to avg. 20% (ranging from 5% to 33%). (These values are the result of approx. 1 minute of g-s-m CPU usage from top, while scrolling through the My Processes list.)

We could also tweak the process list a bit further by only updating the data which is displayed (e.g. if the Waiting Channel column is not visible, it doesn't need to be refreshed).

What do you think? Benoit? Chris? Others?
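The plan B fallback being proposed can be sketched as follows (the struct and field names are illustrative, not the real libgtop types):

```cpp
// When writable memory is unavailable -- or deliberately skipped to
// avoid the costly glibtop_get_proc_map call -- fall back to the
// cheap resident size, which is an over-estimate in most cases but
// needs no per-update walk of the process memory map.
struct ProcMem {
    unsigned long writable;   // needs the full memory map (slow)
    unsigned long resident;   // cheap to read per update
    bool          have_writable;
};

unsigned long memory_estimate(const ProcMem& m, bool allow_slow_path)
{
    if (allow_slow_path && m.have_writable)
        return m.writable;    // accurate but expensive
    return m.resident;        // plan B: cheap approximation
}
```

Passing allow_slow_path = false everywhere is the "use plan B all the time" variant measured above.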
Created attachment 250544 [details] [review] Use a different definition of Memory Currently, the "Memory" column uses the writable memory when available. While it is a very good heuristic for identifying private process memory, it's damn slow to compute, as it requires the whole memory map of the process. This is the single major culprit of the long-standing high CPU usage problem of the process list in System Monitor. A much faster-to-compute approximation of private memory is the Resident Set Size (RSS) after subtracting the shared memory. While it is an underestimate, it's still the best definition, roughly corresponding to the memory which would be freed by killing the process. This is the same value used by other popular system monitoring tools, like KSysGuard. This commit changes the definition of the Memory column from "writable" to "RSS - shared". Writable memory is no longer computed for every single process at every update of the list. Instead, it is just shown in the property dialog of the process. ---------- Robert, this is my proposal as discussed on IRC
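The new definition amounts to a one-line computation (a minimal sketch, clamped at zero as a defensive measure; the function name is made up for illustration):

```cpp
// Memory column under the new definition: RSS minus shared memory.
// Clamp at zero so an odd reading where shared exceeds resident can
// never produce a huge unsigned wrap-around.
unsigned long private_memory(unsigned long resident, unsigned long shared)
{
    return resident > shared ? resident - shared : 0;
}
```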
Attachment 250544 [details] pushed as 5f6251d - Use a different definition of Memory
*** Bug 565149 has been marked as a duplicate of this bug. ***
I'd like to reopen this bug. I've just discovered that the "Memory" calculation has changed... and it is so wrong, because Resident - Shared has just nothing to do with real memory usage. The RSS usually accounts for read-only pages from the various binaries, and the shared part... well, it's just SHM, and normal process pages are just more likely to be shared between similar processes or parent/child. Etc. It's like saying that "top" is accurate for memory usage. The whole point of writable memory was to account for the process's memory usage as the sum of its private memory. That's not perfect, but it's faster than what tools like `smem` do. Instead, I'd like to do the following in order to avoid those slow get_proc_map calls: I'm sure there's a way to 'guess' whether it's worth recalculating the writable memory. For example, between two updates, we could compare the other metrics and, if rss/cputime/pagefaults haven't changed much, just skip the call to get_memory_writable. And as a failsafe, we would also force a refresh from time to time.
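The heuristic proposed above could look something like this (a sketch under stated assumptions: the sample fields, the exact-equality comparison, and the force_every interval are all placeholders for whatever thresholds would actually be tuned):

```cpp
// Skip the expensive writable-memory recalculation when rss, cputime
// and pagefaults are unchanged since the last update, but force a
// refresh every `force_every` ticks as a failsafe so a stale value
// can never persist indefinitely.
struct Sample {
    unsigned long rss;
    unsigned long cputime;
    unsigned long pagefaults;
};

bool should_recalc_writable(const Sample& prev, const Sample& cur,
                            unsigned ticks_since_refresh,
                            unsigned force_every = 10)
{
    if (ticks_since_refresh >= force_every)
        return true;                       // periodic failsafe
    return cur.rss != prev.rss
        || cur.cputime != prev.cputime
        || cur.pagefaults != prev.pagefaults;
}
```

A real implementation would likely compare with a tolerance ("hasn't changed that much") rather than exact equality.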