GNOME Bugzilla – Bug 659254
reduce number of calls to malloc to mitigate fragmentation
Last modified: 2021-07-05 14:08:02 UTC
See attached.
Created attachment 196731 [details] [review] memory: Add display of elapsed seconds since a garbage collection This is useful information for debugging.
Created attachment 196732 [details] [review] global: Initiate *full* GC at idle While I've been trying to make the GC kick in more often, I've decided it's a better tradeoff to aggressively GC at "leisure", for multiple reasons. We can and should revisit this at a later time, but basically: * The shell doesn't generate *that* much JS data - garbage collection is very fast here. * Long periods without GC mean we're not calling free() when we could, which in turn makes heap fragmentation much worse. * Ensuring the GC runs at idle makes it much less likely we'll take a random large GC hit in the middle of an animation.
Created attachment 196733 [details] [review] theme: Minor memory usage optimization While profiling with Massif, about 2% of our allocated space is in over 1000 calls to g_strdup(). Because there are many StThemeNode instances allocated, we can reduce the calls to malloc() here.
(In reply to comment #2) > Created an attachment (id=196732) [details] [review] > global: Initiate *full* GC at idle I'd definitely like people to test this and report any performance regressions. By regression here I basically mean, does it "feel" slower for you? We don't currently have any perf metrics for this.
(In reply to comment #4) > (In reply to comment #2) > > Created an attachment (id=196732) [details] [review] [details] [review] > > global: Initiate *full* GC at idle > > I'd definitely like people to test this and report any performance regressions. > By regression here I basically mean, does it "feel" slower for you? > > We don't currently have any perf metrics for this. Which probably means that it is not suitable for 3.1 at this point, right ?
(In reply to comment #5) > > Which probably means that it is not suitable for 3.1 at this point, right ? I think we should really consider it. I'd very much like people to try these patches and see.
(In reply to comment #5) > (In reply to comment #4) > > We don't currently have any perf metrics for this. > > Which probably means that it is not suitable for 3.1 at this point, right ? Oh and if you mean "do we know if this is worth it?" - I can say it makes the RSS grow noticeably more slowly on my system.
Review of attachment 196731 [details] [review]: Makes sense and looks good to me.
Review of attachment 196732 [details] [review]: From quick testing it does not seem to cause any noticeable slowdown, but I have not run it for long enough to be sure about that. Code looks fine .. do we still want to keep shell_global_maybe_gc() around? (will be unused after this patch).
(In reply to comment #9) > Review of attachment 196732 [details] [review]: > > From quick testing it does not seem to cause any noticeable slowdown, but I > have not run it for long enough to be sure about that. > Code looks fine .. do we still want to keep shell_global_maybe_gc() around? > (will be unused after this patch). Yeah, you can still run it from looking glass I guess. (We have a lot of API duplication and wrapping because the gjs API isn't introspectable, and neither is the Spidermonkey one).
Attachment 196731 [details] pushed as 36bfe8c - memory: Add display of elapsed seconds since a garbage collection Attachment 196732 [details] pushed as f54b82f - global: Initiate *full* GC at idle
Leaving this bug open for the theme patch (not critical, can wait for 3.3 to open).
I guess gjs req should be bumped to 1.29.18 in configure.ac
Review of attachment 196733 [details] [review]: If this isn't a noticeable speedup, I'm not sure we want this. It's fairly complex, I'd say due to the generic nature and varargs. A simple cleanup would be to make it fixed with four strings.
(In reply to comment #14) > Review of attachment 196733 [details] [review]: > > If this isn't a noticeable speedup, I'm not sure we want this. It's fairly > complex, I'd say due to the generic nature and varargs. A simple cleanup would > be to make it fixed with four strings. I can't claim this particular fix is large in itself. But I'm not going to go around micro-optimizing the entire source tree. What I'm looking for is places where malloc() is called frequently, e.g. inside a loop. Particularly if these malloc() calls are long-lived and especially if they interleave with short-lived calls. See: http://blog.pavlov.net/2007/11/10/memory-fragmentation/ for why fragmentation can be a serious problem. This qualifies as an instance of that. How much does this one patch help? From quick instrumentation, I have ~450 theme nodes peak after entering the overview. That means we save 450*3 or 1350 calls to g_strdup() - not a small number.
(In reply to comment #15) > ( > This qualifies as an instance of that. How much does this one patch help? > From quick instrumentation, I have ~450 theme nodes peak after entering the > overview. That means we save 450*3 or 1350 calls to g_strdup() - not a small > number. Actually we allocate ~950 theme nodes, so this really saves 2850 calls to g_strdup() total.
So what are we going to do with this?
How many *different* strings do we have here? Would it be possible to use referenced strings (like Rhythmbox does), or use GQuarks (do they get disposed? do they need to get disposed?) ?
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/ Thank you for your understanding and your help.