GNOME Bugzilla – Bug 793031
Decrease memory usage by disabling backend-per-process by default
Last modified: 2018-05-22 15:26:24 UTC
The calendar factory processes are D-Bus activated and are supposed to shut themselves down after 10 seconds if no other process makes a request; however, the processes stay running forever in a GNOME 3.26 session.
Thanks for the bug report. Is the gnome-shell-calendar-server process running in the background? If yes, then it is the cause. You can even try to kill evolution-calendar-factory and gnome-shell-calendar-server will restart it on its own. The other likely user is evolution-alarm-notify. I'm not aware of any other system processes which would keep the factories running.
I am not entirely sure gnome-shell-calendar-server is the culprit (or at least not the only one). I'm running this script to kill it (it takes a few tries because something keeps restarting it):

  while [ 1 ] ; do kill -9 `pgrep gnome-shell-ca` ; ps faxu | grep calendar ; sleep 1 ; done

and the calendar factory keeps running, minutes after gnome-shell-calendar-server is long gone. Moreover, I've been watching with dbus-monitor and I don't see any other D-Bus calls to evolution-calendar-factory, so it clearly doesn't shut itself down after gnome-shell-calendar-server is gone...
In any case, if gnome-shell-calendar-server is at least part of the problem, we need to fix it too (perhaps on a separate bug). gnome-shell-calendar-server has been abandoned for a long time, and I think it needs someone who knows EDS to take care of it.
Also see bug #792779. As I said, the other one can be evolution-alarm-notify. I do not know of any other.

I booted GNOME 3.26 in Fedora 27. Both gnome-shell-calendar-server and evolution-alarm-notify are running after start. When I run:

  $ kill -9 `pidof evolution-alarm-notify`

then the evolution-addressbook-factory-subprocess processes are gone immediately, while the evolution-addressbook-factory itself closes after those 10 seconds. When I run:

  $ kill -9 `pidof gnome-shell-calendar-server`

then it is immediately restarted, which I can recognize by the new PID assigned to it. That is, another part of GNOME makes sure the process is running, possibly the same way gnome-shell-calendar-server makes sure that evolution-calendar-factory is kept running, even after the factory crashes. (Such behaviour of gnome-shell-calendar-server can have unwanted side effects: when something really bad happens to evolution-calendar-factory so that it won't start, gnome-shell-calendar-server will keep retrying in a busy loop.) The only safe way is to move away the /usr/libexec/gnome-shell-calendar-server file and restart the machine, though you lose some functionality of GNOME with that.

When I move away the file, restart the machine and log in, only evolution-alarm-notify is running (that's because it's autostarted with /etc/xdg/autostart/org.gnome.Evolution-alarm-notify.desktop). Killing it kills all the background evolution-*-factory-subprocess processes, and after ten seconds both factories are gone; the only process left is evolution-source-registry.

That means what is left running, and what is not, depends on the desktop environment you use (for example MATE has no mate-calendar-server, and similarly for XFCE) and on your settings (what you left to autostart after login). In both cases, not having the processes running means missing functionality.

(In reply to Alberto Ruiz from comment #3)
> In any case, if gnome-shell-calendar-server is at least part of the problem,
> we need to fix it too (perhaps on a separate bug).
> gnome-shell-calendar-server has been abandoned for a long time, and I think
> it needs someone who knows EDS to take care of it.

I'm not sure what kind of knowledge you mean, but I'm willing to help, of course.
(In reply to Milan Crha from comment #4)
> Also see bug #792779.
>
> As I said, the other one can be evolution-alarm-notify. I do not know of
> any other. I booted GNOME 3.26 in Fedora 27. Both gnome-shell-calendar-server
> and evolution-alarm-notify are running after start.

Indeed! It seems one needs to kill both. I guess this spurs a couple of questions for me:

- Why do we need alarm-notify to run all the time?
- Why do we need alarm-notify to keep the calendar factory awake?

> (In reply to Alberto Ruiz from comment #3)
> > In any case, if gnome-shell-calendar-server is at least part of the problem,
> > we need to fix it too (perhaps on a separate bug).
> > gnome-shell-calendar-server has been abandoned for a long time, and I think
> > it needs someone who knows EDS to take care of it.
>
> I'm not sure what kind of knowledge you mean, but I'm willing to help, of
> course.

I guess what I am asking is whether you could have a look at gnome-shell-calendar-server and see if we can patch it to use polling instead of keeping the calendar factory awake. If we did that on both ends face
(In reply to Alberto Ruiz from comment #5)
> I guess what I am asking is whether you could have a look at
> gnome-shell-calendar-server and see if we can patch it to use polling
> instead of keeping the calendar factory awake. If we did that on both ends
> face

I hit send too soon. If we did that on both ends, we could get rid of the need for the factory and its subprocesses to run all the time, even when no calendars are configured.
There is always some calendar configured: it's On This Computer/Personal. The same applies to Contacts, Calendars, Tasks, Memos and Mails (for mails it's just On This Computer). Without evolution-alarm-notify you get absolutely no reminders of upcoming events/tasks, just like when you close Evolution you will not be notified about new mails afterwards.

Both gnome-shell-calendar-server [1] and evolution-alarm-notify use views, which are so-called live queries against the resources. The view notifies about changes when they happen. These views keep the factories open (strictly speaking it's the calendar itself being opened, not the view, but that's only a detail).

I believe the idea of polling instead of being notified on the fly is not good, because polling means activating the D-Bus services, doing some initial connection, re-doing the query and then closing everything again. You may not notice it with a few almost empty calendars, but I have over 50 calendars configured, which are all opened whenever I select a meeting invitation email. When the work on subprocesses was started, I faced significant CPU load due to the quick open & close of the subprocesses through D-Bus. That led to the conclusion to not have one subprocess per calendar (which would mean that when one calendar subprocess crashes, only that single calendar is dropped from the view); instead we use a subprocess per calendar type, like one for local calendars, one for CalDAV and so on (which means that when, for example, one CalDAV calendar causes a crash, all the other CalDAV calendars disappear as well). That's the price, but not a big deal, when one considers that crashes should not happen.

[1] https://gitlab.gnome.org/GNOME/gnome-shell/blob/master/src/calendar-server/gnome-shell-calendar-server.c#L638
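For illustration, here is a minimal client-side sketch of such a view (live query) using the libecal API. This is not code taken from gnome-shell-calendar-server or evolution-alarm-notify, and the exact signatures may differ between EDS versions; keeping the view started is what keeps the calendar backend, and thus its factory subprocess, alive:

  #include <libecal/libecal.h>

  static void
  objects_added_cb (ECalClientView *view,
                    const GSList   *objects,
                    gpointer        user_data)
  {
      /* Invoked on the fly whenever matching components are added;
       * no polling is involved. */
  }

  /* Assumes an already-connected ECalClient. */
  static ECalClientView *
  watch_calendar (ECalClient *client)
  {
      ECalClientView *view = NULL;
      GError *error = NULL;

      /* "#t" is the query expression which matches every component. */
      if (e_cal_client_get_view_sync (client, "#t", &view, NULL, &error)) {
          g_signal_connect (view, "objects-added",
              G_CALLBACK (objects_added_cb), NULL);
          e_cal_client_view_start (view, &error);
      }

      g_clear_error (&error);

      return view;
  }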
So, I think I understand why each individual piece is behaving the way it does; thanks a lot for the detailed explanation.

Wrt that CPU spike, how bad is it? I want to put it in context with the amount of RAM needed to keep things running all the time. I've been doing heap profiling with valgrind/massif and each calendar factory subprocess consumes around 10-14 MB of heap in my setup (two Google accounts with multiple calendars each). On a system with 512 MB or 1 GB of RAM these are precious resources, while CPU bandwidth is generally available these days, especially considering that most low-resource systems now have multiple cores. If we polled, say, every 5 minutes, wouldn't the cost of keeping that RAM in use all the time outweigh that CPU spike?

I'm just saying that most users will only have one or two calendar accounts with a couple of calendars each at most, and that's if you're an enterprise user; a regular end user will have their personal Google Calendar at most. I am not saying that polling is the solution, just that we need to find the approach with the least impact on the majority of users.

In any case, this concerns me from an architectural POV: basically each client that needs to react when either a) an alarm kicks in or b) a new event/calendar is added needs to keep a service running 100% of the time, and that in turn keeps the calendar factories running. Again, from an architectural POV, it seems to me that gnome-shell-calendar-server fulfills the same functionality as evolution-alarm-notify (in a slightly different context) and also suffers from this problem of having to keep the factories running 100% of the time for very little gain.

I think we should look into how to solve this; here's an idea to overcome polling: could we add a facility in EDS to register "global clients" so that when an event or calendar is added, we let those clients know? A client would be a well-known bus name, D-Bus path and a particular interface (say: org.gnome.evolution.CalendarWorker) that receives new event or calendar data. The idea here is that whenever an EDS client registers a new calendar, or a new event is added, EDS would wake up each registered client through D-Bus activation, removing the need for a polling mechanism. (A rough sketch of the idea follows below.)

I have an additional question here wrt EDS and calendar synchronization: when the calendar factories are running, do they poll their (say Google Calendar) servers to check whether new events have been added and synchronize? If so, how often do they poll?
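To make the idea above a bit more concrete, here is a rough sketch with purely hypothetical names (none of these exist in EDS today) of how EDS could wake such a registered "global client". Sending any message to a well-known bus name which has a D-Bus .service file is enough to start the client on demand:

  #include <gio/gio.h>

  /* Hypothetical helper inside EDS: notify a registered "global client"
   * about a changed calendar source. The bus name, object path, interface
   * and method name are made up for this sketch. */
  static void
  notify_calendar_worker (GDBusConnection *connection,
                          const gchar     *source_uid)
  {
      g_dbus_connection_call (connection,
          "org.gnome.evolution.CalendarWorker",  /* hypothetical well-known name */
          "/org/gnome/evolution/CalendarWorker", /* hypothetical object path */
          "org.gnome.evolution.CalendarWorker",  /* hypothetical interface */
          "CalendarChanged",                     /* hypothetical method */
          g_variant_new ("(s)", source_uid),
          NULL,                      /* no reply type expected */
          G_DBUS_CALL_FLAGS_NONE,
          -1,                        /* default timeout */
          NULL, NULL, NULL);         /* fire and forget */
  }

D-Bus activation would start the client if it is not running, so neither side would need to stay resident or poll.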
Created attachment 367760 [details]
peek of CPU

This shows a comparison between shared processes (the current state) and no shared processes (one calendar is one subprocess), when I selected a meeting invitation mail and Evolution was searching for it in my calendars. The place where the peak was significant is marked. You can see that with process sharing it is almost 4 times quicker than without it.
The 5 minute poll is not ideal. Imagine a meeting reminder 10 minutes before the meeting starts, which would be postponed by up to 5 minutes due to the poll interval; people would think it's a bug. I do not agree with your "very little gain", but I agree it depends on the point of view.

(In reply to Alberto Ruiz from comment #8)
> Could we add a facility in EDS to register "global clients" so that when an
> event or calendar is added, we let those clients know?

No calendar opened means no idea what the calendar content is, nor whether anything has been added/modified/removed. With respect to the calendars as such, there's the evolution-source-registry serving for that: anyone can connect to it and listen for "source-added", "source-changed", "source-removed" and so on (a minimal sketch follows after this comment). I also added a wrapper on top of it, named ESourceRegistryWatcher, which makes certain operations on source changes simpler (it's used on the evolution-alarm-notify side, among others).

> I have an additional question here wrt EDS and calendar synchronization:
> when the calendar factories are running, do they poll their (say Google
> Calendar) servers to check whether new events have been added and
> synchronize? If so, how often do they poll?

Most of the calendars do poll. evolution-ews can truly listen for changes (since some version of Exchange server there is a possibility to do that), but protocols like CalDAV (which is also used for Google) do not have such "streaming" functionality, thus they poll. The interval is set by the user, in the calendar properties.

----------------------------------------------------------------------------

If I got it right, your main concern is memory usage. Calendars used to keep their whole content in memory before I added the EBook/CalMetaBackend base class, which is used by most of the calendar backends except the local one, which I didn't touch, because that would mean having the events saved twice on disk: once in the cache and a second time as the .ics files themselves.

With respect to the overall memory usage of the processes, I'm wondering how much of the memory is used by other libraries, not directly by the code of evolution-data-server. It would be interesting to try to figure that out somehow. EDS can be compiled to have only one subprocess, which may or may not help. I'm not sure which values I should look at with respect to memory usage; the main thing might be resident memory, right?

I checked one thing: I restarted all the background processes and opened Evolution in the Calendar view, which has several different calendars opened. The resident memory is:

  source-registry       126.9M
  addressbook-factory    61.6M
  calendar-factory       99.8M
  cf-subprocess         118.5M
  cf-subprocess         107.1M
  cf-subprocess         108.4M
  cf-subprocess         108.7M
  cf-subprocess         102.4M
  alarm-notify           69.8M

I didn't have any evolution-addressbook-factory-subprocess started yet. After this I disabled backend-per-process, which causes only one factory subprocess to be running, and it gave me these numbers:

  source-registry       118.7M
  addressbook-factory    61.8M
  calendar-factory       99.4M
  cf-subprocess         131.9M
  alarm-notify           69.2M

I didn't expect to get almost the same numbers, even though the conditions are quite close, but I understand this as the cf-subprocess having most of its memory assigned not by EDS itself, but by something else, either system overhead or the libraries or anything else.
That means, from my point of view, that you should not use backend-per-process on devices with low memory, and the rest can stay as is on all sides, with the benefit of not losing any functionality for the users. The downside is that once one calendar crashes the subprocess, all the calendars die with it, but that should be fine, because the processes should not crash.
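Here is the minimal sketch mentioned above: listening for calendar source additions through ESourceRegistry without opening any calendar. It is only an illustration; check the installed libedataserver headers for the exact API of your EDS version:

  #include <libedataserver/libedataserver.h>

  static void
  source_added_cb (ESourceRegistry *registry,
                   ESource         *source,
                   gpointer         user_data)
  {
      /* React only to calendar sources. */
      if (e_source_has_extension (source, E_SOURCE_EXTENSION_CALENDAR))
          g_print ("calendar source added: %s\n", e_source_get_uid (source));
  }

  static ESourceRegistry *
  watch_sources (void)
  {
      GError *error = NULL;
      ESourceRegistry *registry = e_source_registry_new_sync (NULL, &error);

      if (registry) {
          g_signal_connect (registry, "source-added",
              G_CALLBACK (source_added_cb), NULL);
          /* "source-changed" and "source-removed" work the same way. */
      }

      g_clear_error (&error);

      /* A GMainLoop must be running for the signals to be delivered. */
      return registry;
  }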
I made a semi-related change in evolution-data-server [1], which adds a --backend-per-process argument to the evolution-addressbook-factory and evolution-calendar-factory processes. This can be used to override the option chosen at evolution-data-server compile time. Use --backend-per-process=0 to disable it, --backend-per-process=1 to enable it, and anything else (or not set) to respect the compile-time option. The idea is that you can change the D-Bus .service files to add this option, without needing to recompile evolution-data-server.

[1] https://git.gnome.org/browse/evolution-data-server/commit/?id=1bb5cf732
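For illustration, the override could look roughly like this; the path, the exact file name and the versioned bus name depend on the distribution and the EDS release, so treat it as a sketch rather than the literal file touched by the commit above:

  # In the calendar factory's D-Bus .service file, e.g. under
  # /usr/share/dbus-1/services/ (the exact file name varies), extend the Exec line:
  Exec=/usr/libexec/evolution-calendar-factory --backend-per-process=0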
(In reply to Milan Crha from comment #11)
> I made a semi-related change in evolution-data-server [1], which adds a
> --backend-per-process argument to the evolution-addressbook-factory and
> evolution-calendar-factory processes. This can be used to override the
> option chosen at evolution-data-server compile time. Use
> --backend-per-process=0 to disable it, --backend-per-process=1 to enable it,
> and anything else (or not set) to respect the compile-time option. The idea
> is that you can change the D-Bus .service files to add this option, without
> needing to recompile evolution-data-server.
>
> [1] https://git.gnome.org/browse/evolution-data-server/commit/?id=1bb5cf732

Thanks! This helps. Though I'm a bit hesitant to say "if you have less RAM, be prepared for a less resilient setup" ;-)
(In reply to Milan Crha from comment #9)
> Created attachment 367760 [details]
> peek of CPU
>
> This shows a comparison between shared processes (the current state) and no
> shared processes (one calendar is one subprocess), when I selected a meeting
> invitation mail and Evolution was searching for it in my calendars. The
> place where the peak was significant is marked. You can see that with
> process sharing it is almost 4 times quicker than without it.

Do we know for certain that the CPU spike has more to do with the D-Bus side of things, or is this more EDS related? Have we profiled this in the past?
(In reply to Alberto Ruiz from comment #13)
> Do we know for certain that the CPU spike has more to do with the D-Bus side
> of things, or is this more EDS related? Have we profiled this in the past?

No, I did not profile it, mostly because I wasn't sure how to do that. I usually use sysprof, but I didn't try it here. Any advice?
Nice: (re)moving /usr/libexec/gnome-shell-calendar-server frees up 75M of RAM on my x86_64 machine. On machines with 1G of RAM that is quite a significant amount of total RAM (and a huge percentage of free RAM).

I think one solution here (from my pov) would be to ensure that gnome-shell-calendar-server does not even get started for sessions where no calendar data source is configured.
Created attachment 367812 [details]
Massif output for gnome-shell-calendar-server

How trustworthy is the RSS? This is the massif report I get from running it manually under valgrind; it shows less than 1MB in the long run. Which one is the right figure?
(In reply to Alberto Ruiz from comment #16)
> Created attachment 367812 [details]
> Massif output for gnome-shell-calendar-server
>
> How trustworthy is the RSS? This is the massif report I get from running it
> manually under valgrind; it shows less than 1MB in the long run. Which one
> is the right figure?

Sorry, less than 3MB.
Milan has done a build of EDS that allows running all calendar backends in a single factory subprocess via a command-line argument (--backend-per-process=0). I'm leaving the build here for reference: https://koji.fedoraproject.org/koji/taskinfo?taskID=24649409
Created attachment 367868 [details]
profile data

I have profiled the data both from a heap usage and from a CPU usage point of view (see the sysprof and massif output). Each factory subprocess has a heap overhead of ~10MB.

But perhaps the most hideous problem is the CPU one. The sysprof profile data is a bit hard to read, so I wondered whether the CPU overhead was kernel driven and therefore Milan was right to think this was probably a D-Bus issue.

Oh. My. God. Why is there SO MUCH DATA going through the wire? This is a pretty terrible architecture. Everything goes through the wire here. Several times. I ran dbus-monitor and the amount of data for a bunch of calendars in two Google accounts is obscene: 1530 method calls! D-Bus is certainly not intended to be used like this. I think we really need to look at the basics of how EDS and the EDS backend subprocesses operate.

Note that this is profile data with evo spawning one process per calendar. The patched build you sent me doesn't seem to prevent subprocesses from being spawned even though I passed the -b 0 argument.
(In reply to Hans de Goede from comment #15)
> I think one solution here (from my pov) would be to ensure that
> gnome-shell-calendar-server does not even get started for sessions where no
> calendar data source is configured.

There is always On This Computer/Personal configured, for each of mail store, book, calendar, task list and memo list. The gnome-shell-calendar-server uses calendars and tasks. (Some part of) gnome-shell keeps the gnome-shell-calendar-server running, as has been shown above.
(In reply to Alberto Ruiz from comment #19)
> I have profiled the data both from a heap usage and from a CPU usage point
> of view (see the sysprof and massif output).

What tool do you use to decipher those files, please?

> Oh. My. God. Why is there SO MUCH DATA going through the wire? This is a
> pretty terrible architecture.

I do not think so. It highly depends on what it had been doing exactly. Each Evolution process connects to evolution-source-registry, which holds the list of known sources (accounts/books/calendars/...), but it also does some fancy things like providing OAuth2 tokens, refreshing them and talking to GOA. The call flow is roughly like this (a client-side sketch follows below):

1) the client talks to evolution-addressbook/calendar-factory to get an EBook/CalClient
2) the factory runs the subprocess (or talks to an existing one)
3) the subprocess talks to evolution-source-registry, then it opens the source it had been asked to open
4) the subprocess tells the factory about the success (and the new D-Bus path (or address, or whatever it's called) for the backend)
5) the factory returns this to the client
6) the client talks directly to the backend (thus to the subprocess)

> Note that this is profile data with evo spawning one process per calendar.
> The patched build you sent me doesn't seem to prevent subprocesses from
> being spawned even though I passed the -b 0 argument.

Works for me. Did you run it as:

  $ /usr/libexec/evolution-calendar-factory -b 0 -w

The "-w" makes sure it stays alive until some client connects to it, then it stops once it is not being used. Without that, the factory closes itself after ~10 seconds, because it is not used. See --help for other options.
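As a reference for step 1), here is a minimal client-side sketch of obtaining an ECalClient; the wait_for_connected_seconds parameter and the exact signature may vary between EDS versions:

  #include <libecal/libecal.h>

  static ECalClient *
  connect_calendar (ESource *source)
  {
      GError *error = NULL;

      /* This single call drives steps 1-6 internally: the factory is
       * D-Bus activated, the (sub)process opens the source, and the
       * returned client then talks to the backend directly.
       * 30 is the number of seconds to wait for the backend to connect. */
      EClient *client = e_cal_client_connect_sync (source,
          E_CAL_CLIENT_SOURCE_TYPE_EVENTS, 30, NULL, &error);

      if (!client) {
          g_clear_error (&error);
          return NULL;
      }

      return E_CAL_CLIENT (client);
  }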
As discussed on IRC, the nature of EWS and other calendar providers means that we can't really shut everything down. To mitigate, I think we should follow this strategy:

- Having fewer processes would help alleviate the heap consumption (which I've measured at ~10MB per process). Is the patch for the "-b 0" solution in master already?
- Another aspect that would greatly reduce resource consumption in this model is reducing the need to pass data through the bus. This means adding Direct Read Access to all possible calendar backends. We should probably create another bug to track that effort.
- It would be interesting to analyze the resource usage of the factory subprocess to understand what we are using those 10MB for, and whether it is needed all the time, or at all.
(In reply to Alberto Ruiz from comment #22)
> - Having fewer processes would help alleviate the heap consumption (which
> I've measured at ~10MB per process). Is the patch for the "-b 0" solution
> in master already?

See comment #11; yes, it is.
(In reply to Alberto Ruiz from comment #19)
> Created attachment 367868 [details]
> profile data

I briefly looked at it, and one of the biggest heap users belongs to libical: icalproperty_new_rrule takes about 1.7MB on the heap. Looking at the place where it is called, it's part of the attempt to preload the built-in timezones, because libical could cause a use-after-free when the array of timezones was freed due to reallocation of the array. I'm not sure whether that is still an issue, but the preload might be good overall, thus this will be addressed by disabling backend-per-process by default (if we agree on this).
(In reply to Milan Crha from comment #24)
> I briefly looked at it, and one of the biggest heap users belongs to
> libical: icalproperty_new_rrule takes about 1.7MB on the heap. Looking at
> the place where it is called, it's part of the attempt to preload the
> built-in timezones, because libical could cause a use-after-free when the
> array of timezones was freed due to reallocation of the array. I'm not sure
> whether that is still an issue, but the preload might be good overall, thus
> this will be addressed by disabling backend-per-process by default (if we
> agree on this).

So, what you are saying here is that we address this by disabling backend-per-process: in that case, what we're doing is avoiding this allocation multiple times because we have fewer subprocesses, right? Sounds like a nice first approach. It would be good if we could reduce that usage anyway, but I'm not sure whether what you're saying is that we can't avoid it. What is that timezone data used for during the lifetime of the factory? Can we free everything and allocate it again on demand when we need it back?
(In reply to Alberto Ruiz from comment #25)
> So, what you are saying here is that we address this by disabling
> backend-per-process: in that case, what we're doing is avoiding this
> allocation multiple times because we have fewer subprocesses, right?

Basically yes. I've been playing with systemtap a bit yesterday and I see that each Evolution process creates its own ESourceRegistry, which in turn loads every defined ESource into memory (transferred over D-Bus). This ESourceRegistry causes, apart from the D-Bus transfer issues, the next biggest use of memory, at least with respect to GObject-s being instantiated.

I want to change the disabled backend-per-process mode to not run a subprocess at all, but keep the backends running in the main factory process, which will save a bit more memory too, not to mention the performance gain and relaxing D-Bus as well. It used to work this way before the work on backend-per-process.

> What is that timezone data used for during the lifetime of the factory?

Well, built-in (provided by the system) time zones.

> Can we free everything and allocate it again on demand when we need it back?

No, due to the way the libical API works with zones, where it expects them to live "forever" (that's a bit of an inaccurate statement on my side, but it's close to impossible to track whether any structure still requires the zone). This preload of zones was done just to work around the issue with reallocation of the zone array while some structures still pointed into the previously allocated zones.
Created attachment 368818 [details] [review]
eds patch for evolution-data-server

Oh, how I hate code duplication... Anyway, this makes the calendar/book backends run in the calendar/book factory, instead of running a subprocess, thus saving a bit more memory too. It's a pretty invasive change, thus I'm postponing it for 3.30, so there will be some time to test it and find eventual issues. I plan to commit it for 3.29.1, shortly after git master branches for 3.28.
I made one related change for 3.28 too: I disabled backend-per-process by default, thus distributions which do not override this option in their build scripts will pick up the change automatically. I also changed how the subprocess reports its '--factory' argument (which is only for debugging purposes, to know which subprocess serves which sources) to show "all" when backend-per-process is disabled. That's just a nitpick, which can be reverted before applying the above change (comment #27).

Created commit 2b058c94b in eds master (3.27.92+)
Sounds like we're making good progress; great stuff, Milan. I have two other things that I'd like to get your input on:

1) The ical memory usage: I'd assume that the timezone data is read-only and exactly the same for all libical users/processes? I am wondering if we could contribute something to libical upstream to serialize the data on disk and just mmap the serialized data. That way, whatever is shared across processes would be cached in the memory hierarchy just once, instead of consuming additional heap per process. Just an idea to explore. (A minimal sketch of the mapping part follows below.)

2) I've identified that a lot of the memory usage comes from _asn1_copy_structure allocating functions, which seem to come from gnutls certificate handling. For example, in the caldav1 massif file 3.9MB out of 10MB comes from gnutls stuff, which seems like the biggest culprit here. I'd like to understand why gnutls needs so much heap data; not sure if you have any clue as to why.
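Here is the minimal sketch of the mapping part of point 1), assuming some hypothetical pre-serialized timezone blob on disk (no such file exists in libical today). Read-only mappings of the same file are shared between processes through the page cache, so the data would be paged in only once:

  #include <glib.h>

  /* Map a (hypothetical) pre-serialized, read-only timezone blob. */
  static gconstpointer
  map_timezone_blob (const gchar *path, /* hypothetical, e.g. "/usr/share/libical/zones.bin" */
                     gsize       *out_len)
  {
      GError *error = NULL;
      GMappedFile *mapped = g_mapped_file_new (path, FALSE /* read-only */, &error);

      if (!mapped) {
          g_clear_error (&error);
          return NULL;
      }

      *out_len = g_mapped_file_get_length (mapped);

      /* Kept mapped for the process lifetime; call g_mapped_file_unref()
       * when the data is no longer needed. */
      return g_mapped_file_get_contents (mapped);
  }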
(In reply to Alberto Ruiz from comment #29)
> 1) The ical memory usage: I'd assume that the timezone data is read-only and
> exactly the same for all libical users/processes?

For the list of built-in timezones, yes; that is derived from the information provided by tzdata.

> I am wondering if we could contribute something to libical upstream to
> serialize the data on disk and just mmap the serialized data.

That would mean having it twice on the disk, which I do not think they would do, but I can be wrong. The data is also supposed to be updated whenever tzdata changes. And mmap is not available everywhere, as far as I know, while libical is a cross-platform project. Better to ask them directly, on libical-devel@lists.infradead.org or in their GitHub project. My personal experience is that their response takes some time.

> 2) I've identified that a lot of the memory usage comes from
> _asn1_copy_structure allocating functions, which seem to come from gnutls
> certificate handling.

No idea, I'm sorry; gio/gnutls internals are outside of my knowledge. If I had to guess, it's for HTTPS handling and reading the available system issuer certificates into memory, but it's really just a blind guess. You should ask them directly too.
The attached patch has been merged:

Created commit 7ca206e0c in eds master (3.29.1+)
I forgot about bug #793727 before committing the above change, thus I added a workaround for the GLib bug in commit 1bdbb4cd2. I hope we get the GLib bug fix into the sources by 2.58.0.