GNOME Bugzilla – Bug 516408
Problems with libical memory management
Last modified: 2013-09-14 16:50:12 UTC
Libical owns a temporary buffer ring of size 2500 blocks and free's the memory in a round robin fashion. The strings which are provided to the clients are also put inside this buffer ring and so they can be free'ed at any point when its turn comes causing some random crashers. The buffer ring should only hold the data which is used internally in libical and not the memory which is provided to the clients. The buffer ring size also can be reduced.
Created attachment 105217 [details] [review] Fixes the memory managment issues in libical The string returned to the clients from the functions, icalreqstattype_as_string, icalproperty_as_ical_string, icalproperty_get_parameter_as_string, icalproperty_get_value_as_string, icallangbind_property_eval_string, icalperiodtype_as_ical_string, icaltime_as_ical_string, icalvalue_as_ical_string, icalcomponent_as_ical_string, and e_cal_component_get_recurid_as_string would be owned by the clients itself. With this patch unnecessary duping of memory in libical and its clients would be fixed. This patch contains the corresponding changes in EDS for the changes made in libical. I will be making similar changes in evolution as well. Other clients such as clock-applet and external backends should be incorporating these changes.
This is a sexy patch for sure :-) Things like the original icalparameter_as_ical_string blow the mind, great to see it sensibly fixed. Similarly icallangbind_property_eval_string and it's APPENDS / thousands of calls to realloc are mind-boggling ;-) this library needs re-writing: a simple g_sprintf would be faster & clearer IMHO. I love the cleanup, of course there is a small risk here, but the old code is known-bad; leaky, inefficient and unmanageable; the sooner it's in, the sooner we can all be testing it ( with those SL10.3 packages ;-) ) Nice work.
The libical and EDS changes are committed to trunk. Will get the patch for evolution side changes soon.
If this is going to cause existing eds/libical users to leak memory/crash, do you think we could have an #ifdef that will error out on compiling to warn of these changes? It'd be nice to avoid the mess of random wrongness happening with other eds API changes, such as ECalComponentID and EContactPhoto... Of course, if this isn't a valid concern for whatever reason, do please say :)
Chris, it definitely makes sense and i have added the following warning message to libical/ical.h which seems to include other header files, #ifndef HANDLE_LIBICAL_MEMORY #warning "Please ensure that the memory returned by the functions mentioned at http://bugzilla.gnome.org/show_bug.cgi?id=516408#c1 are free'ed" #endif EDS has this variable set to 1 in config.h.
so - to confirm, we're not changing the e-d-s external API contract to do this ? [ clearly there are a number of cute work-arounds for that sort of thing '_dup' variants & static char * buffers and so on but ... ].
michael, the return types for some of the functions at comment#1 have been changed from const char * to char *. Some functions already returned char * but the memory was owned by libical which is not the case with the patch now.
I guess it's possible to do #if E_D_S_VER > 123\nfree (memory);\n#endif goodness in the clients :-)
> I guess it's possible to do #if E_D_S_VER > 123\nfree (memory);\n#endif > goodness in the clients :-) I have to do something like this for SyncEvolution. I would prefer another preprocessor define to check against (MUST_FREE_ICAL_MEMORY?) instead of a version check because the latter makes assumptions about which version of EDS will have the patch. Okay, this is probably not going to change, but still... If another define is not wanted, what is the E_D_S_VER that should be checked against? Michael's "123" looks like a placeholder to me. Where is that define set currently at all? I don't find it in /usr/include/evolution-data-server-1.12/
E_D_S_VER should be >= 2.22.
How will this effect eds-dbus though? I don't know if the version numbers are synchronised...
Chris confirms that my concerns are real: detection of this API change should not be tied to the EDS version but rather to a define which is part of the patch. That way code can be written so that is compatible with any libical which chooses to use the patch.
The regression testing of SyncEvolution + EDS that I run picked up this patch today, as can be seen in the Subversion/compile log here: http://www.estamos.de/runtests/2008-02-20-10-00/head-evolution-trunk-minimal/1-evolutiontrunk/ It then failed with a double free detected by glibc inside the modified libical: http://www.estamos.de/runtests/2008-02-20-10-00/head-evolution-trunk-minimal/6-evolution/ Valgrind points towards libecal as the place where the buffer was already freed: Client::Source::ical20::testChanges==17323== ==17323== Invalid free() / delete / delete[] ==17323== at 0x401D0CA: free (vg_replace_malloc.c:233) ==17323== by 0x4111873: icalmemory_add_tmp_buffer (icalmemory.c:165) ==17323== by 0x41118EE: icalmemory_tmp_buffer (icalmemory.c:196) ==17323== by 0x411337D: make_segment (icalparser.c:208) ==17323== by 0x4113467: icalparser_get_prop_name (icalparser.c:240) ==17323== by 0x4113E37: icalparser_add_line (icalparser.c:664) ==17323== by 0x4113C16: icalparser_parse (icalparser.c:587) ==17323== by 0x4114A04: icalparser_parse_string (icalparser.c:1118) ==17323== by 0x40DD899: e_cal_get_object (e-cal.c:2823) ==17323== by 0x80EC0E3: EvolutionCalendarSource::retrieveItem(std::string const&) (EvolutionCalendarSource.cpp:495) ==17323== by 0x80EDCDA: EvolutionCalendarSource::retrieveItemAsString(std::string const&) (EvolutionCalendarSource.cpp:511) ==17323== by 0x80EDFE5: EvolutionCalendarSource::createItem(std::string const&) (EvolutionCalendarSource.cpp:294) ==17323== Address 0x56EBEF8 is 0 bytes inside a block of size 301 free'd ==17323== at 0x401D0CA: free (vg_replace_malloc.c:233) ==17323== by 0x42DA900: g_free (gmem.c:187) ==17323== by 0x40E0DE8: e_cal_get_component_as_string_internal (e-cal.c:4184) ==17323== by 0x40E0E3D: e_cal_get_component_as_string (e-cal.c:4208) ==17323== by 0x80EDCF6: EvolutionCalendarSource::retrieveItemAsString(std::string const&) (EvolutionCalendarSource.cpp:514) ==17323== by 0x80EDFE5: EvolutionCalendarSource::createItem(std::string const&) (EvolutionCalendarSource.cpp:294) ==17323== by 0x805403F: EvolutionSyncSource::itemList::iterate() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x8054273: EvolutionSyncSource::itemList::start() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x8054309: EvolutionSyncSource::getFirstItem() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x807308A: countAnyItems(SyncSource*, SyncItem* (SyncSource::*)(), SyncItem* (SyncSource::*)()) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x80741A1: countItems(SyncSource*) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x80749D8: LocalTests::testIterateTwice() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== ==17323== Invalid free() / delete / delete[] ==17323== at 0x401D0CA: free (vg_replace_malloc.c:233) ==17323== by 0x4111873: icalmemory_add_tmp_buffer (icalmemory.c:165) ==17323== by 0x41118EE: icalmemory_tmp_buffer (icalmemory.c:196) ==17323== by 0x41229AB: icalvalue_int_as_ical_string (icalvalue.c:761) ==17323== by 0x41235A1: icalvalue_as_ical_string (icalvalue.c:1084) ==17323== by 0x41154DB: icalproperty_as_ical_string (icalproperty.c:495) ==17323== by 0x410CE88: icalcomponent_as_ical_string (icalcomponent.c:322) ==17323== by 0x40E151B: e_cal_modify_object (e-cal.c:4336) ==17323== by 0x80ED959: EvolutionCalendarSource::insertItem(SyncItem&, bool) (EvolutionCalendarSource.cpp:437) ==17323== by 0x80EBFA3: EvolutionCalendarSource::updateItemThrow(SyncItem&) (EvolutionCalendarSource.cpp:454) ==17323== by 0x805B55D: EvolutionSyncSource::processItem(char const*, int (EvolutionSyncSource::*)(SyncItem&), SyncItem&, bool) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x805B645: EvolutionSyncSource::updateItem(SyncItem&) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== Address 0x6004490 is 0 bytes inside a block of size 287 free'd ==17323== at 0x401D0CA: free (vg_replace_malloc.c:233) ==17323== by 0x42DA900: g_free (gmem.c:187) ==17323== by 0x40E15F2: e_cal_modify_object (e-cal.c:4350) ==17323== by 0x80ED959: EvolutionCalendarSource::insertItem(SyncItem&, bool) (EvolutionCalendarSource.cpp:437) ==17323== by 0x80EBF70: EvolutionCalendarSource::addItemThrow(SyncItem&) (EvolutionCalendarSource.cpp:329) ==17323== by 0x805B55D: EvolutionSyncSource::processItem(char const*, int (EvolutionSyncSource::*)(SyncItem&), SyncItem&, bool) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x805B68B: EvolutionSyncSource::addItem(SyncItem&) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x80CFE80: LocalTests::insert(CreateSource, char const*) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x807018F: LocalTests::testLocalDeleteAll() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x80E1FBC: CppUnit::TestCaller<LocalTests>::runTest() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x4051C88: CppUnit::TestCaseMethodFunctor::operator()() const (in /usr/lib/libcppunit-1.12.so.0.0.0) ==17323== by 0x40433ED: CppUnit::DefaultProtector::protect(CppUnit::Functor const&, CppUnit::ProtectorContext const&) (in /usr/lib/libcppunit-1.12.so.0.0.0) ==17323== ==17323== Invalid free() / delete / delete[] ==17323== at 0x401D0CA: free (vg_replace_malloc.c:233) ==17323== by 0x4111873: icalmemory_add_tmp_buffer (icalmemory.c:165) ==17323== by 0x41118EE: icalmemory_tmp_buffer (icalmemory.c:196) ==17323== by 0x41231FF: icalvalue_datetime_as_ical_string (icalvalue.c:992) ==17323== by 0x4123613: icalvalue_as_ical_string (icalvalue.c:1103) ==17323== by 0x41154DB: icalproperty_as_ical_string (icalproperty.c:495) ==17323== by 0x410CE88: icalcomponent_as_ical_string (icalcomponent.c:322) ==17323== by 0x40E0D8B: e_cal_get_component_as_string_internal (e-cal.c:4174) ==17323== by 0x40E0E3D: e_cal_get_component_as_string (e-cal.c:4208) ==17323== by 0x80EDCF6: EvolutionCalendarSource::retrieveItemAsString(std::string const&) (EvolutionCalendarSource.cpp:514) ==17323== by 0x80EDFE5: EvolutionCalendarSource::createItem(std::string const&) (EvolutionCalendarSource.cpp:294) ==17323== by 0x805403F: EvolutionSyncSource::itemList::iterate() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== Address 0x6191FF8 is 0 bytes inside a block of size 287 free'd ==17323== at 0x401D0CA: free (vg_replace_malloc.c:233) ==17323== by 0x42DA900: g_free (gmem.c:187) ==17323== by 0x40E15F2: e_cal_modify_object (e-cal.c:4350) ==17323== by 0x80ED959: EvolutionCalendarSource::insertItem(SyncItem&, bool) (EvolutionCalendarSource.cpp:437) ==17323== by 0x80EBF70: EvolutionCalendarSource::addItemThrow(SyncItem&) (EvolutionCalendarSource.cpp:329) ==17323== by 0x805B55D: EvolutionSyncSource::processItem(char const*, int (EvolutionSyncSource::*)(SyncItem&), SyncItem&, bool) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x805B68B: EvolutionSyncSource::addItem(SyncItem&) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x80CFE80: LocalTests::insert(CreateSource, char const*) (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x807018F: LocalTests::testLocalDeleteAll() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x806FC8E: LocalTests::testLocalUpdate() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x80E1FBC: CppUnit::TestCaller<LocalTests>::runTest() (in /tmp/runtests/head/tmp/build/src/client-test) ==17323== by 0x4051C88: CppUnit::TestCaseMethodFunctor::operator()() const (in /usr/lib/libcppunit-1.12.so.0.0.0)
sure - we should have a simple-to-check macro specific to this. OTOH - lets not detract from the huge win that draining this swap will be :-)
Patrick, From the traces, ==17323== Invalid free() / delete / delete[] ==17323== at 0x401D0CA: free (vg_replace_malloc.c:233) ==17323== by 0x4111873: icalmemory_add_tmp_buffer (icalmemory.c:165) ==17323== by 0x41118EE: icalmemory_tmp_buffer (icalmemory.c:196) ==17323== by 0x41229AB: icalvalue_int_as_ical_string (icalvalue.c:761) ==17323== by 0x41235A1: icalvalue_as_ical_string (icalvalue.c:1084) ==17323== by 0x41154DB: icalproperty_as_ical_string (icalproperty.c:495) ==17323== by 0x410CE88: icalcomponent_as_ical_string (icalcomponent.c:322) I think you do not have the libical changes. I can say this as icalvalue_as_ical_string would not allocate memory in tmp buffer any more. Please verify the same.
Its seems that Paul Smith's Makefile didn't pick up the most recent libical, only the rest of the changes. I think the reason is that the Makefile uses "svn switch http://svn.gnome.org/svn/evolution-data-server/trunk" to update the working copy. This works for the files from the evolution-data-server repository, but libical seems to be a different repository and is not updated by "svn switch". I'll add "svn update" after "svn switch"; that should to get all changes.
Will the soname of libecal be bumped? If I'm not mistaken, then it is still libecal-1.2.so.7, the same as in Evolution 2.10 and 2.12. That has the effect that a client which was linked against the latest libecal and frees memory as expected now will start when combined with Evolution 2.10, but that combination will crash. I wonder if we should go one step beyond a compile-time MUST_FREE_ICAL_MEMORY define: when we also add a globabally visible "BOOL libical_must_free_memory", then a client can link against libecal-1.2.so.7 or dlopen it, then check at runtime via dlsym(RTLD_NEXT, "libical_must_free_memory") != NULL whether the libecal it runs with expects the client to free memory. I'd probably use that in SyncEvolution where I currently ship one binary for Evolution 2.10 and 2.12 and (unless there will be other API changes) 2.14.
Patrick, I'm sure if we have to bump it. We haven't broken the API or ABI (?). Though we still need to have a fix done from the users of libical. Do we consider this an API change? Also, we don't have to bump with libecal versions. libical has its own versions, but that wasn't maintained so well. The last time it was bumped was in mid 2003.
> Patrick, I'm sure if we have to bump it. We haven't broken the API or ABI (?). You are *not* sure, right? > Though we still need to have a fix done from the users of libical. That changes the semantic of the calls and thus the API. The situation that I described (user of the calls adapts to new semantic, then his program crashes when combined with the old libecal) is something which is normally avoided by not allowing the program to run with the old library in the first place; an error about a missing libecal.1.2.so.8 instead of a crash is much easier to understand. Just to be clear, what I suggest is: LIBECAL_CURRENT = 8 LIBECAL_REVISION = 0 LIBECAL_AGE = 1 That way code linked against 1.2.so.7 will still work with the new release. If we add the "BOOL libical_must_free_memory" then I'll avoid the crash by compiling/linking against libecal.1.2.so.7 and then checking at runtime whether I have to free memory, so I don't really need the soname change. But someone else might... > Also, we don't have to bump with libecal versions. libical has its own > versions, but that wasn't maintained so well. Isn't libical contained wholesale inside libecal? Therefore any change inside libical implies changes inside libecal. I agree with Michael, this change is a step in the right direction and I applaud any effort to improve the stability of Evolution. But as with any other change we need to ensure a smooth transition... ;-}
Have added a macro LIBICAL_MEMFIXES defined as 1 in libical/src/libical/ical.h which can be used by the clients and committed the evolution and evolution-exchange changes early morning today. The changes committed were, Libical, http://svn.gnome.org/viewvc/libical?view=revision&revision=635 Evolution, http://svn.gnome.org/viewvc/evolution?view=revision&revision=35081 http://svn.gnome.org/viewvc/evolution?view=revision&revision=35082 http://svn.gnome.org/viewvc/evolution?view=revision&revision=35085 http://svn.gnome.org/viewvc/evolution?view=revision&revision=35086 http://svn.gnome.org/viewvc/evolution?view=revision&revision=35087 Evolution-exchange http://svn.gnome.org/viewvc/evolution-exchange?view=revision&revision=1584 http://svn.gnome.org/viewvc/evolution-exchange?view=revision&revision=1585 http://svn.gnome.org/viewvc/evolution-exchange?view=revision&revision=1586 We would need to free the memory returned by one libecal API though which is not part of libical (e_cal_component_get_recurid_as_string).
Patches in libical caused bug #518744
Marking the bug as fixed as all the changes have been committed.
We are not yet done with the discussion about how the ABI change shall be handled, are we? I would prefer to keep the issue open until that is settled. Also note Milan's comment about bug #518744 - there might be some side-effects of the changes.
Patrick, there was a version bump for libecal in 2.22.1 from 1.12.x http://svn.gnome.org/viewvc/evolution-data-server/trunk/configure.in?r1=8142&r2=8169 But apart from that, I'm not thinking of doing a ABI bump for this. We haven't broken the ABI AFAICS.
> Patrick, there was a version bump for libecal in 2.22.1 from 1.12.x. This is not the soname change that I was talking about: only the minor revision was incremented. This change is not visible to programs using dlopen(), is it? > We haven't broken the ABI AFAICS. I disagree. Compile and link a program against trunk, following the rules about freeing memory as indicated by LIBICAL_MEMFIXES. Then run with libecal from the previous stable release. The program will run and crash. This shows that the ABI has changed because if it hadn't, program and old libecal would be compatible and not crash. I checked SyncEvolution + Evo trunk with valgrind. It seems that I currently do not use any of the functions for which I have to free memory, because otherwise I would expect more leaks. So although I seem to be safe right now, I still think that either an soname change or adding a global variable should be done. I can provide a patch for the later.
srini - this did break the ABI; and we should bump the .so version (AFAICS), but only for the ecal lib - not my precious addressbook libs ;-)
Ok guys. I'll do it as part of the next commit to configure.in.
By this... # Before making a release, the LT_VERSION string should be modified. # The string is of the form C:R:A. # - If interfaces have been changed or added, but binary compatibility has # been preserved, change to C+1:0:A+1 # - If binary compatibility has been broken (eg removed or changed interfaces) # change to C+1:0:0 # - If the interface is the same as the previous version, change to C:R+1:A It is 8:0:0 (Broken ABI) Committed to trunk.
Sri, did you see my comment #19? Have you considered using 8:0:1? I would prefer 8:0:1 because the ABI was extended, not broken: the extension is that clients have to free memory. Clients who were build without knowing that will leak memory, just like they will when getting recompiled without changes. Now one could argue that clients who leak memory shouldn't be allowed to run, but I don't believe in that argument myself. I would prefer the existing binary to run, even if it'll require more memory over time. Using a preprocessor warning instead of an error goes in the same direction. Finally, a client who is aware of the ABI extension could dynamically check for the extension if we added something like the "BOOL libical_must_free_memory" (comment #17). Then it would run just fine with the old and new library revision - but only with 8:0:1, not with 8:0:0.
One more remark: using 8:0:0 means that I have to provide yet another binary release for SyncEvolution even though it is not affected by the new requirement to free memory. Please, consider making my life easier and use 8:0:1.
Created attachment 106430 [details] [review] let's clients determine at runtime whether they have to free memory Here's the patch which adds a global variable "ical_memfixes" which (similar to ICAL_MEMFIXES at compile time) can be checked to determine whether the current library expects clients to free memory. I'll add the test program that I used to check this in a separate attachment.
Created attachment 106431 [details] demonstrates how to check at compile and runtime whether memory must be freed Compile with cc -Wl,--as-needed `pkg-config --cflags libecal-1.2` ecaltest.c -o ecaltest `pkg-config --libs libecal-1.2` I compiled against Evolution 2.10.3. It ran correctly with that and with libecal from trunk (without the soname change) - correctly in the sense that it detected whether it had to free memory and only did that when required. When compiled against trunk it detects the preprocessor symbol and always frees memory. Due to the upcoming soname change it will refuse to run with older libecal, as intended.
Point taken. Committed. Thx.
Patrick - your check-at-runtime patch is somewhat odd. Since this global symbol will not be available in older versions - really using dlsym is a pretty awful hack there. It would be altogether simpler to just push the stupid ical mem-management ring-buffer up into the e-d-s API, and ensure the methods continue working as they used to: and introduce 'fixed' versions with _dup suffixes or something (sane) - then deprecated the old versions [ right ? ]. littering the code with dlsym hacks is pretty shocking :-)
> Patrick - your check-at-runtime patch is somewhat odd. Since this global symbol > will not be available in older versions That's exactly the point: the *presence* of the symbol indicates that freeing memory is required, not its *value*. > - really using dlsym is a pretty awful > hack there. Can't argue with that - you are right, it is a hack ;-) > It would be altogether simpler to just push the stupid ical mem-management > ring-buffer up into the e-d-s API, That won't help programs using libical directly. > and ensure the methods continue working as > they used to: and introduce 'fixed' versions with _dup suffixes or something > (sane) - then deprecated the old versions [ right ? ]. I agree that keeping two versions of all affected calls (inluding the libical ones) would be the better solution. But this is not what Chentill's patch did and I just try to mitigate the effects - I don't have time for more intrusive changes.
So what are we going to do about runtime detection of the new memory handling? My vote still is for applying the attachement (id=106431) unless someone has a better suggestion. Remember, this has to go in in time for the next release to have the desired effect.