After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 363695 - EMemPool, EStrv and EPoolv are obsolete
EMemPool, EStrv and EPoolv are obsolete
Status: RESOLVED OBSOLETE
Product: evolution-data-server
Classification: Platform
Component: Mailer
2.32.x (obsolete)
Other Linux
: High normal
: ---
Assigned To: evolution-mail-maintainers
Evolution QA team
Depends on:
Blocks:
 
 
Reported: 2006-10-20 15:49 UTC by Matthew Barnes
Modified: 2021-05-19 11:04 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Proposed patch for evolution (7.44 KB, patch)
2006-10-20 18:28 UTC, Matthew Barnes
none Details | Review
Proposed patch for evolution-data-server (18.59 KB, patch)
2006-10-20 18:32 UTC, Matthew Barnes
none Details | Review
Revised patch for evolution (7.16 KB, patch)
2006-10-21 17:15 UTC, Matthew Barnes
none Details | Review
Revised patch for evolution (10.90 KB, patch)
2006-10-25 20:08 UTC, Matthew Barnes
none Details | Review
Revised patch for evolution-data-server (45.27 KB, patch)
2006-10-25 21:15 UTC, Matthew Barnes
none Details | Review
Revised patch for evolution-data-server (45.31 KB, patch)
2006-10-26 03:31 UTC, Matthew Barnes
reviewed Details | Review
Revised patch for evolution (10.87 KB, patch)
2007-05-01 15:58 UTC, Matthew Barnes
reviewed Details | Review

Description Matthew Barnes 2006-10-20 15:49:05 UTC
libedataserver provides two data-structures for efficient string storage: EStrv and EPoolv.  In the interest of removing old cruft, I believe these should be dropped or at least deprecated.


EStrv
-----

EStrv provides memory-efficient storage of strings similar to GLib's GStringChunk [1], except the strings can be "unpacked" into a string vector where they can modified freely, and then re-packed into a string "chunk".  The API seems rather clumsy, and it's limited to 254 strings per instance.

EStrv is no longer used by Evolution, Evolution-Data-Server, or Evolution-Exchange.


EPoolv
------

An EPoolv is a vector of strings that live in a global string pool.  The idea is to reduce memory consumption by sharing constant strings that are used frequently (such as email headers).  This is similar to GLib's GQuark [2] except that EPoolv "quarks" (or indexes) are only unique to an EPoolv instance, whereas GQuarks are globally unique.

There is a single case of EPoolv usage in Evolution's message-list.c.  It looks like it's trying to maintain a unique copy of normalized "from", "to", and "subject" strings for each CamelMessageInfo.  I think this could be replaced with a simple g_string_chunk_insert_const() [3], which maintains a unique copy of the string and also stores it efficiently.


[1] http://developer.gnome.org/doc/API/2.0/glib/glib-String-Chunks.html
[2] http://developer.gnome.org/doc/API/2.0/glib/glib-Quarks.html
[3] http://developer.gnome.org/doc/API/2.0/glib/glib-String-Chunks.html#g-string-chunk-insert-const
Comment 1 Matthew Barnes 2006-10-20 18:28:49 UTC
Created attachment 75093 [details] [review]
Proposed patch for evolution

This patch replaces the EPoolv use case in message-list.c with a GStringChunk.

There's a few performance and memory usage trade-offs at work here:

- Each MessageList now has its own GStringChunk instead of a global one
  shared by all.  So strings are not shared as much as they could be, but
  they do get freed along with their MessageList rather than lingering in
  memory until program termination.

- The get_normalised_string() function always normalises the string now,
  instead of trying to fetch a cached copy first.  But the logic is much
  shorter and simpler.

I haven't taken any serious measurements comparing this patch against the EPoolv approach, but I would expect it to be close to a wash.
Comment 2 Matthew Barnes 2006-10-20 18:32:33 UTC
Created attachment 75094 [details] [review]
Proposed patch for evolution-data-server

This patch just removes EStrv and EPoolv from libedataserver.  We can defer this patch if the data structures are to be deprecated but not yet removed.
Comment 3 Jeffrey Stedfast 2006-10-20 19:02:37 UTC
the main point of that code was to reduce processing time needed to sort the message-list, I'm afraid that your new implementation will balloon the processing time a huge amount.

There's no use for a string pool if you're gonna recalculate the collated strings for each comparison.
Comment 4 Matthew Barnes 2006-10-21 17:15:30 UTC
Created attachment 75141 [details] [review]
Revised patch for evolution

Jeff is right, I midjudged the intent of the code and especially the expense of calling g_utf8_collate_key().

Here's a revised patch that brings back the normalised string cache, but as a simple string-to-collation-key mapping.  This will make the cache grow larger during runtime and precludes us from doing garbage collection on it, but the comparison should be faster than the EPoolv version.  Both key and value strings are stored in the local string chunk, which lives and dies with the MessageList instance.

This seems like an acceptable trade-off to me.  Any thoughts Jeff?
Comment 5 Jeffrey Stedfast 2006-10-23 17:30:56 UTC
this patch looks good to me (I haven't tested it, but the logic looks right to me this time)

I'll have to leave it up to the Evo team to accept tho
Comment 6 Matthew Barnes 2006-10-23 19:19:16 UTC
I'm adding EMemPool to my hit list here, since it looks like all current uses of it could easily be replaced by GStringChunk.
Comment 7 Matthew Barnes 2006-10-25 20:08:10 UTC
Created attachment 75406 [details] [review]
Revised patch for evolution

This patch contains all previous changes and also replaces an EMemPool with a GStringChunk.
Comment 8 Matthew Barnes 2006-10-25 21:15:39 UTC
Created attachment 75412 [details] [review]
Revised patch for evolution-data-server

This patch contains all previous changes and also replaces all EMemPool instances with GStringChunk and removes EMemPool from libedataserver.
Comment 9 Philip Van Hoof 2006-10-25 21:49:15 UTC
After updating tinymail's camel-lite with #75412

Thread NaN (LWP 26323)

  • #0 IA__g_log
    at gmessages.c line 516
  • #1 IA__g_return_if_fail_warning
    at gmessages.c line 532
  • #2 IA__g_string_chunk_free
    at gstring.c line 149
  • #3 folder_pull_part
    at camel-mime-parser.c line 1004
  • #4 camel_mime_parser_finalise
    at camel-mime-parser.c line 1414
  • #5 camel_object_unref
    at camel-object.c line 893
  • #6 parse_content
    at camel-multipart-signed.c line 290
  • #7 signed_get_number
    at camel-multipart-signed.c line 410
  • #8 camel_multipart_get_number
    at camel-multipart.c line 348
  • #9 summary_build_content_info_message
    at camel-folder-summary.c line 2403
  • #10 camel_folder_summary_info_new_from_message
    at camel-folder-summary.c line 1097
  • #11 camel_imap_folder_changed
    at camel-imap-folder.c line 2307
  • #12 imap_rescan
    at camel-imap-folder.c line 629
  • #13 camel_imap_folder_selected
    at camel-imap-folder.c line 402
  • #14 imap_refresh_info
    at camel-imap-folder.c line 528
  • #15 disco_refresh_info
    at camel-disco-folder.c line 268
  • #16 camel_folder_refresh_info
    at camel-folder.c line 298
  • #17 tny_camel_folder_refresh_async_thread
    at tny-camel-folder.c line 554
  • #18 g_thread_create_proxy
    at gthread.c line 582
  • #19 start_thread
    from /lib/tls/i686/cmov/libpthread.so.0
  • #20 clone
    from /lib/tls/i686/cmov/libc.so.6
Continuing.

(tinymail:26174): GLib-CRITICAL **: g_string_chunk_free: assertion `chunk != NULL' failed

Breakpoint 1, IA__g_log (log_domain=0xb75981d7 "GLib", log_level=-1218870825, format=0xb75981d7 "GLib") at gmessages.c:516
516     in gmessages.c
(gdb)

Comment 10 Philip Van Hoof 2006-10-25 22:05:18 UTC
It also looks like this (probably caused by whathever is causing the the warning too) introduces a memory-leak of ~200kb per 2,000 headers received (by imap-rescan, for example)
Comment 11 Matthew Barnes 2006-10-26 03:31:31 UTC
Created attachment 75421 [details] [review]
Revised patch for evolution-data-server

Here's a revised patch that fixes the GLib-CRITICAL warning, and hopefully the memory leak that Philip observed.  Philip, can you confirm this?
Comment 12 Philip Van Hoof 2006-10-27 18:29:12 UTC
ps. I haven't yet tested this version. I will try to do this tomorrow.
Comment 13 Philip Van Hoof 2006-10-29 17:48:08 UTC
The warning is gone, but I'm still experiencing the memory leak. Memory consumption when downloading new messages is also something like 200% higher.
Comment 14 Matthew Barnes 2006-11-07 20:36:03 UTC
See Jeff's analysis of the pitfalls of GStringChunk [1] and subsequent discussion on gtk-devel-list starting at [2].  The lesson I'm taking away is that the chunk size needs to be significantly larger than the average size of the strings being stored.

My patch tends to allocate string chunks of only 1024 bytes, a number I didn't give much consideration to.  This is probably causing large amounts of memory to be wasted (as Jeff pointed out), which may account for the growth in memory consumption that Philip is seeing.  Perhaps the chunk size should be more in the neighborhood of 16K or larger.

I don't see anything else in the patch that would be causing a memory leak.

[1] http://mail.gnome.org/archives/evolution-hackers/2006-November/msg00003.html
[2] http://mail.gnome.org/archives/gtk-devel-list/2006-November/msg00015.html
Comment 15 Matthew Barnes 2007-05-01 15:58:05 UTC
Created attachment 87336 [details] [review]
Revised patch for evolution

A flaw in my Evolution patch was the cause of [1] and [2] for Fedora users.
This revision corrects the flaw found by Peter Jones (see [2] for details).

[1] http://bugzilla.gnome.org/show_bug.cgi?id=373713
[2] http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=238497
Comment 16 Philip Van Hoof 2007-05-13 11:54:32 UTC
The latest patch that you have sent in Comment #15 doesn't change Attachment #75421 [details], right? Because I tested 75421 with an unmodified Tinymail, which Attachment #87336 [details] of course wont change, yet I experienced this 'more memory usage'.

So I'm queuing this patch for Tinymail's camel-lite (as it needs more investigation on why this memory increase took place) before I apply it. More memory consumption (either on average or caused by a leak) is not interesting for me :-(. Sorry: not surprisingly is my priority of memory consumption higher than the one for reducing old cruft.

Bookmarked, though.
Comment 17 Matthew Barnes 2007-05-13 21:40:40 UTC
(In reply to comment #16)
> The latest patch that you have sent in Comment #15 doesn't change Attachment
> #75421 [edit], right?

Correct.  The Evolution patch only modifies mail/message-list.c by replacing EMemPool with GStringChunk.  The chunk size needs to be tuned though.  1K chunks are too wasteful, as Jeff pointed out (see comment #14).


> So I'm queuing this patch for Tinymail's camel-lite (as it needs more
> investigation on why this memory increase took place) before I apply it. More
> memory consumption (either on average or caused by a leak) is not interesting
> for me :-(. Sorry: not surprisingly is my priority of memory consumption higher
> than the one for reducing old cruft.

Here again, I'm thinking the string chunk size just needs to be tuned in the Evolution-Data-Server patch.  In some spots in that patch we're allocating chunks of only 256 bytes, so it's no wonder the memory consumption doubled!  My guess is a chunk size of 16K or 32K will at least get the numbers moving in the right direction.
Comment 18 Philip Van Hoof 2007-05-14 20:14:53 UTC
So I hereby invite you to experiment with Tinymail's camel-lite. 

On this page you can get some information on how I measured its memory consumption:

http://tinymail.org/trac/tinymail/wiki/MemoryStats

Although the consumption will differ quite a lot from Evolution's Camel (mostly due to the mmap patch being applied), you will have a much more fine grained environment to perform regression testing on memory consumption, with Tinymail and its camel-lite implementation/ adaptations of the Camel API.
Comment 19 Matthew Barnes 2007-05-14 20:54:57 UTC
Awesome, thanks!  I've been meaning to ask you how you generated those measurements for camel-lite.  I'll try to reproduce what you observed in comment #13 and take it from there.
Comment 20 Philip Van Hoof 2007-10-07 17:10:08 UTC
Bookmarked this bug for synchronizing it with Tinymail's camel-lite
Comment 21 Matthew Barnes 2008-03-11 00:59:28 UTC
Bumping version to a stable release.
Comment 22 Akhil Laddha 2009-09-17 07:30:53 UTC
@ Philip is there any update ?
Comment 23 Philip Van Hoof 2009-09-17 08:06:48 UTC
No updates from my side
Comment 24 Milan Crha 2009-11-27 10:55:41 UTC
Patches doesn't apply to actual sources these day. I guess a little rethink on the bug itself, and definitely an update for patches is necessary.
Comment 25 Matthew Barnes 2009-11-27 14:44:31 UTC
I certainly wouldn't advocate the use of GStringChunk these days, but it would still be nice to see if anything is still using them and deprecate them if not.  (Or move them to Camel if appropriate.  I've been trying to kill Camel's libedataserver dependency for years now.)
Comment 26 Kjartan Maraas 2010-12-16 18:58:10 UTC
Looks like EStrv is the only one left these days.
Comment 27 André Klapper 2021-05-19 11:04:04 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. 
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow
  https://wiki.gnome.org/Community/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/evolution-data-server/-/issues/

Thank you for your understanding and your help.