Bug 558883 – Evolution 2.24 is terribly slow with large IMAP folders

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 558883 - Evolution 2.24 is terribly slow with large IMAP folders


Summary:	Evolution 2.24 is terribly slow with large IMAP folders


Status:	RESOLVED FIXED

Product:	evolution-data-server
Classification:	Platform
Component:	Mailer
Version:	2.24.x (obsolete)
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	evolution-mail-maintainers
QA Contact:	Evolution QA team

URL:
Whiteboard:	evolution[disk-summary]

Depends on:
Blocks:	543389

Reported:	2008-11-02 03:50 UTC by Ricardo Correia
Modified:	2009-03-14 23:37 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Proposed patch (1.34 KB, patch) 2008-11-03 08:56 UTC, Srinivasa Ragavan	committed	Details \| Review
Fix for build uids detection during expunge than sequential. (1.25 KB, patch) 2008-11-04 07:02 UTC, Srinivasa Ragavan	committed	Details \| Review
GDB debugging of send-receive session with 2 imap accounts. (29.28 KB, text/plain) 2008-11-09 12:37 UTC, Fabio Muzzi		Details
Test patch (992 bytes, patch) 2008-11-10 06:39 UTC, Srinivasa Ragavan	committed	Details \| Review
GDB debugging of send-receive session with 2 imap accounts, with debugging symbols loaded. (100.17 KB, text/plain) 2008-11-13 11:12 UTC, Fabio Muzzi		Details
proposed eds patch (folder content download on startup) (4.99 KB, patch) 2008-11-18 16:39 UTC, Milan Crha	committed	Details \| Review
proposed eex patch (feature remove from here) (6.05 KB, patch) 2008-11-18 17:55 UTC, Milan Crha	committed	Details \| Review
gdb-insert-fsync-loop-backtrace.txt (40.04 KB, text/plain) 2008-11-25 19:19 UTC, Ricardo Correia		Details
Patch to avoid dbsort for IMAP (1.33 KB, patch) 2008-12-04 13:16 UTC, Srinivasa Ragavan	committed	Details \| Review
Patch to avoid dbsort for IMAP (1.33 KB, patch) 2008-12-04 13:33 UTC, Srinivasa Ragavan	rejected	Details \| Review

Description Ricardo Correia 2008-11-02 03:50:31 UTC

Please describe the problem:
Since I've upgraded to Evolution 2.24.1 (due to the Kubuntu 8.04 -> 8.10 upgrade), Evolution has become very slow, to the point of being almost unusable.

This is in part due to one large folder that I have, with hundreds of thousands of messages, which causes this new Evolution version to choke.

I've been able to track down and even fix a few of the problems.

Even though I don't use vfolders, when Evolution starts it seems to load 2 internal vfolders: the Trash and the Junk folders (btw, I don't have spam filtering enabled, either).

When Evolution starts, these 2 vfolders seem to generate queries on the summary database for every folder, which takes a very long time due to the large size of my folders.db file (~640 MB) and due to the fact that its not using any sqlite indexes.

Here's the relevant part of the backtrace when loading the Trash folder:

+ Trace 209075


Again, this can be solved by creating an index on the column "read":
CREATE INDEX i3 ON 'mailing lists/linux-kernel' (read);

Now it takes less than 2 seconds:
Running 'sqlite3 folders.db SELECT COUNT (*) FROM 'mailing lists/linux-kernel' WHERE read = 0':
234043
real 1.20
user 0.17
sys 0.09

So I would again suggest this index to be created by default and also when upgrading Evolution.

I would also suggest Evolution developers to regularly do performance testing of such large folders (hundreds of thousands of messages), which seems to be a big regression in 2.24.x, and which used to work acceptably before the sqlite backend was implemented..

Steps to reproduce:
1. 
2. 
3. 


Actual results:


Expected results:


Does this happen every time?


Other information:

Comment 1 Ricardo Correia 2008-11-02 05:22:15 UTC

Here's another one. When loading Evolution, I'm seeing again lots of IO.

This is the backtrace I've captured:

+ Trace 209077

Thread 2 (Thread 0xb34ffb90 (LWP 20633))

#0 __kernel_vsyscall
#1 read
from /lib/tls/i686/cmov/libpthread.so.0
#2 unixRead
at /usr/include/bits/unistd.h line 45
#3 readDbPage
at sqlite3.c line 10571
#4 pagerAcquire
at sqlite3.c line 27475
#5 sqlite3BtreeGetPage
at sqlite3.c line 27520
#6 getAndInitPage
at sqlite3.c line 31007
#7 moveToChild
at sqlite3.c line 33309
#8 moveToLeftmost
at sqlite3.c line 33440
#9 sqlite3VdbeExec
at sqlite3.c line 45707
#10 sqlite3_step
at sqlite3.c line 41212
#11 sqlite3_exec
at sqlite3.c line 60363
#12 camel_db_select
at camel-db.c line 460
#13 camel_db_get_folder_uids
at camel-db.c line 587
#14 camel_folder_summary_load_from_db
at camel-folder-summary.c line 951
#15 camel_imap_summary_new
at camel-imap-summary.c line 193
#16 camel_imap_folder_new
at camel-imap-folder.c line 246
#17 get_folder
at camel-imap-store.c line 2041
#18 camel_store_get_folder
at camel-store.c line 330
#19 imap_can_refresh_folder
at camel-imap-store.c line 3080
#20 camel_store_can_refresh_folder
at camel-store.c line 1366
#21 get_folders
at mail-send-recv.c line 778
#22 get_folders
at mail-send-recv.c line 782
#23 refresh_folders_exec
at mail-send-recv.c line 809
#24 mail_msg_proxy
at mail-mt.c line 520
#25 g_thread_pool_thread_proxy
at /build/buildd/glib2.0-2.18.2/glib/gthreadpool.c line 265
#26 g_thread_create_proxy
at /build/buildd/glib2.0-2.18.2/glib/gthread.c line 635
#27 start_thread
from /lib/tls/i686/cmov/libpthread.so.0
#28 clone
from /lib/tls/i686/cmov/libc.so.6


So it appears Evolution is trying to get the list of uids of this folder.

The reason this generates lots of IO seems to be that sqlite cannot use the primary key index (i.e. the index on the "uid" column), because Evolution is using a custom collate function for this query.

I really don't understand why Evolution needs to load this information when starting up.
AFAICS, I've disabled all the options that I could find saying "check mail in this folder" or "check mail in all folders", or "check mail in all subscribed folders" (or whatever).

Comment 2 Srinivasa Ragavan 2008-11-03 04:00:09 UTC

I will look into this asap.

Comment 3 Srinivasa Ragavan 2008-11-03 08:32:23 UTC

Ricardo, you are just awesome. Great data and help. 

I realized that the code missesthe index creation on delete/junk/unread columns, which are used to count as well as for the trash/junk functionalities.

On the INSERT, Evolution does it in a transaction. The problem with the fsync due to the ext3 or something it takes time. The fsync was for the journal creation and not the DB. Otherwise, Evo does all insert in a transaction.

On the uid-collate. Evolution need to load the uids in a sorted fashion. Where as it couldn't be sorted inside the db.

Comment 4 Srinivasa Ragavan 2008-11-03 08:56:22 UTC

Created attachment 121851 [details] [review]
Proposed patch

This adds indices for del/junk/unread . Thanks a lot for the bug.

Comment 5 Ricardo Correia 2008-11-03 11:23:43 UTC

(In reply to comment #3)
> On the uid-collate. Evolution need to load the uids in a sorted fashion. Where
> as it couldn't be sorted inside the db.

Ok, I have a question and a suggestion.

1) Why does Evolution need to load the whole list of message uids of *every* folder, when it is starting?

2) Even if this would be necessary, it could still be optimized.

AFAICS, the collate function is being set in camel/providers/imap/camel-imap-summary.c:camel_imap_summary_new(), in this line:

camel_db_set_collate (folder->parent_store->cdb_r, "uid", "imap_uid_sort", (CamelDBCollate)sort_uid_cmp);

So the imap_uid_sort collate function is the sort_uid_cmp() function defined in camel/providers/imap/camel-imap-summary.c.

Now, AFAICS, the only thing the sort_uid_cmp() function does is copy the uids to some static temporary buffers, convert them into unsigned longs and compare them as integers. This strikes me as a bit odd and unnecessary..

Here's how a table is being created:

CREATE TABLE 'lustre/perfbench' (  uid TEXT PRIMARY KEY , flags INTEGER , msg_type INTEGER , read INTEGER , deleted INTEGER , replied INTEGER , important INTEGER , junk INTEGER , attachment INTEGER , msg_security INTEGER , size INTEGER , dsent NUMERIC , dreceived NUMERIC , subject TEXT , mail_from TEXT , mail_to TEXT , mail_cc TEXT , mlist TEXT , followup_flag TEXT , followup_completed_on TEXT , followup_due_by TEXT , part TEXT , labels TEXT , usertags TEXT , cinfo TEXT , bdata TEXT );

Alas, the uid field is indeed a TEXT field instead of an INTEGER field.

So, I wonder, why not use an INTEGER type instead of a TEXT type? Note that in sqlite3, the INTEGER type can store up to 64-bit signed numbers, which is a range larger than the unsigned longs on 32-bit platforms.

By using an INTEGER "uid" field, the sort_uid_cmp collate function could be removed and the "ORDER BY" SQL statement could be used instead. This would allow sqlite3 to use the primary key index to sort and retrieve the uids..

Comment 6 Srinivasa Ragavan 2008-11-03 11:29:07 UTC

The UID  is long int only for IMAP, but for other providers it is a alpha numeric.

It would load the uids of folders, for which messages are downloaded or so. If you don't have 'check in all folders' enabled, then only the currently viewed fodlers's uids are loaded. Its required to compare with servers copy.

Comment 7 Ricardo Correia 2008-11-03 11:41:36 UTC

In that case, I've h(In reply to comment #6)
> It would load the uids of folders, for which messages are downloaded or so. If
> you don't have 'check in all folders' enabled, then only the currently viewed
> fodlers's uids are loaded. Its required to compare with servers copy.

If that is the case, then there must be a bug, because I have the "check mail in all folders" and "check mail in subscribed folders" options disabled on all of my accounts, and the "Always check for new mail in this folder" disabled on all of my folders, but when starting up Evolution still loads the whole list of uids of every folder that I have, including the big linux-kernel folder.

Looking at the backtrace that I've posted, maybe the bug is in camel_store_can_refresh_folder() or imap_can_refresh_folder()?

Comment 8 Ricardo Correia 2008-11-03 12:18:44 UTC

Also another suggestion:

AFAICS all the providers use integer comparison functions to sort the uid fields, except for the maildir provider which uses alphanumeric uids (and therefore, no collate function).

So, why not create the tables with INTEGER uid columns, except when using the maildir provider which would use a TEXT column?

So, in essence, the format of the table creation would be dependent on the provider, but it would be possible to keep all the other code unchanged.

This would allow sqlite to optimize the sorting and retrieval of the uids by using the index.

Note that this is a real performance issue on large folders. My linux-kernel folder, with about 200,000 emails or so, is solely responsible for an increase of ~400 MB of the summary db.

If the SELECT statements issued by Evolution don't use an index, then it means that sqlite will have to do a table scan, which in other words means that it needs to read the full 400 MB of data, which takes a very long time considering that the data is internally (in the DB itself) and externally fragmented (in the filesystem)..

Comment 9 Jeffrey Stedfast 2008-11-03 14:16:20 UTC

I think this would be a good idea. I would have suggested this ages ago if I had known that the INTEGER type was 64bit, I thought it was only 32bit which would have been too short :(

Comment 10 Ricardo Correia 2008-11-03 15:41:47 UTC

Here's another one.

I just tried to close Evolution and for some reason it decided to read the entire summary of my linux-kernel folder, message by message:

+ Trace 209163

Thread 4 (Thread 0xb33ffb90 (LWP 11338))

#0 __kernel_vsyscall
#1 read
from /lib/tls/i686/cmov/libpthread.so.0
#2 unixRead
at /usr/include/bits/unistd.h line 45
#3 readDbPage
at sqlite3.c line 10571
#4 pagerAcquire
at sqlite3.c line 27475
#5 sqlite3BtreeGetPage
at sqlite3.c line 27520
#6 getAndInitPage
at sqlite3.c line 31007
#7 moveToChild
at sqlite3.c line 33309
#8 sqlite3BtreeMoveto
at sqlite3.c line 33705
#9 sqlite3VdbeCursorMoveto
at sqlite3.c line 40102
#10 sqlite3VdbeExec
at sqlite3.c line 43912
#11 sqlite3_step
at sqlite3.c line 41212
#12 sqlite3_exec
at sqlite3.c line 60363
#13 camel_db_select
at camel-db.c line 460
#14 camel_db_read_message_info_record_with_uid
at camel-db.c line 894
#15 message_info_from_uid
at camel-folder-summary.c line 577
#16 camel_folder_summary_uid
at camel-folder-summary.c line 624
#17 camel_folder_summary_index
at camel-folder-summary.c line 398
#18 imap_expunge
at camel-imap-folder.c line 1593
#19 imap_sync
at camel-imap-folder.c line 1433
#20 camel_folder_sync
at camel-folder.c line 303
#21 store_sync
at camel-store.c line 720
#22 camel_store_sync
at camel-store.c line 744
#23 sync_store_exec
at mail-ops.c line 1605
#24 mail_msg_proxy
at mail-mt.c line 520
#25 g_thread_pool_thread_proxy
at /build/buildd/glib2.0-2.18.2/glib/gthreadpool.c line 265
#26 g_thread_create_proxy
at /build/buildd/glib2.0-2.18.2/glib/gthread.c line 635
#27 start_thread
from /lib/tls/i686/cmov/libpthread.so.0
#28 clone
from /lib/tls/i686/cmov/libc.so.6


Needless to say, this is taking a long time..

Comment 11 Fabio Muzzi 2008-11-03 18:05:49 UTC

I have just run into the same issue after upgrading my Ubuntu. I have 4 accounts on a Dovecot imapd server, with 50 folders and 150.000 emails (total). Evolution used to be the fastest imap client I have ever used, and now it's the slowest (not counting Outlook).

I have manually created the indexes for every folder, using a homebrew script, but still Evo is horribly slow, even if not as before creating the indexes.

Since I use server side filtering, I needed the "check for new messages in every folder" option, and it worked incredibly fast in the old version of Evo, on a GB LAN and even over a 512K adsl link. Now I have disabled it for testing purposes, and still it's unusable even on the LAN.

I also see that the unread message count, shown next to the folders in the folder tree, does not get updated as I read the messages.

Comment 12 Srinivasa Ragavan 2008-11-04 03:04:13 UTC

Ricardo, I understand your suggestion, but I think it might be quite a bit of changes to change the uid as dependent on the provider. But I will consider this some time in future. Meanwhile, I will see, if I can do any other optimization to load uid faster I will consider. UID is already indexed afair.

I will look into the other issue you pasted with the trace and the all-folder-uids-loaded.

Comment 13 Srinivasa Ragavan 2008-11-04 07:02:10 UTC

Created attachment 121933 [details] [review]
Fix for build uids detection during expunge than sequential.

Comment 14 Srinivasa Ragavan 2008-11-04 07:02:46 UTC

s/build/bulk

This should fix that issue with trace during quit. Thanks again for a very nice/useful trace.

Comment 15 Srinivasa Ragavan 2008-11-04 07:53:08 UTC

Ricardo, did you press Send/Receive or so? I see that could be a reason for loading all folder uids. Anyways Lemme debug more.

Comment 16 Srinivasa Ragavan 2008-11-04 07:54:45 UTC

Ricardo, did you press Send/Receive or so? I see that could be a reason for loading all folder uids. Anyways Lemme debug more.

Comment 17 Srinivasa Ragavan 2008-11-04 07:56:45 UTC

> I have manually created the indexes for every folder, using a homebrew script,
> but still Evo is horribly slow, even if not as before creating the indexes.

What indices did you create? Try the patch http://bugzilla.gnome.org/attachment.cgi?id=121851 it should be much better

> 
> Since I use server side filtering, I needed the "check for new messages in
> every folder" option, and it worked incredibly fast in the old version of Evo,
> on a GB LAN and even over a 512K adsl link. Now I have disabled it for testing
> purposes, and still it's unusable even on the LAN.
>
With this patch, It should be a better experience. Lemme know how it goes after the patch 
 
> I also see that the unread message count, shown next to the folders in the
> folder tree, does not get updated as I read the messages.

IMAP or Local folders? ANy steps to reproduce it ?

TIA

Comment 18 Fabio Muzzi 2008-11-04 09:46:37 UTC

(In reply to comment #17)

> What indices did you create? Try the patch
> http://bugzilla.gnome.org/attachment.cgi?id=121851 it should be much better

My script creates three indices per folder, like this:

create index "index_deleted_$a" on "$a" (deleted)
create index "index_deleted_$a" on "$a" (junk)
create index "index_deleted_$a" on "$a" (read)

which if I understand correctly is the same as your patch, only it does it externally so it has to be run for every folder I create.

I have not tried your patch because I did not want to set everything up to recompile Evo, but if it's needed, I'll do it.

 
> > I also see that the unread message count, shown next to the folders in the
> > folder tree, does not get updated as I read the messages.
> 
> IMAP or Local folders? ANy steps to reproduce it ?

IMAP, I don't use local folders at all.

I have now set again the "check for mail in every folder" switch, because I need it. I also have set "check for new mail at startup" and "check for new mail every 2 minutes" (which was my default setup before upgrading)

It seems that this behaviour of not updating unread counters is easy to reproduce, because it happens every time. I start Evo, I NEVER touch the send and receive button. 

After about 5 minutes of furious disk activity, while Evo refreshes folders, I can see that the unread counters in every folder (including INBOX) has not been updated. (shows LESS unread messages than the acutal number)

Then I enter the folder (inbox in this example) and I see some new messages (that were received between the last time I have shut down Evo and now) The counter does not get updated.

Now I can read messages, mark them as unread again, do what I like, and the counter still does not get updated. I can exit and enter folders, and the counters still do not get updated.

To update the counter, I have to right click on the folder name in the folder tree. As soon as the context menu appears, the counter for that folder gets updated, even if I don't click on "refresh" or any other menu item.

I have found that this behaviour is consistent for every folder in every account. 

Maybe this is a different bug, not related to the one we are discussing about.

Comment 19 Ricardo Correia 2008-11-04 14:59:08 UTC

(In reply to comment #16)
> Ricardo, did you press Send/Receive or so? I see that could be a reason for
> loading all folder uids. Anyways Lemme debug more.

No, Evolution loads all the uids right when it starts. The only thing I do is type in the passwords for my accounts, nothing else.
Typically it takes about 5-10 minutes until I can start using Evolution..

Also, Send/Receive shouldn't load the whole list of uids, right?

If I configured my folders not to check received mail in there, then it shouldn't (unless I open the folder, of course).

Comment 20 Fabio Muzzi 2008-11-04 15:48:03 UTC

In comment #18 I have obviously reported the (pseudo) script wrong. The three indices I create have three different names (deleted, junk, read) and not the same name as reported.


Also, I have talked to people that use sqlite and they told me to pay attention to the use of transactions: sqlite is faster (even 100 times faster) if you do lots of operations (inserts, deletions, updates) in a single transaction, because if you don't, for every database modification the whole database file gets rewritten to the disk.  If you do a big transaction with 1000 inserts in it, the file gets rewritten only once instead of 1000 times!

Comment 21 Fabio Muzzi 2008-11-04 18:53:00 UTC

An additional information: Evo leaks memory. Given enough time (three hours) it managed to fill up completely 1 GB o RAM and 2 GB o swap.

root@vongola:/etc/udev# free
             total       used       free     shared    buffers     cached
Mem:       1033920    1021844      12076          0       1960      23372
-/+ buffers/cache:     996512      37408
Swap:      2281220    2281220          0

Comment 22 Srinivasa Ragavan 2008-11-05 15:46:27 UTC

Fabio, Evo uses transactions well. We don't write individually. There are some random  places, where things are called sequentially, which is what Im fixing up. Otherwise, 99% of the code must use transactions to write. 

On your memory leaks, can you tell me you account setup? How many folders? How many vfolders? Does it remain always or does it reduce over a period of time? Can you report memory consumed at start up and may be after 15-30 mins?

Comment 23 Fabio Muzzi 2008-11-05 18:52:59 UTC

(In reply to comment #22)

> Fabio, Evo uses transactions well. We don't write individually. There are some
> random  places, where things are called sequentially, which is what Im fixing
> up. Otherwise, 99% of the code must use transactions to write. 

So I'm left wondering why does it access the disk furiously...


> On your memory leaks, can you tell me you account setup? How many folders? How
> many vfolders? Does it remain always or does it reduce over a period of time?
> Can you report memory consumed at start up and may be after 15-30 mins?

I have no vfolders, I have disabled the "search folders".

I currently have disabled two big accounts to speed up things.

I'm left with 3 imap accounts, with a total of 30 folders. One of these accounts has "check for new mail in all folders" enabled, because it uses server-side filtering. Total number of emails is 80.000. The biggest folder has 25.000 emails in it.

Memory usage does not seem to reduce over time, I have left it open for 3 hours and had to exit it because it filled up my memory completely.

Now some tests:


10 minutes after startup:
Evolution: 255 MB
Evolution-data-server: 19 MB
 
20 minutes:
Evolution: 327 MB
Evolution-data-server: 19 MB

30 minutes:
Evolution: 441 MB
Evolution-data-server: 19 MB


These tests were run while I was NOT using evo, I mean I did not read eny email.

Comment 24 Srinivasa Ragavan 2008-11-06 03:37:59 UTC

Fabio,

On your first issue, it could be due to the ext3/fsync issue. Sqlite stores the transaction into a .db-journal file, which needs fsync and fsync has issues on ext3 iirc [don't ask me what issue, I have heard that that is the reason]

Can you get me a valgrind report?

'valgrind --tool=memcheck --leak-check=full evolution'

Just get me the log, I can check out on the memleaks and thanks a lot for the help. 30 folders is small and shouldn't grow this big. Btw, are you looking at the 'top' or system-monitor? They lie you. Try pmap. Also report just the resident memory.

Comment 25 Fabio Muzzi 2008-11-06 18:08:58 UTC

(In reply to comment #24)

> On your first issue, it could be due to the ext3/fsync issue. Sqlite stores the
> transaction into a .db-journal file, which needs fsync and fsync has issues on
> ext3 iirc [don't ask me what issue, I have heard that that is the reason]

So, what can I do? Ext3 is the most common file system, I doubt that everyone will switch to another file system just to use Evo. Maybe I should try, just for testing purposes, to mount a ramdisk for Evo to store its database?


> Can you get me a valgrind report?
> 
> 'valgrind --tool=memcheck --leak-check=full evolution'

I suppose that you need only the last part, that summarizes memory leaks:

==11467== LEAK SUMMARY:
==11467==    definitely lost: 150,591 bytes in 3,225 blocks.
==11467==    indirectly lost: 210,818 bytes in 8,965 blocks.
==11467==      possibly lost: 44,556,411 bytes in 58,477 blocks.
==11467==    still reachable: 69,736,799 bytes in 4,426,584 blocks.
==11467==         suppressed: 0 bytes in 0 blocks.
==11467== Reachable blocks (those to which a pointer was found) are not shown.
==11467== To see them, rerun with: --leak-check=full --show-reachable=yes

If you need the whole log, I can rerun it. This was a 10-minute run with just a little activity (I have browsed some folders) because it is terribly slow running under valgrind.


> Btw, are you looking at
> the 'top' or system-monitor? They lie you. Try pmap. Also report just the
> resident memory.

The values in the previous post are from pmap; I have reported the  "total" memory from pmap.

Comment 26 Fabio Muzzi 2008-11-06 19:04:12 UTC

I have tried disabling and enabling accounts (leaving database files on the hard disk) and I have noticed a strange behaviuor: enabling 2 accounts made Evo a lot slower, while leaving only one enabled made it a lot faster.

I'll try to explain:

I have two accounts, A and B. Both have about 15 folders inside and both have "check for mail in every folder" enabled.

A holds about 30% of the emails, B the other 70%. (80.000 total)

If I enable only A, and click "send/receive" it quickly scans folders (I see it in the status bar), taking less than 2 seconds.

If I enable only B, and click "send/receive" as before, it scans folders, taking about 5 seconds (B is bigger).

If I enable both, and click "send/receive", it scans folders, accessing the disk furiously (it flashed just a little when I had only one account enabled) and it takes 49 seconds!

I think that this is really important. Something is really different when using 2 or more accounts, they seem to slow down a lot and trash the disk, too.

Maybe since they are both checked for new mail at the same time, the calls to sqlite3 collide, forcing the sqlite3 library to fsync constantly, while with a single folder it does not and runs in memory?

Another test: I have moved the database file for account B to a ramdisk, and again run the "send/receive" procedure: time went to 12 seconds, but the disk was still heavily accessed. (folders.db for account a was still on hard disk)

Yet another test: enabling only account B with folders.db saved to the ramdisk reduced its "send/receive" time from 5 to 4 seconds, so I can say that when only one folder is enabled, disk access is quite optimized (I mean, it works mainly in RAM) because using a ramdisk instead of a phisical HDD only led to a little speed-up.

You should definitely investigate the two accounts issue: using 2 accounts really slows everything down.

Comment 27 Srinivasa Ragavan 2008-11-07 03:26:10 UTC

Committed these patches to stable/trunk

Comment 28 Srinivasa Ragavan 2008-11-07 03:37:38 UTC

The design is like this, every account has a separate sqlite database and a separate handle. Sqlite is multithreaded and Im wondering how this happens.

Fabio, can you run Evo is gdb and when you say it takes 5 mins, break into gdb 'control +C' and type 'thread apply all bt' and collect the traces. 5 samples in a gap of a minute. So that I can see, if some locks are held or so. Thanks a lot of your help.

Comment 29 Fabio Muzzi 2008-11-07 14:13:50 UTC

(In reply to comment #28)

> The design is like this, every account has a separate sqlite database and a
> separate handle. Sqlite is multithreaded and Im wondering how this happens.

I don't know where the problem is, but I think you should be able to easily reproduce this bug, just add another account and load some random messages in both accounts, I suspect that even with 1000 messages per account the problem should be evident.

 
> Fabio, can you run Evo is gdb and when you say it takes 5 mins, break into gdb
> 'control +C' and type 'thread apply all bt' and collect the traces. 5 samples
> in a gap of a minute. So that I can see, if some locks are held or so. Thanks a
> lot of your help.

It was 50 seconds, not 5 minutes (with indices). I'll try to run this test as soon as possible and I'll report to you.

Comment 30 Fabio Muzzi 2008-11-09 12:37:03 UTC

Created attachment 122259 [details]
GDB debugging of send-receive session with 2 imap accounts.

Here is the debugging that you asked for. I have run Evo, waited for it to run all of its folders updates, then hit the "send/receive" button and then started interrupting the process every 10 seconds. It took abount 80 seconds to complete.

Hope this helps somehow.

Comment 31 Srinivasa Ragavan 2008-11-10 06:34:23 UTC

Fabio, You don't have the debuginfo packages installed. its difficult to understand the traces without it. Atleast install debug info of evolution evolution-data-server glib sqlite  and reget the traces. Thanks

Comment 32 Srinivasa Ragavan 2008-11-10 06:38:06 UTC

Fabio, Also, if you are interested, I can give you a test patch. I have a redundant lock, which I added because it was bad during the early days of development. I can remove it as I see that its harming in your case. You can try without that. Lemme know how its after this.

Comment 33 Srinivasa Ragavan 2008-11-10 06:39:14 UTC

Created attachment 122308 [details] [review]
Test patch

I know that this patch wasn't required, but to be too sure I added it months back. It sort of makes one transaction at any point. It should be fine with out this also. But Fabia, lemme know how it goes with it

Comment 34 Srinivasa Ragavan 2008-11-12 09:09:59 UTC

Fabio, did you happen to try with it?

Comment 35 Fabio Muzzi 2008-11-12 09:23:41 UTC

No, I had no time to try. I'll try ASAP.

Comment 36 Fabio Muzzi 2008-11-13 11:12:38 UTC

Created attachment 122569 [details]
GDB debugging of send-receive session with 2 imap accounts, with debugging symbols loaded.

This is a trace of a send-receive session, with 2 imap accounts, that was stopped every 5 seconds then continued until the send-receive operation ended.

I have debugging symbols this time, so it should be definitely more useful than the last one.

Comment 37 Fabio Muzzi 2008-11-13 11:16:10 UTC

(In reply to comment #33)
> Created an attachment (id=122308) [edit]
> Test patch
> 
> I know that this patch wasn't required, but to be too sure I added it months
> back. It sort of makes one transaction at any point. It should be fine with out
> this also. But Fabia, lemme know how it goes with it
>

I have attached a dbg trace of Evo without this patch installed.

If I have to test this patch, I'll have to install full sources and then compile from sources, because I am running the binary packages from Ubuntu at the moment.

I'll try to install sources and recompile with the patches as soon as I have a little time to do it.

Comment 38 Srinivasa Ragavan 2008-11-18 13:05:53 UTC

> 
> Looking at the backtrace that I've posted, maybe the bug is in
> camel_store_can_refresh_folder() or imap_can_refresh_folder()?
> 

I think, I have found that we load all uids. We are fixing it.

Comment 39 Milan Crha 2008-11-18 16:39:07 UTC

Created attachment 122953 [details] [review]
proposed eds patch (folder content download on startup)

for evolution-data-server;

This is for the issue I caused with my patch to "Alway check for new mail in this folder", with this applied the IMAP will not download folder content on startup only to check whether should refresh folder or not. Follows patch for the eex.

Comment 40 Milan Crha 2008-11-18 17:55:13 UTC

Created attachment 122967 [details] [review]
proposed eex patch (feature remove from here)

for evolution-exchange;

No summary for exchange store, thus removing the feature from here, as we discussed on IRC with srag.

Comment 41 Srinivasa Ragavan 2008-11-19 05:09:16 UTC

Commited my patch to stable/trunk.

Comment 42 Srinivasa Ragavan 2008-11-19 06:48:39 UTC

Milan looks awesome. It should work well, but I want Akhil/Bharath to test it.

Comment 43 Akhil Laddha 2008-11-19 13:34:45 UTC

Tested both the patches and changes worked fine with IMAP as well as Exchange. Thanks Milan.

Comment 44 Srinivasa Ragavan 2008-11-20 04:09:26 UTC

Milan, its is just within EDS, I think no harm in it, pushing for stable. So take for stable/trunk

Comment 45 Milan Crha 2008-11-20 10:23:36 UTC

eds part committed to trunk. Committed revision 9775.
eds part committed to gnome-2-24. Committed revision 9776.
eex part committed to trunk. Committed revision 1840.
eex part committed to gnome-2-24. Committed revision 1841.

I'm not sure whether all mentioned here had been addressed, thus keeping bug opened.

Comment 46 Ricardo Correia 2008-11-25 19:19:22 UTC

Created attachment 123382 [details]
gdb-insert-fsync-loop-backtrace.txt

Hi,

I have some more data about the extremely slow INSERT statements.
According to my analysis, this seems to be a both a bug in Evolution and in sqlite, but I'm not completely sure.

The perceived behaviour is that Evolution noticeably started writing and fsyncing *a lot* (making the whole system slow), and I wasn't able to do anything useful in Evolution since it seemed to be blocked doing this for a very long time.

If you follow the gdb-insert-fsync-loop-backtrace.txt attachment, you will see that in one thread, Evolution called sqlite to execute a SELECT statement to retrieve all the messages in the zfs/discuss folder. This sqlite function then called an Evolution callback, which in turn blocked while waiting for some lock.

While this thread is blocked, another Evolution thread is INSERTing a value into the same zfs/discuss folder. However, sqlite is unable to obtain a lock on this database file, so it seems to be busy waiting/sleeping until it can obtain the lock.

As you can see, the INSERT statement seems to be trying to insert a single row.

So I wondered, if one thread is waiting for a mutex and the other is busy waiting/sleeping, where are the fsync()s coming from?

As you can see, the fsync() call is coming from the thread doing the INSERT.
It appears that when this thread stops sleeping, it tries again to insert the row (phase 1 commit), then afterwards it discovers that it can't obtain the lock and sleeps again.

This goes on and on, and neither thread seems to be making progress. It might be possible that the SELECT statement is retrieving a new value every now and then (assuming it can get the mutex somehow), but I have confirmed that the INSERT statement is always the same - it is not making progress at all, and is always causing fsync().

So I think these are bugs in both Evolution and in sqlite:

1) In Evolution, because:
1.1) It is reading a table and inserting a row at the same time in different threads?
1.2) Because Evolution's camel_read_mir_callback() function blocks on a mutex. This can (and seems to) cause a deadlock because sqlite is holding a lock on the database, which prevents other Evolution threads from doing database operations (threads which might be holding the same Evolution mutex that the first thread is blocked on).

2) In sqlite, because:
2.1) INSERT loops doing writes and fsync() calls, without being able to make any progress.
2.2) Because it busy loops/sleeps instead of using proper synchronization primitives.

Thanks,
Ricardo

Comment 47 Srinivasa Ragavan 2008-11-26 19:25:53 UTC

I see some issue in Evolution also. Lemme see, if I can avoid that read lock. It actually reads summary, while other is writing summary. Lemme analyse more.

Comment 48 Fabio Muzzi 2008-11-26 22:57:05 UTC

With all the patches installed, I confirm that Evo is a lot faster than without them, but still really slow when I have more than one account enabled, as I have stated before.

Comment 49 Srinivasa Ragavan 2008-12-04 13:16:56 UTC

Created attachment 123946 [details] [review]
Patch to avoid dbsort for IMAP

This ignores the collate sort and does a qsort of the uids later on loading. This should be pretty fast.

Comment 50 Srinivasa Ragavan 2008-12-04 13:33:44 UTC

Created attachment 123948 [details] [review]
Patch to avoid dbsort for IMAP

This ignores the collate sort and does a qsort of the uids later on loading. This should be pretty fast.

Comment 51 Srinivasa Ragavan 2008-12-04 13:34:10 UTC

Oops, ignore the last thing.

Comment 52 Srinivasa Ragavan 2008-12-05 06:24:58 UTC

Committed to stable/trunk

Comment 53 Srinivasa Ragavan 2008-12-08 09:43:42 UTC

Ricardo,

I need some help. I wanted to load uids+flags during start up instead of just uids. I made it index flags as well.

sragavan@sragavan ~ $ bench sqlite3 .evolution/mail/groupwise/sragavan\@prv1-3.novell.com/folders.db "select uid,flags from INBOX" > /dev/null
real 38.48
user 0.05
sys 0.14


It takes 39 secs to load ~17K record. I changed the query to just select uid even that takes similar time. I drop vm cache before every query. Is there a way I can improve the speed?

Comment 54 Ricardo Correia 2008-12-08 17:08:18 UTC

(In reply to comment #53)
> sragavan@sragavan ~ $ bench sqlite3
> .evolution/mail/groupwise/sragavan\@prv1-3.novell.com/folders.db "select
> uid,flags from INBOX" > /dev/null
> real 38.48
> user 0.05
> sys 0.14
> 
> 
> It takes 39 secs to load ~17K record. I changed the query to just select uid
> even that takes similar time. I drop vm cache before every query. Is there a
> way I can improve the speed?

I'm not an expert in sqlite or SQL databases, but I would argue that creating an index on (uid, flags) should make such a query much faster, because the information would be condensed in much fewer disk blocks. However, I've tried doing this and it doesn't seem to have any effect.

An EXPLAIN statement indicates that the index is not used for this query:

sqlite> CREATE INDEX index_test ON 'mailing lists/linux-kernel' (uid,flags);
sqlite> .explain on
sqlite> EXPLAIN SELECT uid,flags FROM 'mailing lists/linux-kernel';
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Trace          0     0     0     explain select uid,flags from 'mailing lists/linux-kernel';  00
1     Goto           0     11    0                    00
2     SetNumColumns  0     2     0                    00
3     OpenRead       0     49045  0                    00
4     Rewind         0     9     0                    00
5     Column         0     0     1                    00
6     Column         0     1     2                    00
7     ResultRow      1     2     0                    00
8     Next           0     5     0                    00
9     Close          0     0     0                    00
10    Halt           0     0     0                    00
11    Transaction    0     0     0                    00
12    VerifyCookie   0     436   0                    00
13    TableLock      0     49045  0     mailing lists/linux-kernel  00
14    Goto           0     2     0                    00

This seems to go a bit against the sqlite optimizer documentation (point 7.0 in http://www.sqlite.org/optoverview.html ), which states that if all the columns are inside the index, then it should be able to retrieve that information without doing a table scan (although it doesn't state that this is done when we request all the rows).

Point 6.0 in that page also suggests that sqlite doesn't update the statistics over time, so we must manually run ANALYZE every now and then, so that the optimizer selects the right indexes.

However, I've also tried doing that without any effect.

So I'm a bit out of ideas and I don't really have the time to dig into the sqlite sources, so it might be good to contact the sqlite developers or the mailing list to see if they can help..

On another note, it would also be really nice if it were possible to configure Evolution to use other SQL database engines, such as PostgreSQL, which I believe is known to be more robust and scalable (personal anecdote: my folders.db file somehow got badly corrupted, which I confirmed by running the command 'PRAGMA integrity_check' -- this is a bit worrying considering that I've never even had a power loss, given that this is a laptop)...

Comment 55 Srinivasa Ragavan 2009-01-29 16:37:55 UTC

The main issue is fixed already.

Comment 56 Kevin R. Page 2009-01-29 17:46:47 UTC

(In reply to comment #55)
> The main issue is fixed already.

Srinivasa, fixed in which release?

Running evolution-data-server-2.24.3-1.fc10, if I let my desktop settle until there's no disk activity, then start evolution and do nothing else, it typically takes 5 minutes until Evo has finished thrashing the disk.

I certainly have some big IMAP folders and many many subfolders over several accounts, but although Evo has always taken a while to start up it never took anywhere near this long before the new on-disk indexes were implemented. And of course the disk access slows the whole desktop down for the duration of loading.

Comment 57 Fabio Muzzi 2009-01-29 18:33:54 UTC

I agree with Kevin. While the speed improved with the patches, it si still a lot slower than the previous version (without sqlite3). This whole sqlite thing is just a huge regression, if it can't be faster than this.

Comment 58 Milan Crha 2009-01-29 18:44:29 UTC

(In reply to comment #56)
> Srinivasa, fixed in which release?

I think srag meant mainly commit from bug #568332, which hit trunk and gnome-2-24 just few minutes ago.

Comment 59 Srinivasa Ragavan 2009-01-30 08:59:10 UTC

Thanks Milan :-)

Fabio,

Sqlite3 introduction brought three slowness:

1) Loading folders

This usually is fast, but vacuuming the folders.db at intervals makes a lot of difference.
http://www.gnome.org/~sragavan/evolution-rebuild-summarydb is a script that finds all folders.db and vacuum's. I would be putting this to the build soon.

2) Storing folders
   - Would be fixed by the sqlite-vfs that Milan made in #568332. It has ~75% improvement. From 10m to 2m improvements with the sqlite-vfs thing.

3) Switching folders
   - I commited patches just after 2.24.3 [Expect them in 2.24.4 to be released in few mins]. Which takes care of sync ing counts faster.
   select count(*) from Foo take 1m where as 'select count(*) from Foo where read=0 or read=1' takes 1s. This was the major contributor for slowness in folder-switch, fixed in branch already.


I hope I'm clear.

Comment 60 Kevin R. Page 2009-01-31 14:35:35 UTC

Thanks Srinivasa, running evolution-rebuild-summarydb helps enormously at startup - though I would note that this isn't a problem that has built up over months but has been present since I either a) migrated from the previous Evo version, or b) changed some account settings (these both happened at the same time, about a month ago).

Either way, I'd guess there are a good number of people for whom running this script will make a major difference. It's probably something you want to get out there pretty soon.

Comment 61 Srinivasa Ragavan 2009-02-02 06:09:47 UTC

(In reply to comment #60)
> Thanks Srinivasa, running evolution-rebuild-summarydb helps enormously at
> startup - though I would note that this isn't a problem that has built up over
> months but has been present since I either a) migrated from the previous Evo
> version, or b) changed some account settings (these both happened at the same
> time, about a month ago).

Oh, this is very specific to sqlite. When you drop/add...drop/add tables the db grows big and is better to vacuum it once in a while. It has nothing to do with older version of evo or so. Since evo does table/record creation/deletion often, once a month or two, it might make sense to run this. But I would integrate it in Evo and automate it soon.

Comment 62 Kevin R. Page 2009-02-02 08:25:26 UTC

(In reply to comment #61)
> Oh, this is very specific to sqlite. When you drop/add...drop/add tables the db
> grows big and is better to vacuum it once in a while. It has nothing to do with
> older version of evo or so.

Yes, I understand.

I'm suggesting that either the upgrade from F9 to F10 (when I got the updating mailbox format progress bars), or tweaking my settings (the server login ID, which seems to move/regenerate everything under .evolution/mail/imap/), must have caused a large number of the drop/add you speak of. So while it's technically a sqlite problem, Evo certainly triggered it.

In other words, it didn't take a month, week, or even days after upgrading for this problem to build up - Evo startup was terribly slow straight after the update. I'd suggest I won't be the only one - and I wouldn't want you to under-estimate how useful your script will probably be to a significant number of people  (who've probably spent the last month or two cursing Evolution ;)  )

F9 to F10 was from 2.22.3-2.fc9 (not sqlite?) to 2.24.3-1.fc10. If the generation of sqlite summaries on upgrade doesn't make such big db, it suggests it must have been changing the account settings that triggered it? If so, I recommend you do the vacuuming after settings are changed as a matter of course.

Comment 63 David Moore 2009-03-14 23:37:39 UTC

(In reply to comment #60)
> Either way, I'd guess there are a good number of people for whom running this
> script will make a major difference. It's probably something you want to get
> out there pretty soon.
> 

Evolution has been terribly slow for me also since the day I upgraded from F9 to F10.  Even deleting my .evolution/ directory and starting from scratch, it was still dog slow.  I then switched to Thunderbird for a while.

Recently switched back to Evolution, and found this bug report.
evolution-rebuild-summarydb completely solved the speed problems for me.  It used to take about 60 seconds to startup Evolution with my mailbox of 20,000 messages, now startup is nearly instantaneous.  Definitely try to get this integrated!  I think a lot of users have abandoned Evolution in the meanwhile.