After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 725720 - Livelock with 3.11.91
Livelock with 3.11.91
Status: RESOLVED INCOMPLETE
Product: evolution-data-server
Classification: Platform
Component: Mailer
3.11.x (obsolete)
Other Linux
: Normal normal
: ---
Assigned To: evolution-mail-maintainers
Evolution QA team
Depends on:
Blocks:
 
 
Reported: 2014-03-05 12:28 UTC by Yanko Kaneti
Modified: 2015-08-18 10:50 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Yanko Kaneti 2014-03-05 12:28:34 UTC
This started happening after the upgrade from 3.11.90 to 3.11.91
evolution-3.11.91-1.fc21.x86_64
evolution-data-server-3.11.91-1.fc21.x86_64

Here is the pstack of the evolution process and below it the CAMEL_DEBUG=all log when it happens.







[imapx:B] Have token 'FETCH' id 25421
[imapx:B] got untagged response
[imapx:B] Have token 'FETCH' id 25422
[imapx:B] got untagged response
[imapx:B] Have token 'FETCH' id 25423
[imapx:B] got untagged response
[imapx:B] Have token 'FETCH' id 25424

(evolution:10605): GLib-CRITICAL **: Source ID 215180 was not found when attempting to remove it
[imapx:B] adding command, format = 'UID FETCH %t (BODY.PEEK[]'
[imapx:B] adding command, format = '<%u.%u>'
[imapx:B] got gint '0'

(evolution:10605): GLib-GObject-CRITICAL **: g_closure_unref: assertion 'closure->ref_count > 0' failed
[imapx:B] got gint '20480'
[imapx:B] adding command, format = ')'
[imapx:B] completing command buffer is [39] 'UID FETCH 639128 (BODY.PEEK[]<0.20480>)'
[imapx:B] enqueue job 'UID FETCH 639128 (BODY.PEEK[]<0.20480>)'
[imapx:B] ** Starting next command
[imapx:B] - we're selected on 'INBOX.gnome.CVS', current jobs?
[imapx:B] -    0 'FETCH'
[imapx:B] -- Checking job queue
[imapx:B] -- 100 'FETCH'?
[imapx:B] Selecting mailbox 'INBOX' for command 'FETCH'(0x7fea01f6fde0)
[imapx:B] adding command, format = 'UID FETCH %t (BODY.PEEK[]'
[imapx:B] adding command, format = '<%u.%u>'
[imapx:B] got gint '20480'
[imapx:B] got gint '20480'
[imapx:B] adding command, format = ')'
[imapx:B] completing command buffer is [43] 'UID FETCH 639128 (BODY.PEEK[]<20480.20480>)'
[imapx:B] enqueue job 'UID FETCH 639128 (BODY.PEEK[]<20480.20480>)'
[imapx:B] ** Starting next command
[imapx:B] - we're selected on 'INBOX.gnome.CVS', current jobs?
[imapx:B] -    0 'FETCH'
[imapx:B] -- Checking job queue
[imapx:B] -- 100 'FETCH'?
[imapx:B] -- 100 'FETCH'?
[imapx:B] Selecting mailbox 'INBOX' for command 'FETCH'(0x47ceb30)
[imapx:B] adding command, format = 'UID FETCH %t (BODY.PEEK[]'
[imapx:B] adding command, format = '<%u.%u>'
[imapx:B] got gint '40960'
[imapx:B] got gint '20480'
[imapx:B] adding command, format = ')'
[imapx:B] completing command buffer is [43] 'UID FETCH 639128 (BODY.PEEK[]<40960.20480>)'
[imapx:B] enqueue job 'UID FETCH 639128 (BODY.PEEK[]<40960.20480>)'
[imapx:B] ** Starting next command
[imapx:B] - we're selected on 'INBOX.gnome.CVS', current jobs?
[imapx:B] -    0 'FETCH'
[imapx:B] -- Checking job queue
[imapx:B] -- 100 'FETCH'?
[imapx:B] -- 100 'FETCH'?
[imapx:B] -- 100 'FETCH'?
[imapx:B] Selecting mailbox 'INBOX' for command 'FETCH'(0x7fe9fc004a00)
Comment 1 Yanko Kaneti 2014-03-05 12:34:35 UTC
Forgot to add that this is not really reproducible, happens after some time of normal use.
The GLib-*-CRITICAL stuff has been something ongoing for the last releases and unrealated to the lock I think.
Comment 2 Matthew Barnes 2014-03-05 13:57:22 UTC
Hard to tell a livelock from a static backtrace -- but in this case all the threads seem to be in a blocking poll() or waiting on a condition variable, except Thread 2 is using a timed wait.

Thread 2 (Thread 0x7fea2e1f3700 (LWP 22678))

  • #0 pthread_cond_timedwait
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S line 238
  • #1 g_cond_wait_until
    at gthread-posix.c line 892
  • #2 g_async_queue_pop_intern_unlocked
    at gasyncqueue.c line 422
  • #3 g_async_queue_timeout_pop
    at gasyncqueue.c line 543
  • #4 g_thread_pool_wait_for_new_pool
    at gthreadpool.c line 167
  • #5 g_thread_pool_thread_proxy
    at gthreadpool.c line 364
  • #6 g_thread_proxy
    at gthread.c line 764
  • #7 start_thread
    at pthread_create.c line 309
  • #8 clone
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 111

If that's the source of the livelock, frame #4 suggests perhaps this is a case of thread pool starvation -- like it's waiting for an available worker thread to handle some async task, but I can't tell what the task is.

Not clear that this is IMAPX related.  Did you upgrade other packages at the same time, like GLib or WebKit/GTK?
Comment 3 Yanko Kaneti 2014-03-05 14:24:09 UTC
Its rawhide. Most of the gnome stack has been updated in the last couple days.

When it happens the UI is responsive, eg you can scroll the message list or the preview pane, its just the background mail operations that are not working, the status stuck with one or two ongoing operations one of them "refreshing folder XX" and the camel debug log also stops.
Comment 4 Matthew Barnes 2014-03-05 14:43:54 UTC
Okay, I'll keep a lookout for it.  I'm not on rawhide, in fact I intentionally build against older base libraries to help weed out these kinds of things.
Comment 5 Yanko Kaneti 2014-03-06 16:22:08 UTC
You might be right about this not really locking. I've waited one such "lockup" for 5-10 minutes and eventually it finished with the refreshing of folder XX and everything was back to normal. The camel log was still stalled during all that waiting.

I am not sure how to go about this further...
Comment 6 Milan Crha 2014-05-09 09:00:38 UTC
Could you retest with 3.12.2, when it'll be released the next week, please? I'll update Fedora rawhide shortly after the release. I made some fixes in IMAPx which should help here too. Thanks in advance.
Comment 7 Milan Crha 2014-07-10 13:09:10 UTC
Yanko pointed me here from bug #730788, when I mentioned a regression with changes from that bug, reported within bug #732366. Looking in the backtrace I'd guess that the eds with changes from the bug #732366 should address this. The 3.12.4 will be released the next week (what a coincidence with the comment #6).
Comment 8 Yanko Kaneti 2015-08-18 10:50:59 UTC
Following the 3.17.x development and after the imapx rework in 3.17.90 I don't see any more stalls in the queue processing. 
Will close this one as incomplete for the old branch, but its definitely fixed in master.
Thanks