Bug 738545 – Busy loop under task_thread_cancelled()




Summary:	Busy loop under task_thread_cancelled()


Status:	RESOLVED OBSOLETE

Product:	glib
Classification:	Platform
Component:	gio
Version:	2.42.x
Hardware:	Other Linux

Importance:	Normal major
Target Milestone:	---
Assigned To:	gtkdev
QA Contact:	gtkdev

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2014-10-14 18:51 UTC by Christian Stadelmann
Modified:	2018-02-06 17:00 UTC

See Also:
GNOME target:	---
GNOME version:	3.11/3.12

Attachments
the backtrace from gdb with command `bt full` (12.08 KB, text/plain) 2014-10-14 18:51 UTC, Christian Stadelmann

Description Christian Stadelmann 2014-10-14 18:51:46 UTC

Created attachment 288543
the backtrace from gdb with command `bt full`

I am using GNOME 3.14 with Evolution 3.12.6.
After logging in, evolution-calendar-factory runs at 100% CPU usage (one core/thread) and slowly acquires more memory. This happens most times, if not every time, I log in.

Evolution and california are running fine without noticeable problems.
I can stop (pause) the process which makes calendars unavailable in california and evolution.
Whenever I terminate the process it will be started again.

Running evolution-calendar-factory with debug output enabled, as specified in https://wiki.gnome.org/Apps/Evolution/Debugging, didn't help much. Only a single line was logged:

Bus name 'org.gnome.evolution.dataserver.Calendar4' acquired.

I attached a backtrace generated by gdb. If you need more info just tell me what to do.

Comment 1 rogier.delporte 2014-10-16 16:02:54 UTC

Same problem here, also very similar reports on https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=755981 . Very annoying bug, renders evolution completely useless.

Comment 2 Christian Stadelmann 2014-11-04 09:44:40 UTC

According to downstream bug reports
https://bugzilla.redhat.com/show_bug.cgi?id=1148247
and
https://bugzilla.redhat.com/show_bug.cgi?id=1151665
this is NM-related.

It seems like this has been getting worse. Now this bug renders evolution's calendar view completely useless, reminders don't work at all, the gnome-shell calendar view does not display any events, and california is barely usable because it takes so long.

Comment 3 Milan Crha 2014-11-05 07:43:05 UTC

(In reply to comment #2)
> It seems like this has been getting worse. Now this bug renders evolution's
> calendar view completely useless, reminders don't work at all, the gnome-shell
> calendar view does not display any events, and california is barely usable
> because it takes so long.

Makes sense, when the "server" (evolution-calendar-factory) is busy, it influences each client connecting to it.

Comment 4 Milan Crha 2014-11-05 07:47:18 UTC

Philip, is it possible your changes around task_thread_cancelled() from bug #736806 cause the issue here?

+ Trace 234304

#0 g_cancellable_is_cancelled
at gcancellable.c line 288
#1 g_task_compare_priority
at gtask.c line 1759
#2 g_async_queue_invert_compare
at gasyncqueue.c line 357
#3 g_list_sort_merge
at glist.c line 1122
#4 g_list_sort_real
at glist.c line 1168
#5 g_list_sort_real
at glist.c line 1168
#6 g_list_sort_real
at glist.c line 1168
#7 g_list_sort_real
at glist.c line 1168
#8 g_list_sort_real
at glist.c line 1168
#9 g_list_sort_real
at glist.c line 1168
#10 g_list_sort_real
at glist.c line 1168
#11 g_list_sort_real
at glist.c line 1168
#12 g_list_sort_real
at glist.c line 1168
#13 g_list_sort_with_data
at glist.c line 1238
#14 g_queue_sort
at gqueue.c line 324
#15 g_async_queue_sort_unlocked
at gasyncqueue.c line 783
#16 g_thread_pool_set_sort_function
at gthreadpool.c line 958
#17 g_task_thread_pool_resort
at gtask.c line 1784
#18 task_thread_cancelled
at gtask.c line 1229
#19 _g_closure_invoke_va
at gclosure.c line 831
#20 g_signal_emit_valist
at gsignal.c line 3218
#21 g_signal_emit
at gsignal.c line 3365
#22 g_cancellable_cancel
at gcancellable.c line 499
#23 backend_update_online_state_idle_cb
at e-backend.c line 149
#24 g_main_dispatch
at gmain.c line 3111
#25 g_main_context_dispatch
at gmain.c line 3710
#26 g_main_context_iterate
at gmain.c line 3781
#27 g_main_loop_run
at gmain.c line 3975
#28 dbus_server_run_server
at e-dbus-server.c line 230
#29 ffi_call_unix64
at ../src/x86/unix64.S line 76
#30 ffi_call
at ../src/x86/ffi64.c line 525
#31 g_cclosure_marshal_generic_va
at gclosure.c line 1541
#32 _g_closure_invoke_va
at gclosure.c line 831
#33 g_signal_emit_valist
at gsignal.c line 3218
#34 g_signal_emit
at gsignal.c line 3365
#35 e_dbus_server_run
at e-dbus-server.c line 419
#36 main
at evolution-calendar-factory.c line 135

Comment 5 Dan Winship 2014-11-05 13:00:00 UTC

I am pretty sure this is still just the NM bug. Only a limited number of task threads can be running at once. So if GNetworkMonitor signals are resulting in evo queuing a run-in-thread task, and the signals are being emitted at a ludicrous speed, then the queue of pending tasks keeps getting longer and longer. At some point it becomes pretty likely that whenever you interrupt the process, it is in the middle of re-sorting the queue, not because there is any bug in the sorting, but because the queue is huge.

If it is the NM bug, then killing NM should make it go away. (Eventually... I don't know how long it would take evo to work through all its pending tasks.)
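
To make the queue-size argument concrete, here is a rough sketch in plain C. It is not GLib code: it uses qsort() rather than the GList merge sort seen in the trace, and PendingTask, compare_priority() and resort_cost() are hypothetical names chosen for illustration. It only counts how many comparator calls one full re-sort of a queue of n tasks needs, which is the work repeated on every cancellation in the trace above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued task; only the priority matters here. */
typedef struct {
    int priority;
} PendingTask;

/* Number of comparator calls made during the most recent sort. */
static unsigned long comparisons;

/* Orders tasks by priority and counts every call. */
static int compare_priority(const void *a, const void *b)
{
    const PendingTask *ta = a;
    const PendingTask *tb = b;

    comparisons++;
    return (ta->priority > tb->priority) - (ta->priority < tb->priority);
}

/* Build a queue of n pending tasks (n > 0), sort it once, and return
 * how many comparator calls that single re-sort needed. */
unsigned long resort_cost(size_t n)
{
    PendingTask *queue = malloc(n * sizeof *queue);
    size_t i;

    assert(queue != NULL);

    /* Deterministic, scattered priorities so the sort has real work to do. */
    for (i = 0; i < n; i++)
        queue[i].priority = (int)((i * 7919u) % 1000u);

    comparisons = 0;
    qsort(queue, n, sizeof *queue, compare_priority);
    free(queue);

    return comparisons;
}
```

Calling resort_cost() with increasing queue sizes shows the per-cancel cost growing with the backlog. If every cancellation triggers such a re-sort while new tasks keep arriving, a process interrupted at a random moment will most likely be found inside the sort.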

Comment 6 Milan Crha 2014-11-05 14:32:45 UTC

Just a side note: evolution is not running the tasks on its own; the tasks are run as part of a g_network_monitor_can_reach_async() call. The thing is that these calls pile up each time a "network-changed" signal is received, once for each opened backend (calendar/book/...), so if a network change invokes multiple "network-changed" signals, the pile gets larger. The upcoming 3.12.8 version will have a workaround for this: it postpones these calls by several seconds and, where possible, groups multiple "network-changed" signals into one g_network_monitor_can_reach_async() call.
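
The postpone-and-group idea can be sketched roughly as follows. This is not the actual evolution-data-server change, only an illustration in plain C with an explicit clock instead of a GLib timeout source; Coalescer and the coalescer_*() functions are hypothetical names. Each "network-changed" notification only re-arms a deadline, and a single reachability check is started once the signals have been quiet for the whole delay.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-backend state for coalescing "network-changed" signals. */
typedef struct {
    bool pending;        /* a reachability check is owed */
    long deadline;       /* earliest time (in seconds) it may start */
    long delay;          /* quiet period required before starting it */
    int  checks_started; /* how many checks were actually started */
} Coalescer;

void coalescer_init(Coalescer *c, long delay)
{
    assert(delay >= 0);
    c->pending = false;
    c->deadline = 0;
    c->delay = delay;
    c->checks_started = 0;
}

/* Called for every "network-changed" signal: no work is queued here,
 * the deadline is simply pushed back. */
void coalescer_on_network_changed(Coalescer *c, long now)
{
    c->pending = true;
    c->deadline = now + c->delay;
}

/* Called periodically, for example from a timer. Returns true when the
 * caller should start one reachability check now. */
bool coalescer_tick(Coalescer *c, long now)
{
    if (!c->pending || now < c->deadline)
        return false;

    c->pending = false;
    c->checks_started++;
    return true;
}
```

With a delay of a few seconds, a burst of hundreds of signals leads to a single check instead of hundreds of queued tasks. The trade-off is that the online state is updated a little later than the first signal.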

Comment 7 James 2014-11-26 15:15:06 UTC

I can confirm this issue as well.

It doesn't happen all of the time, but it happens often enough to make my machine unusable far too frequently.

Comment 8 Christian Stadelmann 2018-02-06 17:00:55 UTC

I haven't seen this bug in quite a long time, so I guess it is gone and has been fixed somewhere somehow.

In case you still see this bug, please comment or reopen this bug report.