Bug 337332 – Fix podcast manager locking insanity




Summary:	Fix podcast manager locking insanity


Status:	RESOLVED FIXED

Product:	rhythmbox
Classification:	Other
Component:	general
Version:	0.9.3
Hardware:	Other FreeBSD

Importance:	Normal critical
Target Milestone:	---
Assigned To:	RhythmBox Maintainers
QA Contact:	RhythmBox Maintainers

URL:
Whiteboard:

Duplicates:	362883 363124 377581 433542
Depends on:
Blocks:

Reported:	2006-04-05 10:10 UTC by Andrew Turner
Modified:	2007-04-26 16:45 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Output from strace. (11.12 KB, text/plain) 2006-04-05 11:35 UTC, Andrew Turner
Backtrace from crash when downloading a podcast (5.91 KB, text/plain) 2006-04-05 11:45 UTC, Andrew Turner
Avoid mutex free abort (680 bytes, patch) 2006-10-26 22:04 UTC, Joe Marcus Clarke	none
"Fix" all mutex issues (1.33 KB, patch) 2006-10-26 23:25 UTC, Joe Marcus Clarke	none
unthreadify patch (9.15 KB, patch) 2006-11-08 12:43 UTC, James "Doc" Livingston	none
updated patch (32.88 KB, patch) 2006-11-24 08:14 UTC, Jonathan Matthew	none
further updated patch (34.78 KB, patch) 2006-11-24 10:45 UTC, Jonathan Matthew	committed

Description Andrew Turner 2006-04-05 10:10:50 UTC

Steps to reproduce:
1. Start Rhythmbox with "GNOME_DISABLE_CRASH_DIALOG=1 rhythmbox"
2. Close Rhythmbox

Stack trace:
Core was generated by `rhythmbox'.
Program terminated with signal 6, Aborted.

Trace 67473

Thread 4 (Thread 0x8130000 (LWP 100169))

#0 _umtx_op from /lib/libc.so.6
#1 pthread_cleanup_pop from /usr/lib/libthr.so.2
#2 pthread_cond_destroy from /usr/lib/libthr.so.2
#3 g_async_queue_pop_intern_unlocked at gasyncqueue.c line 231
#4 IA__g_async_queue_pop at gasyncqueue.c line 271
#5 rhythmdb_shutdown at rhythmdb.c line 649
#6 rb_shell_finalize at rb-shell.c line 899
#7 IA__g_object_unref at gobject.c line 1702
#8 closure_invoke_notifiers at gclosure.c line 187
#9 IA__g_closure_invoke at gclosure.c line 498
#10 signal_emit_unlocked_R at gsignal.c line 2485
#11 IA__g_signal_emit_valist at gsignal.c line 2254
#12 IA__g_signal_emit at gsignal.c line 2288
#13 gtk_widget_event_internal at gtkwidget.c line 3732
#14 IA__gtk_widget_event at gtkwidget.c line 3538
#15 IA__gtk_main_do_event at gtkmain.c line 1356
#16 gdk_event_dispatch at gdkevents-x11.c line 2291
#17 g_main_dispatch at gmain.c line 1934
#18 IA__g_main_context_dispatch at gmain.c line 2484
#19 g_main_context_iterate at gmain.c line 2565
#20 IA__g_main_loop_run at gmain.c line 2769
#21 bonobo_main from /usr/local/lib/libbonobo-2.so.0
#22 main at main.c line 398



Other information:
It appears pthread_mutex_destroy() is returning EBUSY. This would indicate the
mutex is locked by another thread.

The output on the console is:
GThread-ERROR **: file gthread-posix.c: line 161 (): error 'Device busy' during
'pthread_mutex_destroy ((pthread_mutex_t *) mutex)'
aborting...
Abort trap: 6 (core dumped)

Comment 1 Alex Lancaster 2006-04-05 10:21:21 UTC

This is probably the same problem as described in bug #332302. Can you run strace -f -p <pid-of-rhythmbox>?

Comment 2 Alex Lancaster 2006-04-05 10:26:37 UTC

This is assuming that the "crash" is actually a hang.  If not, you should be able to strace rhythmbox.

Comment 3 Andrew Turner 2006-04-05 11:35:36 UTC

Created attachment 62786
Output from strace.

strace was started just before I closed Rhythmbox.

Comment 4 Andrew Turner 2006-04-05 11:40:09 UTC

It appears to be a threading problem with podcasts. I can make Rhythmbox crash with the same backtrace as thread 1 by downloading a podcast.

Comment 5 Andrew Turner 2006-04-05 11:45:09 UTC

Created attachment 62789
Backtrace from crash when downloading a podcast

This backtrace is when I attempt to download a podcast.

The terminal window has the same error message as described in the original bug. Thread 1 has the same backtrace going through abort().

Comment 6 Jonathan Matthew 2006-04-06 11:47:58 UTC

The locking in the podcast manager code is moderately insane, and it needs to be reworked.  I'm (slowly) working on it.  I don't think there's much point investigating this particular issue further.

Comment 7 James "Doc" Livingston 2006-06-07 05:03:20 UTC

Retitling the bug to be clearer.

Comment 8 Joe Marcus Clarke 2006-10-26 22:03:48 UTC

I noticed a problem on FreeBSD that I think is related.  The mutex_working mutex is locked before the call to gnome_vfs_async_xfer().  However, this mutex is unlocked in the GnomeVFSXferProgressCallback function for this transfer.  That function is called in another thread from the one that locked the mutex, and thus it cannot be unlocked.  Then, when g_mutex_free() is called on mutex_working, rb aborts.

I will attach a patch that corrects this, but I think it's a big kludge.  Basically, mutex_working is still locked in start_job(), but then it's immediately unlocked after the call to gnome_vfs_async_xfer().  However, I lock the mutex again in download_progress_cb if the phase is GNOME_VFS_XFER_PHASE_INITIAL.  Not ideal by any means, but it does the job of making sure the callback thread owns the mutex lock.
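
The ownership problem described in this comment can be reproduced outside Rhythmbox. The following is a minimal sketch in plain POSIX threads, not code from any of the attached patches: demo_mutex and both function names are hypothetical, and the mutex is created with PTHREAD_MUTEX_ERRORCHECK only so that the failed unlock is reported as an error code rather than being undefined. One thread locks the mutex, a second thread (standing in for the transfer progress callback) tries to unlock it, and the call should fail with EPERM.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the podcast manager's mutex_working. */
static pthread_mutex_t demo_mutex;

/* Runs in a second thread, like the transfer progress callback:
 * it tries to unlock a mutex it never locked and stores the result. */
static void *unlock_from_other_thread(void *result)
{
    *(int *)result = pthread_mutex_unlock(&demo_mutex);
    return NULL;
}

/* Returns the error code seen by the non-owning thread,
 * or -1 if the demonstration could not be set up. */
int cross_thread_unlock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_t worker;
    int result = -1;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    /* Error-checking type: misuse is reported instead of being undefined. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (pthread_mutex_init(&demo_mutex, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&demo_mutex);            /* this thread is the owner */
    if (pthread_create(&worker, NULL, unlock_from_other_thread, &result) == 0)
        pthread_join(worker, NULL);

    /* The mutex is still held here, so the owner must release it
     * before it can be destroyed safely. */
    pthread_mutex_unlock(&demo_mutex);
    pthread_mutex_destroy(&demo_mutex);
    return result;
}
```

Because the non-owning thread cannot release the lock, the mutex stays locked until its owner releases it. In the scenario described above nothing ever does, which is consistent with the 'Device busy' error reported when the mutex is later freed.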

Comment 9 Joe Marcus Clarke 2006-10-26 22:04:52 UTC

Created attachment 75475
Avoid mutex free abort

Comment 10 Joe Marcus Clarke 2006-10-26 23:25:09 UTC

Created attachment 75481
"Fix" all mutex issues

The previous patch fixed the crash, but still resulted in the hang from a locked mutex_job.  This patch corrects that as well, but it's quite hackish, and prone to races.  It should work for people in the meantime, until locking can be fixed for real.

Comment 11 James "Doc" Livingston 2006-11-08 12:43:52 UTC

Created attachment 76202
unthreadify patch

The only bits that run in other threads are download_progress_cb() and the podcast parser, the latter of which doesn't touch any data used by other threads.

This patch does any thread-unsafe work from the former in idle callbacks, and so lets us remove all the locks. I'm fairly sure this fixes it in a less hacky way.
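
The general idea of deferring thread-unsafe work to the main thread can be sketched without GLib. The code below is an illustrative analogue in plain POSIX threads, not the attached patch: in Rhythmbox the hand-off would presumably go through GLib idle callbacks, whereas here a small hypothetical queue plays that role, and every name (post_progress, dispatch_pending, displayed_percent, run_demo) is invented for the example. The worker thread only posts events; the main thread alone reads and updates the application state, so that state needs no lock of its own. The queue still has an internal lock, which is the one place the two threads meet.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAPACITY 16

/* Hypothetical progress event posted by the transfer thread. */
struct progress_event { int percent; };

/* The hand-off queue is the only data shared between threads;
 * its lock is private to the queue. */
static struct progress_event queue[QUEUE_CAPACITY];
static size_t queue_len;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

/* State touched by the main thread only, so it needs no lock. */
static int displayed_percent;

/* Called from the worker thread: record the event and return without
 * touching main-thread state. Returns 0 on success, -1 if the queue is full. */
static int post_progress(int percent)
{
    int rc = -1;

    pthread_mutex_lock(&queue_lock);
    if (queue_len < QUEUE_CAPACITY) {
        queue[queue_len++].percent = percent;
        rc = 0;
    }
    pthread_mutex_unlock(&queue_lock);
    return rc;
}

/* Called from the main thread, for example once per main-loop iteration:
 * take the pending events, then apply them outside the queue lock.
 * Returns the number of events handled. */
static size_t dispatch_pending(void)
{
    struct progress_event batch[QUEUE_CAPACITY];
    size_t n, i;

    pthread_mutex_lock(&queue_lock);
    n = queue_len;
    for (i = 0; i < n; i++)
        batch[i] = queue[i];
    queue_len = 0;
    pthread_mutex_unlock(&queue_lock);

    for (i = 0; i < n; i++)
        displayed_percent = batch[i].percent;   /* main-thread-only work */
    return n;
}

/* Stand-in for the download thread: it reports progress and nothing else. */
static void *worker(void *unused)
{
    (void)unused;
    post_progress(25);
    post_progress(50);
    post_progress(100);
    return NULL;
}

/* Runs one worker to completion, dispatches its events on the calling
 * thread, and returns the last progress value applied (-1 on failure). */
int run_demo(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    dispatch_pending();
    return displayed_percent;
}
```

The point of the design is that no application-level mutex has to be locked in one thread and released in another: the callback thread never holds anything across calls, and all state changes happen on a single thread.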

Comment 12 Joe Marcus Clarke 2006-11-10 00:14:52 UTC

The patch seems to help with the initial crash problem, but there is some strange behavior.  If you try to download two podcasts at once, the progress bar for the first download blinks back and forth between the status for the first download and the status for the second.  Meanwhile, the second podcast remains in a Waiting state.  Once the first one completes, rhythmbox crashes with a SIGSEGV and the following message:

(rhythmbox:33074): RhythmDB-CRITICAL **: rhythmdb_entry_get_string: assertion `entry != NULL' failed

Comment 13 James "Doc" Livingston 2006-11-21 08:31:28 UTC

*** Bug 377581 has been marked as a duplicate of this bug. ***

Comment 14 Jonathan Matthew 2006-11-24 08:14:06 UTC

Created attachment 77090
updated patch

more cleanup, fixes issues mentioned in comment 12, fixes a few minor memory leaks, fixes user-visible spelling mistakes.

Comment 15 Jonathan Matthew 2006-11-24 10:45:51 UTC

Created attachment 77093
further updated patch

also makes download cancellation work.

Comment 16 James "Doc" Livingston 2006-11-29 07:54:05 UTC

Looks okay to me.

Comment 17 Jonathan Matthew 2006-11-30 13:20:35 UTC

Committed (with some further work on download cancellation).

I haven't actually tested this on FreeBSD; if it turns out this doesn't fix it, please reopen the bug.

Comment 18 James "Doc" Livingston 2007-01-21 13:16:17 UTC

*** Bug 363124 has been marked as a duplicate of this bug. ***

Comment 19 James "Doc" Livingston 2007-03-12 07:42:33 UTC

*** Bug 362883 has been marked as a duplicate of this bug. ***

Comment 20 palfrey 2007-04-26 16:45:17 UTC

*** Bug 433542 has been marked as a duplicate of this bug. ***