GNOME Bugzilla – Bug 162999
Thread unsafe image loaders can cause a deadlock
Last modified: 2010-07-10 04:07:59 UTC
Yesterday, I noticed a crash when executing 'vino-preferences'. It triggered an
assertion in NetBSD's libpthread, and then crashed with a SIGABRT:
assertion "next != 0" failed: file
"/home/jmmv/NetBSD/src/lib/libpthread/pthread_run.c", line 130, function
Although cryptic, this means (or at least I think so) that the program has
reached a deadlock condition (i.e., there are no more available threads to run,
hence "next is 0").
Running gdb showed:
Which made me think about some problem in the current theme (Nuvola).
Effectively, I switched to another one, Wasp, and everything started to work
again. Tried to copy a .svg image inside Wasp's directory, using the
gnome-lockscreen.svg name, and the problem reappeared.
I tracked down the problem and found that there is some code in gdk-pixbuf that
issues a "nested lock", i.e., it tries to lock the same mutex from two different
functions from the same calling thread. This only happens when the underlying
image loader (in this case, the svg one) is not thread safe on its own, so
gdk-pixbuf has to ensure mutual exclusive access to it.
More specifically, the gdk_pixbuf_animation_new_from_file (which happens when
the original program uses gtk_image_set_from_file) calls _gdk_pixbuf_lock.
Then, in the critical section, it calls _gdk_pixbuf_generic_image_load, which in
turn calls _gdk_pixbuf_lock again. There it is, the nested lock.
I've fixed this issue by narrowing the critical code block in
gdk_pixbuf_animation_new_from_file. As I see it, the lock is only needed to
ensure mutual exclusive access to the image_module->load_animation function.
All other accesses to the image_module structure in the same function are either
to check if load_animation is not NULL or to get the module's name (module_name
field). These accesses do not need to be in the critical section, because they
are read only (filled at module load time).
Unfortunately, I haven't been able to reproduce this problem with a little
testcase (that initializes threads and calls the offending problem). However,
it's 100% reproducible using vino-preferences and Nuvola. (Dunno if it will
happen in Linux, though, because I'm using NetBSD.)
Created attachment 35464 [details] [review]
This patch is against GTK+ 2.6.0 code. I would have preferred to provide it
against CVS head, but anoncvs.gnome.org seems to be down, so I can't fetch nor
diff the code.
Ok... now that anoncvs seems to be up again, I have updated my sources and found:
2004-12-27 Matthias Clasen <email@example.com>
* gdk-pixbuf-animation.c (gdk_pixbuf_animation_new_from_file)
Avoid deadlock. Pointed out by Callum McKenzie.
So this is already fixed.
However, what do you think about adapting this patch to the existing code and
applying it? This one removes a goto that becomes useless after the fix, while
the current code still keeps it.
Created attachment 35469 [details] [review]
Patch against CVS
This new patch only cleans up the existing code to remove an useless goto
construction (used before the problem was fixed, but not after). Should not
I'd rather leave the code as is