GNOME Bugzilla – Bug 473862
Using non-threadsafe pixbuf loaders at the same time causes deadlocks
Last modified: 2013-06-28 16:03:09 UTC
For example, when you try to use SVG pixbuf loader to load a SVG image with a TIFF image embedded/referenced: boom! We worked around some critical problems in EOG caused by this bug (see bug #447063 and bug #449409) but this is a kind of fundamental problem in the pixbuf loading infrastructure.
yes, it is. Hard to fix, though.
*** Bug 581747 has been marked as a duplicate of this bug. ***
I was able to reproduce this issue from Evolution, as stated in bug #581747 comment #3, but when I try with actual gdk-pixbuf git master, slightly after 2.22.1 version bump, then I do not see the issue. Might it be that most (if not all) pixbuf loaders are thread safe now? I'm also wondering what the lock was supposed to be good for. If it is supposed to prevent using the loader from two threads simultaneously, but it's safe to use it recursively, then it might be enough to use a recursive lock, doesn't it? Of course, it depends on the cause of a thread unsafety. If it's due to some global state variables or something, then the code should be rather changed, than using the lock. But I've no knowledge of gdk-pixbuf internals, so this can be a total off-topic.
(In reply to comment #3) > I was able to reproduce this issue from Evolution, as stated in bug #581747 > comment #3, but when I try with actual gdk-pixbuf git master, slightly after > 2.22.1 version bump, then I do not see the issue. Might it be that most (if not > all) pixbuf loaders are thread safe now? From the loaders that ship with gdk-pixbuf the TIFF loader seems to be the only one left. Looking at gdk-pixbuf-query-loaders' output on my maching there's also librsvg's and libopenraw's loaders which at least don't advertise themselves as threadsafe. > I'm also wondering what the lock was supposed to be good for. If it is supposed > to prevent using the loader from two threads simultaneously, but it's safe to > use it recursively, then it might be enough to use a recursive lock, doesn't > it? > > Of course, it depends on the cause of a thread unsafety. If it's due to some > global state variables or something, then the code should be rather changed, > than using the lock. But I've no knowledge of gdk-pixbuf internals, so this can > be a total off-topic. IIRC with the TIFF loader the problem wasn't the loader but rather they way how libtiff deals with errors (only a single, global error handler without providing any context information?). Concurrently using the loader (actually libtiff) would make it impossible to say in which image loading context the error happened. Not sure if things changed with libtiff-4 there.
(In reply to comment #4) > Looking at gdk-pixbuf-query-loaders' output on my maching there's also > librsvg's and libopenraw's loaders which at least don't advertise themselves > as threadsafe. I see. I had an issue with svg image, as shown in the backtrace at bug #581747 comment #3, but it is strange that only the update to gdk-pixbuf helped, or not?
Similar downstream bug report in Fedora 14: https://bugzilla.redhat.com/show_bug.cgi?id=664453
Hi, all. I'm the one who reported the Fedora 14 bug that Milan linked to. Thanks, Milan, for pointing me to this upstream bug. To summarize my problem, I can consistently reproduce a crash in Evolution when trying to display contacts whose IM info has been used within Pidgin (I use the Evolution plugin). So: Take a contact "Joe Smith" who has an MSN name of "joesmith" listed in the Evolution contact information, but no image (I don't use images for anyone other than myself, appearing in the screensaver lock dialog). This contact "Joe Smith" will be pulled in by Pidgin. Pidgin then seems to do something that adds an image to the Evolution data for that contact. If I try to view the contact within Evolution, it freezes completely. The linked Fedora bug has two pairs of before/after backtraces. I've installed debug info packages for EDS, Evolution and GtkHtml and can install more if you'd like me to do more backtraces. I currently have these packages installed: evolution-2.32.1-1.fc14.x86_64 evolution-data-server-2.32.1-1.fc14.x86_64 pidgin-evolution-2.7.7-1.fc14.x86_64 gtk2-2.22.0-1.fc14.1.x86_64 glib2-2.26.0-2.fc14.x86_64 gdk-pixbuf2-2.22.0-1.fc14.x86_64 evolution-data-server-debuginfo-2.32.1-1.fc14.x86_64 evolution-debuginfo-2.32.1-1.fc14.x86_64 gtkhtml3-debuginfo-3.32.1-1.fc14.x86_64
(In reply to comment #7) > This contact "Joe Smith" will be pulled in by Pidgin. Pidgin then seems to do > something that adds an image to the Evolution data for that contact. If I try > to view the contact within Evolution, it freezes completely. Oh, I forgot to mention, the contact is added from Pidgin by bbdb Evolution plugin, which is named "Automatic Contacts" in Evolution's Edit->Plugins, and its setup is in Edit->Preferences->Mail Preferences->tab Automatic Contacts, section Instant messaging contacts. You've right that the plugin adds images to contacts if they are available from Pidgin. This can be also related to IM information stored on the contact, as most protocols has its own icon shown in the preview pane of the contact, where I believe this is freezing. (Maybe contacts with more than one IM protocol filled?)
(In reply to comment #8) > (In reply to comment #7) > > This contact "Joe Smith" will be pulled in by Pidgin. Pidgin then seems to do > > something that adds an image to the Evolution data for that contact. If I try > > to view the contact within Evolution, it freezes completely. > > Oh, I forgot to mention, the contact is added from Pidgin by bbdb Evolution > plugin, which is named "Automatic Contacts" in Evolution's Edit->Plugins, and > its setup is in Edit->Preferences->Mail Preferences->tab Automatic Contacts, > section Instant messaging contacts. You've right that the plugin adds images to > contacts if they are available from Pidgin. Hm, I hadn't realized that it was two way (Pidgin pulls in contacts, Evolution pulls in avatar images). Maybe my previous version of Evo was old enough that it was only in one direction (Pidgin plugin to get Evo contacts)? > This can be also related to IM information stored on the contact, as most > protocols has its own icon shown in the preview pane of the contact, where I > believe this is freezing. > > (Maybe contacts with more than one IM protocol filled?) Good point! I can confirm that it seems to lock only for contacts with more than one IM protocol. When I view contacts with only one IM username, they display in the preview pane and I can open them in the edit dialog. But any contact with two or more IM usernames seems to lock up. I'm not sure about the case where a contact has more than one username defined for the same IM protocol but not for other protocols. My own contact has two MSN usernames, but it also has other protocols like Yahoo and AIM, so I can't test a "pure" multi-username, mono-protocol case.
(In reply to comment #9) > Hm, I hadn't realized that it was two way (Pidgin pulls in contacts, Evolution > pulls in avatar images). Maybe my previous version of Evo was old enough that > it was only in one direction (Pidgin plugin to get Evo contacts)? I'm not sure what the evolution plugin in Pidgin does, to be honest, but the one in Evolution, I described above, reads list of contacts from the pidgin's contact list and adds these contacts into configured address book with as much information as it is able to gather. If similar contact already exists, then it merges the information into it.
(In reply to comment #10) > (In reply to comment #9) > > Hm, I hadn't realized that it was two way (Pidgin pulls in contacts, Evolution > > pulls in avatar images). Maybe my previous version of Evo was old enough that > > it was only in one direction (Pidgin plugin to get Evo contacts)? > > I'm not sure what the evolution plugin in Pidgin does, to be honest, but the > one in Evolution, I described above, reads list of contacts from the pidgin's > contact list and adds these contacts into configured address book with as much > information as it is able to gather. If similar contact already exists, then it > merges the information into it. The Pidgin's Evo plugin, to my understanding, basically reads the Evo contacts and compares IM usernames to its buddy list. If the username doesn't exist as a buddy, it automates the invitation to that person; else, it doesn't do anything. I looked at the Evo plugin preferences (Preferences : Mail Preferences : Automatic Contacts). I'd forgotten to try that before replying earlier. As it turns out, neither option is checked (neither "Create address book entries when sending mails" nor "Synchronize contact info and images from Pidgin buddy list"). And I know that I haven't clicked the button "Synchronize with buddy list now". So I'm uncertain how it happened, despite the freeze consistently resulting from the same type of contact (having more than one IM protocol). On a tangent, I'm wondering if an endless loop couldn't possibly result from Evo's "Automatic Contacts" plugin syncing with Pidgin, and Pidgin's "Evolution Integration" syncing with Evo. At least both are disabled on my computer for now...
(In reply to comment #11) > On a tangent, I'm wondering if an endless loop couldn't possibly result from > Evo's "Automatic Contacts" plugin syncing with Pidgin, and Pidgin's "Evolution > Integration" syncing with Evo. At least both are disabled on my computer for > now... No, either plugin is only a possibility how was created a contact with multiple IM accounts in Evolution. Otherwise the plugin doesn't do anything special and has no role. You can do the same when you create a new contacts with manually filled IM protocols. Also, it's not an endless loop, it's waiting for a lock until it's unlocked, but the unlock is missing, somehow. And that's what this bug is about.
(In reply to comment #12) > (In reply to comment #11) > > On a tangent, I'm wondering if an endless loop couldn't possibly result from > > Evo's "Automatic Contacts" plugin syncing with Pidgin, and Pidgin's "Evolution > > Integration" syncing with Evo. At least both are disabled on my computer for > > now... > > No, either plugin is only a possibility how was created a contact with multiple > IM accounts in Evolution. Otherwise the plugin doesn't do anything special and > has no role. You can do the same when you create a new contacts with manually > filled IM protocols. > > Also, it's not an endless loop, it's waiting for a lock until it's unlocked, > but the unlock is missing, somehow. And that's what this bug is about. Sorry, my aside was meant to be about the sync process and pondering a potential loop because each client observes changes by the other (i.e. [new] Pidgin contact -> [causes update of] Evo contact -> [causes update of] Pidgin contact -> ...) I didn't mean to confuse the topic, which is a deadlock in the image processing. I'll stick to talking the bug in the future. ;-)
*** Bug 640631 has been marked as a duplicate of this bug. ***
*** Bug 634180 has been marked as a duplicate of this bug. ***
Hi, all. Just wanted to check on about any progress on this bug. Anything I could test? The impact of this bug in Evolution has pretty much made it impossible for me to use Evo. I'm hoping that there could be a fix in time for Gnome 3, but honestly at this point I'd settle for a workaround.
The fix for this from the Evolution side is to stop using threads for this stuff and do it all asynchronously from the main loop. Unfortunately that's not gonna land in time for 3.0.
(In reply to comment #17) > The fix for this from the Evolution side is to stop using threads for this > stuff and do it all asynchronously from the main loop. Unfortunately that's > not gonna land in time for 3.0. I do not think it's a point here, especially if you look into the real backtrace, then you can see evo *does* this asynchronously, in the main thread (taken from bug at comment #6, other threads in that backtrace are just camel-db sync threads in idle). Not everything can be fixed by async, really. :) A workaround is to change IM protocol icons to not use svg format, in that case will not be used svg pixbuf loader and everything will work as expected (supposing the new format pixbuf loader does not suffer of the same issue). In other words, if I recall correctly, the image format contains inside it another image, both loaders are marked as thread unsafe, and both are locking same lock, though they are different loaders, which results in this freeze. (I think they are different loaders, because having embedded same image format into image itself (like svg in svg) seems pretty strange to me. Nonetheless, I'm not any expert in gdk-pixbuf and this all is just taken from my poor memory from times when I was investigating this.)
+ Trace 226240
Thread 1 (Thread 0x7fd876ef0980 (LWP 5213))
heh, better backtrace can be seen in bug #581747
In my opinion the best start for a workaround would be to make the global pixbuf lock that is held by the non-threadsafe loaders bound to the single loaders and only lock it during gdk-pixbuf-loader-interface calls. I think would leave only the usecase where non-threadsafe loaders use themselves recursively (which I think happens pretty seldom) and resolve the more common use case that a non-threadsafe loader uses another non-threadsafe loader causing a deadlock (which is what caused this bug originally). Even the recursive case could possibly be resolved using a recursive lock (depends on the loader I guess). Besides, one could also investigate if the non-threadsafe loaders on one's system are really non-threadsafe or if they were just not marked as threadsafe.
Any progress or news on this issue? The problem is currently blocking me from properly using the contacts in Evolution on Fedora 14 (x86_64) and I am unable to find any workaround for it (except possibly removing all contacts and not adding any with multiple IMs, which is not something I would like to do). It used to work just fine in Fedora 8 with evolution 2.12.3. I would be happy to test any packages for Fedora or try workarounds.
Wanted to check in again to see if this issue could be fixed for the 3.2 cycle...
(In reply to comment #22) > Wanted to check in again to see if this issue could be fixed for the 3.2 > cycle... ping John
*** Bug 650492 has been marked as a duplicate of this bug. ***
*** Bug 654610 has been marked as a duplicate of this bug. ***
*** Bug 651107 has been marked as a duplicate of this bug. ***
I created commit 473862 in evolution for 3.1.90 as a workaround till this bug is fixed.
*** Bug 651941 has been marked as a duplicate of this bug. ***
(In reply to comment #23) > (In reply to comment #22) > > Wanted to check in again to see if this issue could be fixed for the 3.2 > > cycle... > > ping John Hi, Milan. I'll definitely test this, I just can't right away. I'll probably get to it in the next month, just so you know not to close the bug yet. I only have two partitions that I can test on and both are being used right now.
(In reply to comment #27) > I created commit 473862 in evolution for 3.1.90 as a workaround till this bug > is fixed. 473862 is this bug number, is that the commit too? I wanted to check it out of curiosity, but couldn't find it using that number.
(In reply to comment #30) > (In reply to comment #27) > > I created commit 473862 in evolution for 3.1.90 as a workaround till this bug > > is fixed. > > 473862 is this bug number, is that the commit too? I wanted to check it out of > curiosity, but couldn't find it using that number. Ouch, human error, the commit number is cfd3c3116 in evo's master and ccbe39a7c in gnome-3-0 branch (3.0.3+).
*** Bug 657853 has been marked as a duplicate of this bug. ***
*** Bug 657497 has been marked as a duplicate of this bug. ***
(In reply to comment #31) > (In reply to comment #30) > > (In reply to comment #27) > > > I created commit 473862 in evolution for 3.1.90 as a workaround till this bug > > > is fixed. > > > > 473862 is this bug number, is that the commit too? I wanted to check it out of > > curiosity, but couldn't find it using that number. > > Ouch, human error, the commit number is cfd3c3116 in evo's master and ccbe39a7c > in gnome-3-0 branch (3.0.3+). No worries, better to have humans around. :-) Wow, all these problems for a few decorations?! I thought it had something to do with the dialog, and Evo's fetching the user's avatar from various IM services. But it's just a redundant icon to represent the services themselves. I'd much rather have my data safe than lots of icons... ;-)
(In reply to comment #29) > (In reply to comment #23) > > (In reply to comment #22) > > > Wanted to check in again to see if this issue could be fixed for the 3.2 > > > cycle... > > > > ping John > > Hi, Milan. I'll definitely test this, I just can't right away. I'll probably > get to it in the next month, just so you know not to close the bug yet. I only > have two partitions that I can test on and both are being used right now. Sorry that it took me so long to test, Milan. Thanks for pushing the workaround as soon as you did. It ended up being a case of the mountain coming to Mohammad... I can confirm that Evo no longer freezes for me, as of 3.0.3. I can successfully view and edit my contacts who have more than one IM service. So while the larger issue might remain, I no longer have problems with Evolution because it avoids using the problematic API call. Thanks!
Actually, you can even reproduce it more trivially. I think the core problem is that GdkPixbufLoader takes the lock in one function (gdk_pixbuf_loader_load_module()) and releases it in another (gdk_pixbuf_loader_close()).. This sounds likea big no no... I guess the non-thread safe loaders should probably be ignored and be re-written in a thread safe way. One possibility is to store the pre-parsed data and re-invoke the entire loader every time. Yes, it's more work, but it's probably the only way to do it if the underlying library isn't async/thread safe. #include <gdk-pixbuf/gdk-pixbuf.h> int main(int argc, char **argv) { GFile *file; GFileInputStream *input_s; GdkPixbufLoader *l1, *l2; gchar buf[15000]; gsize bytes; gboolean ret; g_type_init(); /* Note any file with a non-thread safe loader will do, the bug is in gdk-pixbuf */ file = g_file_new_for_path("/usr/share/icons/HighContrast/scalable/emotes/face-plain.svg"); g_assert(file); input_s = g_file_read (file, NULL, NULL); g_assert(input_s); ret = g_input_stream_read_all(G_INPUT_STREAM(input_s), buf, 15000, &bytes, NULL, NULL); g_assert(ret); g_debug("size: %zu", bytes); l1 = gdk_pixbuf_loader_new (); l2 = gdk_pixbuf_loader_new (); ret = gdk_pixbuf_loader_write(l1, buf, bytes, NULL); g_assert (ret); ret = gdk_pixbuf_loader_write(l2, buf, bytes, NULL); g_assert (ret); return 0; }
*** Bug 590788 has been marked as a duplicate of this bug. ***
> One possibility is to store the pre-parsed data and re-invoke the entire > loader every time. I don't think this is possible as we're generally relying on the loader to signal when we don't need to read more. So, i think the realistic plan is to fix pango (bug 377539), mark svg loader as threadsafe and then stop loading non-threadsafe pixbuf loaders with a huge warning that they need to be fixed.
Created attachment 232599 [details] [review] loader: Only take lock when calling into module code Instead of taking the lock on the first write to the loader and releasing it when the loader is closed, take the lock in a finer-grained manner only when calling into the module vfuncs.
Created attachment 232631 [details] [review] ignore non-threadsafe modules
Created attachment 232632 [details] [review] Update docs
Created attachment 232634 [details] [review] Remove locking
Should hold off on this until we have pango and librsvg releases
This should be finally fixed now.
*** Bug 693408 has been marked as a duplicate of this bug. ***
For others coming to this bug, it seems this was released in gdk-pixbuf 2.28.0. pango >= 1.32.6 should include the fixes on the pango side. If you're on Fedora, this means that it should be fixed in the upcoming Fedora 19.