After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 473862 - Using non-threadsafe pixbuf loaders at the same time causes deadlocks
Using non-threadsafe pixbuf loaders at the same time causes deadlocks
Status: RESOLVED FIXED
Product: gdk-pixbuf
Classification: Platform
Component: general
git master
Other Linux
: Normal major
: ---
Assigned To: gdk-pixbuf-maint
gdk-pixbuf-maint
: 581747 590788 634180 640631 650492 651107 651941 654610 657497 657853 693408 (view as bug list)
Depends on: pango-threadsafe 691068
Blocks: 680373
 
 
Reported: 2007-09-05 10:24 UTC by Lucas Rocha
Modified: 2013-06-28 16:03 UTC
See Also:
GNOME target: ---
GNOME version: 2.19/2.20


Attachments
loader: Only take lock when calling into module code (5.26 KB, patch)
2013-01-03 00:35 UTC, Benjamin Otte (Company)
none Details | Review
ignore non-threadsafe modules (891 bytes, patch)
2013-01-03 13:08 UTC, Matthias Clasen
none Details | Review
Update docs (1.17 KB, patch)
2013-01-03 13:09 UTC, Matthias Clasen
none Details | Review
Remove locking (8.84 KB, patch)
2013-01-03 13:09 UTC, Matthias Clasen
none Details | Review

Description Lucas Rocha 2007-09-05 10:24:14 UTC
For example, when you try to use SVG pixbuf loader to load a SVG image with a TIFF image embedded/referenced: boom!

We worked around some critical problems in EOG caused by this bug (see bug #447063 and bug #449409) but this is a kind of fundamental problem in the pixbuf loading infrastructure.
Comment 1 Matthias Clasen 2007-09-06 19:19:53 UTC
yes, it is. Hard to fix, though.
Comment 2 Milan Crha 2010-09-22 15:28:34 UTC
*** Bug 581747 has been marked as a duplicate of this bug. ***
Comment 3 Milan Crha 2010-09-22 15:35:23 UTC
I was able to reproduce this issue from Evolution, as stated in bug #581747 comment #3, but when I try with actual gdk-pixbuf git master, slightly after 2.22.1 version bump, then I do not see the issue. Might it be that most (if not all) pixbuf loaders are thread safe now?

I'm also wondering what the lock was supposed to be good for. If it is supposed to prevent using the loader from two threads simultaneously, but it's safe to use it recursively, then it might be enough to use a recursive lock, doesn't it?

Of course, it depends on the cause of a thread unsafety. If it's due to some global state variables or something, then the code should be rather changed, than using the lock. But I've no knowledge of gdk-pixbuf internals, so this can be a total off-topic.
Comment 4 Felix Riemann 2010-09-22 16:15:13 UTC
(In reply to comment #3)
> I was able to reproduce this issue from Evolution, as stated in bug #581747
> comment #3, but when I try with actual gdk-pixbuf git master, slightly after
> 2.22.1 version bump, then I do not see the issue. Might it be that most (if not
> all) pixbuf loaders are thread safe now?

From the loaders that ship with gdk-pixbuf the TIFF loader seems to be the only one left.

Looking at gdk-pixbuf-query-loaders' output on my maching there's also librsvg's and libopenraw's loaders which at least don't advertise themselves as threadsafe.

> I'm also wondering what the lock was supposed to be good for. If it is supposed
> to prevent using the loader from two threads simultaneously, but it's safe to
> use it recursively, then it might be enough to use a recursive lock, doesn't
> it?
> 
> Of course, it depends on the cause of a thread unsafety. If it's due to some
> global state variables or something, then the code should be rather changed,
> than using the lock. But I've no knowledge of gdk-pixbuf internals, so this can
> be a total off-topic.

IIRC with the TIFF loader the problem wasn't the loader but rather they way how libtiff deals with errors (only a single, global error handler without providing any context information?). Concurrently using the loader (actually libtiff) would make it impossible to say in which image loading context the error happened. Not sure if things changed with libtiff-4 there.
Comment 5 Milan Crha 2010-09-23 07:22:27 UTC
(In reply to comment #4)
> Looking at gdk-pixbuf-query-loaders' output on my maching there's also
> librsvg's and libopenraw's loaders which at least don't advertise themselves
> as threadsafe.

I see. I had an issue with svg image, as shown in the backtrace at bug #581747
comment #3, but it is strange that only the update to gdk-pixbuf helped, or not?
Comment 6 Milan Crha 2011-01-05 14:34:04 UTC
Similar downstream bug report in Fedora 14:
https://bugzilla.redhat.com/show_bug.cgi?id=664453
Comment 7 John Keller 2011-01-05 15:49:26 UTC
Hi, all. I'm the one who reported the Fedora 14 bug that Milan linked to. Thanks, Milan, for pointing me to this upstream bug.

To summarize my problem, I can consistently reproduce a crash in Evolution when trying to display contacts whose IM info has been used within Pidgin (I use the Evolution plugin).

So: Take a contact "Joe Smith" who has an MSN name of "joesmith" listed in the Evolution contact information, but no image (I don't use images for anyone other than myself, appearing in the screensaver lock dialog).

This contact "Joe Smith" will be pulled in by Pidgin. Pidgin then seems to do something that adds an image to the Evolution data for that contact. If I try to view the contact within Evolution, it freezes completely.


The linked Fedora bug has two pairs of before/after backtraces. I've installed debug info packages for EDS, Evolution and GtkHtml and can install more if you'd like me to do more backtraces.


I currently have these packages installed:

evolution-2.32.1-1.fc14.x86_64
evolution-data-server-2.32.1-1.fc14.x86_64

pidgin-evolution-2.7.7-1.fc14.x86_64

gtk2-2.22.0-1.fc14.1.x86_64

glib2-2.26.0-2.fc14.x86_64

gdk-pixbuf2-2.22.0-1.fc14.x86_64

evolution-data-server-debuginfo-2.32.1-1.fc14.x86_64
evolution-debuginfo-2.32.1-1.fc14.x86_64
gtkhtml3-debuginfo-3.32.1-1.fc14.x86_64
Comment 8 Milan Crha 2011-01-05 18:50:19 UTC
(In reply to comment #7)
> This contact "Joe Smith" will be pulled in by Pidgin. Pidgin then seems to do
> something that adds an image to the Evolution data for that contact. If I try
> to view the contact within Evolution, it freezes completely.

Oh, I forgot to mention, the contact is added from Pidgin by bbdb Evolution plugin, which is named "Automatic Contacts" in Evolution's Edit->Plugins, and its setup is in Edit->Preferences->Mail Preferences->tab Automatic Contacts, section Instant messaging contacts. You've right that the plugin adds images to contacts if they are available from Pidgin.

This can be also related to IM information stored on the contact, as most protocols has its own icon shown in the preview pane of the contact, where I believe this is freezing.

(Maybe contacts with more than one IM protocol filled?)
Comment 9 John Keller 2011-01-05 19:02:19 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > This contact "Joe Smith" will be pulled in by Pidgin. Pidgin then seems to do
> > something that adds an image to the Evolution data for that contact. If I try
> > to view the contact within Evolution, it freezes completely.
> 
> Oh, I forgot to mention, the contact is added from Pidgin by bbdb Evolution
> plugin, which is named "Automatic Contacts" in Evolution's Edit->Plugins, and
> its setup is in Edit->Preferences->Mail Preferences->tab Automatic Contacts,
> section Instant messaging contacts. You've right that the plugin adds images to
> contacts if they are available from Pidgin.

Hm, I hadn't realized that it was two way (Pidgin pulls in contacts, Evolution pulls in avatar images). Maybe my previous version of Evo was old enough that it was only in one direction (Pidgin plugin to get Evo contacts)?

> This can be also related to IM information stored on the contact, as most
> protocols has its own icon shown in the preview pane of the contact, where I
> believe this is freezing.
> 
> (Maybe contacts with more than one IM protocol filled?)

Good point! I can confirm that it seems to lock only for contacts with more than one IM protocol. When I view contacts with only one IM username, they display in the preview pane and I can open them in the edit dialog. But any contact with two or more IM usernames seems to lock up.

I'm not sure about the case where a contact has more than one username defined for the same IM protocol but not for other protocols. My own contact has two MSN usernames, but it also has other protocols like Yahoo and AIM, so I can't test a "pure" multi-username, mono-protocol case.
Comment 10 Milan Crha 2011-01-05 19:35:50 UTC
(In reply to comment #9)
> Hm, I hadn't realized that it was two way (Pidgin pulls in contacts, Evolution
> pulls in avatar images). Maybe my previous version of Evo was old enough that
> it was only in one direction (Pidgin plugin to get Evo contacts)?

I'm not sure what the evolution plugin in Pidgin does, to be honest, but the one in Evolution, I described above, reads list of contacts from the pidgin's contact list and adds these contacts into configured address book with as much information as it is able to gather. If similar contact already exists, then it merges the information into it.
Comment 11 John Keller 2011-01-05 21:07:36 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Hm, I hadn't realized that it was two way (Pidgin pulls in contacts, Evolution
> > pulls in avatar images). Maybe my previous version of Evo was old enough that
> > it was only in one direction (Pidgin plugin to get Evo contacts)?
> 
> I'm not sure what the evolution plugin in Pidgin does, to be honest, but the
> one in Evolution, I described above, reads list of contacts from the pidgin's
> contact list and adds these contacts into configured address book with as much
> information as it is able to gather. If similar contact already exists, then it
> merges the information into it.

The Pidgin's Evo plugin, to my understanding, basically reads the Evo contacts and compares IM usernames to its buddy list. If the username doesn't exist as a buddy, it automates the invitation to that person; else, it doesn't do anything.

I looked at the Evo plugin preferences (Preferences : Mail Preferences : Automatic Contacts). I'd forgotten to try that before replying earlier.

As it turns out, neither option is checked (neither "Create address book entries when sending mails" nor "Synchronize contact info and images from Pidgin buddy list"). And I know that I haven't clicked the button "Synchronize with buddy list now".

So I'm uncertain how it happened, despite the freeze consistently resulting from the same type of contact (having more than one IM protocol).

On a tangent, I'm wondering if an endless loop couldn't possibly result from Evo's "Automatic Contacts" plugin syncing with Pidgin, and Pidgin's "Evolution Integration" syncing with Evo. At least both are disabled on my computer for now...
Comment 12 Milan Crha 2011-01-06 10:09:07 UTC
(In reply to comment #11)
> On a tangent, I'm wondering if an endless loop couldn't possibly result from
> Evo's "Automatic Contacts" plugin syncing with Pidgin, and Pidgin's "Evolution
> Integration" syncing with Evo. At least both are disabled on my computer for
> now...

No, either plugin is only a possibility how was created a contact with multiple IM accounts in Evolution. Otherwise the plugin doesn't do anything special and has no role. You can do the same when you create a new contacts with manually filled IM protocols.

Also, it's not an endless loop, it's waiting for a lock until it's unlocked, but the unlock is missing, somehow. And that's what this bug is about.
Comment 13 John Keller 2011-01-06 11:10:14 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > On a tangent, I'm wondering if an endless loop couldn't possibly result from
> > Evo's "Automatic Contacts" plugin syncing with Pidgin, and Pidgin's "Evolution
> > Integration" syncing with Evo. At least both are disabled on my computer for
> > now...
> 
> No, either plugin is only a possibility how was created a contact with multiple
> IM accounts in Evolution. Otherwise the plugin doesn't do anything special and
> has no role. You can do the same when you create a new contacts with manually
> filled IM protocols.
> 
> Also, it's not an endless loop, it's waiting for a lock until it's unlocked,
> but the unlock is missing, somehow. And that's what this bug is about.

Sorry, my aside was meant to be about the sync process and pondering a potential loop because each client observes changes by the other (i.e. [new] Pidgin contact -> [causes update of] Evo contact -> [causes update of] Pidgin contact -> ...)

I didn't mean to confuse the topic, which is a deadlock in the image processing. I'll stick to talking the bug in the future. ;-)
Comment 14 Milan Crha 2011-02-21 10:24:50 UTC
*** Bug 640631 has been marked as a duplicate of this bug. ***
Comment 15 Christian Persch 2011-02-21 20:51:24 UTC
*** Bug 634180 has been marked as a duplicate of this bug. ***
Comment 16 John Keller 2011-03-08 16:24:41 UTC
Hi, all. Just wanted to check on about any progress on this bug. Anything I could test?

The impact of this bug in Evolution has pretty much made it impossible for me to use Evo. I'm hoping that there could be a fix in time for Gnome 3, but honestly at this point I'd settle for a workaround.
Comment 17 Matthew Barnes 2011-03-08 17:12:38 UTC
The fix for this from the Evolution side is to stop using threads for this stuff and do it all asynchronously from the main loop.  Unfortunately that's not gonna land in time for 3.0.
Comment 18 Milan Crha 2011-03-09 07:32:44 UTC
(In reply to comment #17)
> The fix for this from the Evolution side is to stop using threads for this
> stuff and do it all asynchronously from the main loop.  Unfortunately that's
> not gonna land in time for 3.0.

I do not think it's a point here, especially if you look into the real backtrace, then you can see evo *does* this asynchronously, in the main thread (taken from bug at comment #6, other threads in that backtrace are just camel-db sync threads in idle). Not everything can be fixed by async, really. :)

A workaround is to change IM protocol icons to not use svg format, in that case will not be used svg pixbuf loader and everything will work as expected (supposing the new format pixbuf loader does not suffer of the same issue). In other words, if I recall correctly, the image format contains inside it another image, both loaders are marked as thread unsafe, and both are locking same lock, though they are different loaders, which results in this freeze. (I think they are different loaders, because having embedded same image format into image itself (like svg in svg) seems pretty strange to me. Nonetheless, I'm not any expert in gdk-pixbuf and this all is just taken from my poor memory from times when I was investigating this.)

Thread 1 (Thread 0x7fd876ef0980 (LWP 5213))

  • #0 __lll_lock_wait
    from /lib64/libpthread.so.0
  • #1 _L_lock_997
    from /lib64/libpthread.so.0
  • #2 pthread_mutex_lock
    from /lib64/libpthread.so.0
  • #3 ??
    from /usr/lib64/libgdk_pixbuf-2.0.so.0
  • #4 ??
    from /usr/lib64/libgdk_pixbuf-2.0.so.0
  • #5 gdk_pixbuf_loader_write
    from /usr/lib64/libgdk_pixbuf-2.0.so.0
  • #6 web_view_request_stream_read_cb
    at e-web-view.c line 239
  • #7 ??
    from /lib64/libgio-2.0.so.0
  • #8 ??
    from /lib64/libgio-2.0.so.0
  • #9 g_main_context_dispatch
    from /lib64/libglib-2.0.so.0
  • #10 ??
    from /lib64/libglib-2.0.so.0
  • #11 g_main_loop_run
    from /lib64/libglib-2.0.so.0
  • #12 gtk_main
    from /usr/lib64/libgtk-x11-2.0.so.0
  • #13 main
    at main.c line 679

Comment 19 Milan Crha 2011-03-09 07:34:25 UTC
heh, better backtrace can be seen in bug #581747
Comment 20 Felix Riemann 2011-03-09 17:25:52 UTC
In my opinion the best start for a workaround would be to make the global pixbuf lock that is held by the non-threadsafe loaders bound to the single loaders and only lock it during gdk-pixbuf-loader-interface calls. I think would leave only the usecase where non-threadsafe loaders use themselves recursively (which I think happens pretty seldom) and resolve the more common use case that a non-threadsafe loader uses another non-threadsafe loader causing a deadlock (which is what caused this bug originally).

Even the recursive case could possibly be resolved using a recursive lock (depends on the loader I guess).

Besides, one could also investigate if the non-threadsafe loaders on one's system are really non-threadsafe or if they were just not marked as threadsafe.
Comment 21 henning.noren 2011-04-19 23:57:14 UTC
Any progress or news on this issue?
The problem is currently blocking me from properly using the contacts in Evolution on Fedora 14 (x86_64) and I am unable to find any workaround for it (except possibly removing all contacts and not adding any with multiple IMs, which is not something I would like to do).

It used to work just fine in Fedora 8 with evolution 2.12.3.

I would be happy to test any packages for Fedora or try workarounds.
Comment 22 John Keller 2011-05-15 22:30:43 UTC
Wanted to check in again to see if this issue could be fixed for the 3.2 cycle...
Comment 23 Milan Crha 2011-08-18 07:04:26 UTC
(In reply to comment #22)
> Wanted to check in again to see if this issue could be fixed for the 3.2
> cycle...

ping John
Comment 24 Milan Crha 2011-08-18 07:04:35 UTC
*** Bug 650492 has been marked as a duplicate of this bug. ***
Comment 25 Milan Crha 2011-08-18 07:05:07 UTC
*** Bug 654610 has been marked as a duplicate of this bug. ***
Comment 26 Milan Crha 2011-08-18 07:08:19 UTC
*** Bug 651107 has been marked as a duplicate of this bug. ***
Comment 27 Milan Crha 2011-08-18 07:22:32 UTC
I created commit 473862 in evolution for 3.1.90 as a workaround till this bug is fixed.
Comment 28 Milan Crha 2011-08-22 13:45:20 UTC
*** Bug 651941 has been marked as a duplicate of this bug. ***
Comment 29 John Keller 2011-08-30 07:49:09 UTC
(In reply to comment #23)
> (In reply to comment #22)
> > Wanted to check in again to see if this issue could be fixed for the 3.2
> > cycle...
> 
> ping John

Hi, Milan. I'll definitely test this, I just can't right away. I'll probably get to it in the next month, just so you know not to close the bug yet. I only have two partitions that I can test on and both are being used right now.
Comment 30 John Keller 2011-08-30 07:50:16 UTC
(In reply to comment #27)
> I created commit 473862 in evolution for 3.1.90 as a workaround till this bug
> is fixed.

473862 is this bug number, is that the commit too? I wanted to check it out of curiosity, but couldn't find it using that number.
Comment 31 Milan Crha 2011-08-31 05:48:40 UTC
(In reply to comment #30)
> (In reply to comment #27)
> > I created commit 473862 in evolution for 3.1.90 as a workaround till this bug
> > is fixed.
> 
> 473862 is this bug number, is that the commit too? I wanted to check it out of
> curiosity, but couldn't find it using that number.

Ouch, human error, the commit number is cfd3c3116 in evo's master and ccbe39a7c in gnome-3-0 branch (3.0.3+).
Comment 32 Sam Morris 2011-09-05 12:57:48 UTC
*** Bug 657853 has been marked as a duplicate of this bug. ***
Comment 33 Milan Crha 2011-09-07 11:39:04 UTC
*** Bug 657497 has been marked as a duplicate of this bug. ***
Comment 34 John Keller 2011-09-14 08:36:02 UTC
(In reply to comment #31)
> (In reply to comment #30)
> > (In reply to comment #27)
> > > I created commit 473862 in evolution for 3.1.90 as a workaround till this bug
> > > is fixed.
> > 
> > 473862 is this bug number, is that the commit too? I wanted to check it out of
> > curiosity, but couldn't find it using that number.
> 
> Ouch, human error, the commit number is cfd3c3116 in evo's master and ccbe39a7c
> in gnome-3-0 branch (3.0.3+).

No worries, better to have humans around. :-)

Wow, all these problems for a few decorations?! I thought it had something to do with the dialog, and Evo's fetching the user's avatar from various IM services. But it's just a redundant icon to represent the services themselves. I'd much rather have my data safe than lots of icons... ;-)
Comment 35 John Keller 2011-09-14 08:39:28 UTC
(In reply to comment #29)
> (In reply to comment #23)
> > (In reply to comment #22)
> > > Wanted to check in again to see if this issue could be fixed for the 3.2
> > > cycle...
> > 
> > ping John
> 
> Hi, Milan. I'll definitely test this, I just can't right away. I'll probably
> get to it in the next month, just so you know not to close the bug yet. I only
> have two partitions that I can test on and both are being used right now.

Sorry that it took me so long to test, Milan. Thanks for pushing the workaround as soon as you did. It ended up being a case of the mountain coming to Mohammad...

I can confirm that Evo no longer freezes for me, as of 3.0.3. I can successfully view and edit my contacts who have more than one IM service.

So while the larger issue might remain, I no longer have problems with Evolution because it avoids using the problematic API call. Thanks!
Comment 36 Olivier Crête 2011-12-19 06:17:19 UTC
Actually, you can even reproduce it more trivially. I think the core problem is that GdkPixbufLoader takes the lock in one function (gdk_pixbuf_loader_load_module()) and releases it in another (gdk_pixbuf_loader_close()).. This sounds likea big no no...

I guess the non-thread safe loaders should probably be ignored and be re-written in a thread safe way.  One possibility is to store the pre-parsed data and re-invoke the entire loader every time. Yes, it's more work, but it's probably the only way to do it if the underlying library isn't async/thread safe.

#include <gdk-pixbuf/gdk-pixbuf.h>

int main(int argc, char **argv)
{
  GFile *file;
  GFileInputStream *input_s;
  GdkPixbufLoader *l1, *l2;
  gchar buf[15000];
  gsize bytes;
  gboolean ret;

  g_type_init();

  /* Note any file with a non-thread safe loader will do, the bug is in gdk-pixbuf */
  file = g_file_new_for_path("/usr/share/icons/HighContrast/scalable/emotes/face-plain.svg");
  g_assert(file);

  input_s = g_file_read (file, NULL, NULL);
  g_assert(input_s);

  ret = g_input_stream_read_all(G_INPUT_STREAM(input_s), buf, 15000, &bytes, NULL, NULL);
  g_assert(ret);
  g_debug("size: %zu", bytes);

  l1 = gdk_pixbuf_loader_new ();
  l2 = gdk_pixbuf_loader_new (); 

  ret = gdk_pixbuf_loader_write(l1, buf, bytes, NULL);
  g_assert (ret);

  ret = gdk_pixbuf_loader_write(l2, buf, bytes, NULL);
  g_assert (ret);

  return 0;
}
Comment 37 Christian Persch 2012-01-29 18:21:10 UTC
*** Bug 590788 has been marked as a duplicate of this bug. ***
Comment 38 Alexander Larsson 2012-09-20 16:01:18 UTC
> One possibility is to store the pre-parsed data and re-invoke the entire 
> loader every time.

I don't think this is possible as we're generally relying on the loader to signal when we don't need to read more.

So, i think the realistic plan is to fix pango (bug 377539), mark svg loader as threadsafe and then stop loading non-threadsafe pixbuf loaders with a huge warning that they need to be fixed.
Comment 39 Benjamin Otte (Company) 2013-01-03 00:35:36 UTC
Created attachment 232599 [details] [review]
loader: Only take lock when calling into module code

Instead of taking the lock on the first write to the loader and
releasing it when the loader is closed, take the lock in a finer-grained
manner only when calling into the module vfuncs.
Comment 40 Matthias Clasen 2013-01-03 13:08:58 UTC
Created attachment 232631 [details] [review]
ignore non-threadsafe modules
Comment 41 Matthias Clasen 2013-01-03 13:09:32 UTC
Created attachment 232632 [details] [review]
Update docs
Comment 42 Matthias Clasen 2013-01-03 13:09:56 UTC
Created attachment 232634 [details] [review]
Remove locking
Comment 43 Matthias Clasen 2013-01-03 13:10:35 UTC
Should hold off on this until we have pango and librsvg releases
Comment 44 Cosimo Cecchi 2013-02-08 21:59:48 UTC
This should be finally fixed now.
Comment 45 Cosimo Cecchi 2013-02-08 22:00:29 UTC
*** Bug 693408 has been marked as a duplicate of this bug. ***
Comment 46 Robin Stocker 2013-06-28 16:03:09 UTC
For others coming to this bug, it seems this was released in gdk-pixbuf 2.28.0. pango >= 1.32.6 should include the fixes on the pango side. If you're on Fedora, this means that it should be fixed in the upcoming Fedora 19.