GNOME Bugzilla – Bug 336803
thread expander drawn in wrong place, occasionally causes crash with libX11 'BadAlloc' error
Last modified: 2010-05-20 20:31:38 UTC
Steps to reproduce: Run evolution (from Fedora Core 5) for any length of time.

The program 'evolution' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadAlloc (insufficient resources for operation)'.
  (Details: serial 2108444 error_code 11 request_code 53 minor_code 0)

Stack trace:
Other information:
This seems to be happening quite frequently, especially when using large mailboxes.
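For readers unfamiliar with the "(Details: ...)" line: the numbers are core X protocol codes, where error_code 11 is BadAlloc and request_code 53 is CreatePixmap. The following is a minimal sketch of decoding that line by hand. The struct and function names are hypothetical, invented for this illustration, and the lookup tables are deliberately partial.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields from the "(Details: ...)" line of a GDK X error report.
 * Hypothetical type, for illustration only. */
struct x_error_details {
    unsigned long serial;
    int error_code;
    int request_code;
    int minor_code;
};

/* Parse "serial N error_code N request_code N minor_code N".
 * Returns 1 on success, 0 if the text does not match. */
static int parse_details(const char *text, struct x_error_details *out)
{
    assert(text != NULL && out != NULL);
    return sscanf(text,
                  "serial %lu error_code %d request_code %d minor_code %d",
                  &out->serial, &out->error_code,
                  &out->request_code, &out->minor_code) == 4;
}

/* Core protocol error codes; only a handful are listed here. */
static const char *error_name(int code)
{
    switch (code) {
    case 2:  return "BadValue";
    case 4:  return "BadPixmap";
    case 9:  return "BadDrawable";
    case 11: return "BadAlloc";
    default: return "(not in this table)";
    }
}

/* Core protocol major opcodes; again a partial table. */
static const char *request_name(int code)
{
    switch (code) {
    case 53: return "CreatePixmap";
    case 54: return "FreePixmap";
    case 55: return "CreateGC";
    default: return "(not in this table)";
    }
}
```

Fed the details string from this report, error_name() gives "BadAlloc" and request_name() gives "CreatePixmap", so the failing request is a pixmap allocation.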
Assigning this to the mailer hackers for follow-up.
Without a backtrace, nothing can be done. No one else seems to be able to replicate the problem except you.
Not entirely sure how to get such a backtrace. GDB isn't being wonderfully helpful...

The program 'evolution-2.6' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadAlloc (insufficient resources for operation)'.
  (Details: serial 1351269 error_code 11 request_code 53 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)

Program exited with code 01.
(gdb) break gdk_x_error
Function "gdk_x_error" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (gdk_x_error) pending.
(gdb) run
Breakpoint on exit() works a little better. gdk_x_error() is a static function so I'd need the gtk debuginfo installed for that breakpoint to work.

The program 'evolution-2.6' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadAlloc (insufficient resources for operation)'.
  (Details: serial 1489767 error_code 11 request_code 53 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
+ Trace 68200
Thread 4159016960 (LWP 8521)
More useful backtrace if I actually give it '--sync' as requested...

Breakpoint 2, 0x0f5f8b48 in exit () from /lib/libc.so.6
(gdb) bt
+ Trace 68201
(In reply to comment #6)
> #3 0x0f3612e4 in _XError () from /usr/lib/libX11.so.6
> #4 0x0f363a38 in _XReply () from /usr/lib/libX11.so.6
> #5 0x0f358a6c in XSync () from /usr/lib/libX11.so.6
> #6 0x0f358cb8 in XSetAfterFunction () from /usr/lib/libX11.so.6
> #7 0x0f330bd8 in XCreatePixmap () from /usr/lib/libX11.so.6
> #8 0x0eddd38c in cairo_xlib_surface_create_with_xrender_format ()
>    from /usr/lib/libcairo.so.2
> #9 0x0edcd06c in cairo_surface_destroy () from /usr/lib/libcairo.so.2
> #10 0x0edcd130 in cairo_surface_destroy () from /usr/lib/libcairo.so.2
> #11 0x0edbd098 in cairo_create () from /usr/lib/libcairo.so.2

Installing debuginfo for cairo and libX11 might clear up this part of the trace. BadAlloc on XCreatePixmap generally means you're asking to create something pathologically huge; it would be good to verify that here.
The program 'evolution-2.6' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadAlloc (insufficient resources for operation)'.
  (Details: serial 1521248 error_code 11 request_code 53 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
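One way to act on that suggestion is to log the requested pixmap size and classify it before the request goes out. The sketch below is not taken from cairo or Xlib. The function name, the enum and the 64 MiB budget are all hypothetical, and the byte count is only a rough estimate that ignores any padding the server applies. It relies on one protocol fact: CreatePixmap carries width and height as unsigned 16-bit fields, and zero is not a valid dimension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical budget: pick whatever counts as "pathologically huge"
 * for the server in question. */
#define PIXMAP_BYTE_BUDGET (64u * 1024u * 1024u)

enum pixmap_verdict { PIXMAP_OK, PIXMAP_BAD_DIMENSION, PIXMAP_TOO_LARGE };

/* Classify a pixmap request before it is sent (illustrative helper). */
static enum pixmap_verdict check_pixmap_request(long width, long height,
                                                unsigned depth_bits)
{
    assert(depth_bits >= 1 && depth_bits <= 32);

    /* Dimensions must be nonzero and fit in an unsigned 16-bit field. */
    if (width < 1 || width > UINT16_MAX || height < 1 || height > UINT16_MAX)
        return PIXMAP_BAD_DIMENSION;

    /* Rough size estimate: whole bytes per pixel, no row padding. */
    uint64_t bytes = (uint64_t)width * (uint64_t)height
                   * ((depth_bits + 7u) / 8u);
    return bytes > PIXMAP_BYTE_BUDGET ? PIXMAP_TOO_LARGE : PIXMAP_OK;
}
```

Calling such a check from a breakpoint or a temporary debugging printf just before the pixmap is created would show whether the request is out of range or merely large.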
+ Trace 68555
Thread 4158828544 (LWP 15073)
Hm, upon installing debuginfo for gtk, I find that there are three versions of the gtk2 package installed on each machine. Yum has been doing something very strange.

One has these:
gtk2-2.6.10-2.fc4.4.ppc
gtk2-2.8.17-1.fc5.1.ppc
gtk2-2.8.17-1.fc5.1.ppc64

Another has these:
gtk2-2.6.10-2.fc4.4.ppc
gtk2-2.8.15-1.ppc
gtk2-2.8.17-1.fc5.1.ppc

I'll fix that, mutter darkly at yum for a little while, and see if I can reproduce with gtk2-debuginfo installed.
Still happens with only the current gtk2 package installed. With gtk2-debuginfo, frames #14-#18 look like this...
+ Trace 68557
Running with this patch to see if it's enlightening, and will poke at it with gdb if/when it triggers....

--- widgets/table/e-table-item.c~	2006-05-03 15:02:45.000000000 +0100
+++ widgets/table/e-table-item.c	2006-06-01 15:59:16.000000000 +0100
@@ -2092,6 +2092,7 @@ find_cell (ETableItem *eti, double x, do
 		*view_row_res = eti->grabbed_row;
 		*x1_res = x - eti->x1 - e_table_header_col_diff (eti->header, 0, eti->grabbed_col);
 		*y1_res = y - eti->y1 - e_table_item_row_diff (eti, 0, eti->grabbed_row);
+		g_assert(eti->grabbed_col != -1);
 		return TRUE;
 	}
 
@@ -2138,6 +2139,7 @@ find_cell (ETableItem *eti, double x, do
 	*view_row_res = row;
 	if (y1_res)
 		*y1_res = y - y1;
+	g_assert(col != -1);
 	return TRUE;
 }
That doesn't trigger -- row and column are actually set to -1 by the call to e_table_item_get_cell_geometry(), from about line 515 of e-cell-tree.c. Adding a couple of debugging printfs around that call shows this...

Will get geometry. row 16471, view_col 4
motion. tmp_row -1, view_col -1, x 238, y 263536, width 78 height 16
Will get geometry. row 16471, view_col 4
motion. tmp_row -1, view_col -1, x 238, y 263536, width 78 height 16

Although row and col seem to be set to -1 _every_ time this debugging printf is triggered, it's the final one which crashes....

Will get geometry. row 2394, view_col 4
motion. tmp_row -1, view_col -1, x 238, y 38304, width 30 height 16
The program 'evolution-2.6' received an X Window System error.
This probably reflects a bug in the program.

I don't know if that '-1' is normal or if I'm chasing a red herring -- I know nothing about this code I'm poking at; I'm just having to do it for myself because nobody _else_ seems to care about Evolution crashing...
(In reply to comment #12)
> I don't know if that '-1' is normal or if I'm chasing a red herring -- I know
> nothing about this code I'm poking at; I'm just having to do it for myself
> because nobody _else_ seems to care about Evolution crashing...

I don't know if the '-1' is interesting at all or not. What is interesting is the following section of the stack trace, which shows what is causing the actual crash:
+ Trace 68604
I would be interested to see more details about what that computation looks like. In particular, what the clipping geometry going into this is (that is, which cairo calls GTK's draw_expander makes before cairo_clip, or else what the contents of 'traps' are in _cairo_clip_intersect_mask).

-Carl
I've worked out how to reproduce it reliably. I have to be viewing a folder in threaded mode, and it happens whenever I let the mouse move over the little triangle which allows me to expand or collapse a thread. It happens every time I do this in my fedora-devel-list folder, but not repeatably in the linux-kernel folder (although it _has_ happened there in the past).
There is strangeness here. When you hover over the thread 'expander' in about the first 1030-or-so rows of the folder index, the hollow triangle gets filled in to become a solid triangle. Between about rows 1031 and 1040 or so, when you hover over the 'expander', it draws that solid triangle in the _wrong_ place -- it's drawn too far down. That seems to be random -- sometimes for those rows it _is_ correct, and sometimes it's not. I haven't worked out why. Then for a while it's not drawn at all (or it's drawn _so_ far out of place that it's not actually visible). Then at row 2048 it starts to cause the crash, in my fedora-devel-list folder of 2461 messages. In the linux-kernel folder with 17110 messages, I can't trigger the crash at the moment.

I'm going to disable the call to draw_expander() and hopefully that'll 'fix' it.
Created attachment 66723
Disable the call to draw_expander()

This avoids the problematic code and seems to 'fix' the problem for me.
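The row-2048 threshold lines up with a simple piece of arithmetic, offered here as a possible reading and not as a confirmed diagnosis. Two assumptions go into it: rows are 16 pixels tall (the "height 16" in the debugging output earlier in this report), and somewhere the y coordinate passes through a signed 16-bit field, which is how the core X protocol encodes coordinates. The function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: row height taken from "height 16" in the debugging output. */
#define ROW_HEIGHT 16

/* Top edge of a row in list coordinates. */
static int32_t row_top(int32_t row)
{
    assert(row >= 0 && row <= INT32_MAX / ROW_HEIGHT);
    return row * ROW_HEIGHT;
}

/* Reduce a coordinate to the value a signed 16-bit field would hold,
 * using two's complement wrap-around. */
static int32_t wrap_to_int16(int32_t v)
{
    uint32_t low = (uint32_t)v & 0xFFFFu;
    return low >= 0x8000u ? (int32_t)low - 0x10000 : (int32_t)low;
}
```

Under these assumptions row 2047 is the last one that fits (y = 32752). Row 2048 has y = 32768, which wraps to -32768. Row 2394 from the crashing printf has y = 38304, which wraps to -27232, while row 16471 in the linux-kernel folder has y = 263536, which wraps to a harmless-looking 1392. This does not account for the misplaced drawing around row 1030.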
Actually, it fixes only the problem when I move the mouse over the expander. There's still a little animation if I actually _click_ on it, as it switches between its expanded and collapsed states. That animation also suffers the same problems -- it's either in the wrong place, or it causes the crash.
I can confirm this. It's reproducible on my Ubuntu installation (both Dapper with Evo 2.6 and Edgy with 2.7.4) on ppc. Evolution crashes with BadAlloc whenever I browse my debian-powerpc archives (>5000 mails) in threaded mode for a while, but it works well for other folders with a similar number of messages. I can attach a backtrace if you want (with debugging symbols installed, of course), but I'm not a coder and I can't interpret it.
Andrzej: can you attach a backtrace? Thanks in advance.
I'm not able to get the crash at all. I have a message list of 40K+ messages. But it behaves very oddly with respect to mouseover and animation. I guess some corruption there is leading to the X crash. I'm in favour of disabling the animation and mouseover if we aren't close to fixing this before the code freeze (STABLE/HEAD). Mails are more important than the animation, IMO. We can enable them again once we have fixed them properly. David, does disabling the mouseover completely solve one part of the problem for you?
How come I don't see this problem? Is it specific to a particular theme? Arch?
Fejj, I see it with SLED, with the default theme.
That's weird, I can't get any crash right now. I could yesterday, but today (26.VII.06) it just works without any problems (I changed nothing on my system). I'll let you know if threaded mode starts crashing again.
Ubuntu bug about that: https://launchpad.net/distros/ubuntu/+source/evolution/+bug/21582 "... http://librarian.launchpad.net/3008608/gdb-evolution.txt I just noticed that the problem still occurs on Dapper. I'm attaching a new backtrace, this time with debugging symbols. evolution version: 2.6.1-0ubuntu7 libcairo version: 1.0.4-0ubuntu1 ..."
Ok, my Evolution has started crashing once again (which is bad for me, but good for debugging). The backtrace is at: http://kelner.ath.cx/~kelner/evolutioncrash.txt

evolution 2.6.1-0ubuntu7
libcairo 1.0.4-0ubuntu1
Created attachment 70304
Backtrace with debugging symbols
I can confirm this with Evolution 2.8.1 running on a GNOME 2.16 desktop. I haven't seen a crash yet but the expander arrows are misbehaving in a manner similar to comment #15.
I haven't seen this for a while. Anything recently? If it still happens, I don't mind committing a reworked patch which disables this in a better way.
I dropped the patch from my own builds of Evolution a little while ago, to see if the crash still happened -- and it doesn't. The underlying problem still occurs though -- other than in the first thousand or so mails, the animation doesn't show. And in a small range (probably still around row 1030 but I haven't counted this time), it's drawn in the wrong place. If allowed to persist, this will probably start causing crashes again some time, on some platform.
David, good that the crash no longer happens for you. I think I will comment out the animation if we get any crashers beyond this point. Thanks for your support.
Matthias Clasen provided this clue in the downstream bug [1], but I've yet to acquire the expertise to investigate the issue further:

What's happening here is that this is one of the few places where ETable uses gtk_paint functions to draw on a big window (i.e. one which extends beyond the X 16-bit limitation). It works fine when done inside an expose handler, since GTK calls gdk_begin_paint for you which sets up some device offset magic to draw to a small pixmap and copy the result back (of course, this fails when drawing large primitives, but that is not the issue here). Apparently the drawing on the GnomeCanvas that ETable does is not inside a gdk_begin_paint/gdk_end_paint pair.

[1] http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=187565
Any updates here? This looks like a gtk+ issue. There are some relevant traces in bug 379950.
Not a GTK issue, it's an ETable issue. I've not made any progress on this lately, but I believe David has confirmed that the crashes have gone away and we're just left with the drawing glitch.
Yeah, I don't see the crash any more -- but still the animations only work in the top part of the mail folder. After that they break.
Please close this bug as obsolete.