GNOME Bugzilla – Bug 599574
Crash in _cairo_surface_set_error at cairo-surface.c line 128
Last modified: 2014-12-22 18:18:35 UTC
What were you doing when the application crashed? switching folders to view Distribution: Debian squeeze/sid Gnome Release: 2.28.0 2009-09-27 (Debian) BugBuddy Version: 2.28.0 System: Linux 2.6.32-rc3-wl-39587-g329c2e6-dirty #94 SMP PREEMPT Fri Oct 9 11:23:21 CEST 2009 x86_64 X Vendor: The X.Org Foundation X Vendor Release: 10605000 Selinux: No Accessibility: Disabled GTK+ Theme: Simple Icon Theme: Mist GTK+ Modules: globalmenu-gnome, gnomebreakpad, canberra-gtk-module Memory status: size: 1098719232 vsize: 1098719232 resident: 420450304 share: 25333760 rss: 420450304 rss_rlim: 18446744073709551615 CPU usage: start_time: 1256493823 rtime: 2804 utime: 2094 stime: 710 cutime:9 cstime: 10 timeout: 0 it_real_value: 0 frequency: 100 Backtrace was generated from '/usr/bin/evolution' [Thread debugging using libthread_db enabled] [New Thread 0x7f09bfdfc910 (LWP 25693)] [New Thread 0x7f09b7fff910 (LWP 25692)] [New Thread 0x7f09c72cd910 (LWP 25414)] [New Thread 0x7f09c4ea8910 (LWP 25413)] [New Thread 0x7f09c60bf910 (LWP 25412)] [New Thread 0x7f09c68c0910 (LWP 25410)] [New Thread 0x7f09c84f8910 (LWP 25407)] [New Thread 0x7f09c8cf9910 (LWP 25406)] 0x00007f09de11852d in __libc_waitpid (pid=26563, stat_loc=<value optimized out>, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:41 in ../sysdeps/unix/sysv/linux/waitpid.c
+ Trace 218559
Thread 1 (Thread 0x7f09e31bb810 (LWP 25398))
Inferior 1 [process 25398] will be detached. Quit anyway? (y or n) [answered Y; input not from terminal] ---- Critical and fatal warnings logged during execution ---- ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ** evolution **: atk_object_set_name: assertion `name != NULL' failed ----------- .xsession-errors (15653 sec old) --------------------- (evolution:13556): camel-WARNING **: Error storing 'jberg1X@imapmail.intel.com:Journal': Host lookup failed: imapmail.intel.com: Name or service not known (evolution:13556): camel-WARNING **: Error storing 'jberg1X@imapmail.intel.com:Junk E-mail': Host lookup failed: imapmail.intel.com: Name or service not known (evolution:13556): camel-WARNING **: Error storing 'jberg1X@imapmail.intel.com:Managed Folders': Host lookup failed: imapmail.intel.com: Name or service not known (evolution:13556): camel-WARNING **: Error storing 'jberg1X@imapmail.intel.com:Managed Folders/Business Important': Host lookup failed: imapmail.intel.com: Name or service not known (evolution:13556): camel-WARNING **: Error storing 
'jberg1X@imapmail.intel.com:Notes': Host lookup failed: imapmail.intel.com: Name or service not known (evolution:13556): camel-WARNING **: Error storing 'jberg1X@imapmail.intel.com:Outbox': Host lookup failed: imapmail.intel.com: Name or service not known (evolution:13556): camel-WARNING **: Error storing 'jberg1X@imapmail.intel.com:Quarantine': Host lookup failed: imapmail.intel.com: Name or service not known ...Too much output, ignoring rest... --------------------------------------------------
Looks like a bug in cairo/gdk I guess...
Here's a trace with cairo dbg info installed. System: Linux 2.6.32-rc3-wl-39587-g329c2e6-dirty #94 SMP PREEMPT Fri Oct 9 11:23:21 CEST 2009 x86_64 X Vendor: The X.Org Foundation X Vendor Release: 10605000 Selinux: No Accessibility: Disabled GTK+ Theme: Simple Icon Theme: Mist GTK+ Modules: globalmenu-gnome, gnomebreakpad, canberra-gtk-module Memory status: size: 847548416 vsize: 847548416 resident: 175943680 share: 24903680 rss: 175943680 rss_rlim: 18446744073709551615 CPU usage: start_time: 1256494523 rtime: 1122 utime: 773 stime: 349 cutime:0 cstime: 0 timeout: 0 it_real_value: 0 frequency: 100 Backtrace was generated from '/usr/bin/evolution' [Thread debugging using libthread_db enabled] [New Thread 0x7f1b5ae14910 (LWP 27007)] [New Thread 0x7f1b5c019910 (LWP 26997)] [New Thread 0x7f1b5d9b4910 (LWP 26996)] [New Thread 0x7f1b65805910 (LWP 26814)] [New Thread 0x7f1b5e1b5910 (LWP 26801)] [New Thread 0x7f1b5f7fe910 (LWP 26800)] [New Thread 0x7f1b5effd910 (LWP 26799)] [New Thread 0x7f1b5ffff910 (LWP 26797)] [New Thread 0x7f1b6622f910 (LWP 26794)] [New Thread 0x7f1b66a30910 (LWP 26793)] 0x00007f1b7be4f52d in __libc_waitpid (pid=27008, stat_loc=<value optimized out>, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:41 in ../sysdeps/unix/sysv/linux/waitpid.c
+ Trace 218560
Thread 1 (Thread 0x7f1b80ef2810 (LWP 26787))
Inferior 1 [process 26787] will be detached. Quit anyway? (y or n) [answered Y; input not from terminal]
*** Bug 604831 has been marked as a duplicate of this bug. ***
*** Bug 604283 has been marked as a duplicate of this bug. ***
I just had this pointed out to me on #cairo: this can only happen if the pointer passed to _cairo_surface_set_error() is not a cairo surface. So something fishy is going on in the lower layers and an invalid pointer is passed to cairo_xlib_surface_set_size(). So this is definitely not a Cairo bug.
Got a similar stacktrace with evolution 2.30.0 / gtk+ 2.20 / cairo 1.9.6. Reassigning to gtk+, following a hint from Company (I thought cairo was suspicious). It could be a client-side window issue (trying to resize something it shouldn't). (gdb) thread apply all bt
+ Trace 221305
Thread 1 (Thread 0xb65397e0 (LWP 24417))
got another similar crash in evo 2.30.0 / gtk+ 2.20 / cairo 1.9.6 : (gdb) p *surface $1 = {backend = 0x0, device = 0x0, type = CAIRO_SURFACE_TYPE_IMAGE, content = CAIRO_CONTENT_COLOR, ref_count = {ref_count = -1}, status = CAIRO_STATUS_INVALID_SIZE, unique_id = 0, finished = 0, is_clear = 1, has_font_options = 0, user_data = {size = 0, num_elements = 0, element_size = 0, elements = 0x0, is_snapshot = 0}, mime_data = {size = 0, num_elements = 0, element_size = 0, elements = 0x0, is_snapshot = 0}, device_transform = {xx = 1, yx = 0, xy = 0, yy = 1, x0 = 0, y0 = 0}, device_transform_inverse = {xx = 1, yx = 0, xy = 0, yy = 1, x0 = 0, y0 = 0}, x_resolution = 0, y_resolution = 0, x_fallback_resolution = 0, y_fallback_resolution = 0, snapshot_of = 0x0, snapshot_detach = 0, snapshots = {size = 0, num_elements = 0, element_size = 0, elements = 0x0, is_snapshot = 0}, font_options = {antialias = CAIRO_ANTIALIAS_DEFAULT, subpixel_order = CAIRO_SUBPIXEL_ORDER_DEFAULT, hint_style = CAIRO_HINT_STYLE_DEFAULT, hint_metrics = CAIRO_HINT_METRICS_DEFAULT}}
The surface is in CAIRO_STATUS_INVALID_SIZE state, so it was probably created by _cairo_surface_create_in_error (cairo_status_t status) and is a pointer to _cairo_surface_nil_invalid_format, which is a constant defined by cairo. _cairo_surface_set_error() tries to write to it and segfaults because it's read-only (at least that's the only explanation I could find). evolution is at fault for trying to do something with a surface in error state, but I think cairo tries to handle this kind of attempt as a no-op (instead of crashing). openshot + cairo 1.9.x reliably reproduce "this" bug (that is, similar backtrace, and surface in exactly the same state), so I can do some testing if that's needed (see https://qa.mandriva.com/show_bug.cgi?id=57815). If I'm mistaken and these 2 bugs are different, then sorry for the noise :)
You are right. The crash should be fixed in upstream http://cgit.freedesktop.org/cairo/commit/?id=005596907fc9b62fa4bf72ec35e0d1a1a242ef93 (1.9 branch, not sure if it applies without modifications to 1.8). I'm leaving this open because I'm not sure if this hides a problem in gdk.
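To make the failure mode from the last two comments concrete, here is a minimal self-contained C model: a constructor that hands back a shared read-only error object instead of NULL, and a setter that checks the status before it writes anything. Every name here (demo_surface_t, demo_surface_create, demo_surface_set_size and so on) is hypothetical. This is neither cairo's code nor the upstream fix, only a sketch of the pattern under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not cairo's types. */
typedef enum {
    DEMO_STATUS_SUCCESS = 0,
    DEMO_STATUS_INVALID_SIZE
} demo_status_t;

typedef struct {
    demo_status_t status;
    int width;
    int height;
} demo_surface_t;

/* Shared error object. Being const, it may be placed in read-only memory,
 * so any write through a pointer to it can fault. */
static const demo_surface_t demo_nil_invalid_size = {
    DEMO_STATUS_INVALID_SIZE, 0, 0
};

/* Constructor that returns the shared error object instead of NULL. */
demo_surface_t *demo_surface_create(demo_surface_t *storage, int width, int height)
{
    assert(storage != NULL);
    if (width <= 0 || height <= 0)
        return (demo_surface_t *) &demo_nil_invalid_size;
    storage->status = DEMO_STATUS_SUCCESS;
    storage->width = width;
    storage->height = height;
    return storage;
}

/* Nonzero if s is the shared error object. */
int demo_surface_is_nil(const demo_surface_t *s)
{
    return s == &demo_nil_invalid_size;
}

/* Guarded setter: bail out on any errored surface before writing. */
demo_status_t demo_surface_set_size(demo_surface_t *s, int width, int height)
{
    if (s->status != DEMO_STATUS_SUCCESS)
        return s->status;   /* no-op: never writes to the const object */
    if (width <= 0 || height <= 0) {
        /* s is writable here: the shared error object never has SUCCESS. */
        s->status = DEMO_STATUS_INVALID_SIZE;
        return s->status;
    }
    s->width = width;
    s->height = height;
    return DEMO_STATUS_SUCCESS;
}
```

Without the early return in demo_surface_set_size, a resize on the object returned for a 0x0 request would attempt to store into the const object, which is the kind of write the comment above suspects.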
*** Bug 616563 has been marked as a duplicate of this bug. ***
*** Bug 616676 has been marked as a duplicate of this bug. ***
*** Bug 617086 has been marked as a duplicate of this bug. ***
*** Bug 617758 has been marked as a duplicate of this bug. ***
*** Bug 618465 has been marked as a duplicate of this bug. ***
Running Evolution on MeeGo (gtk+ 2.20) for any length of time yields this exciting crasher (appended). It looks very much as if we are missing a call to gdk_window_ensure_native somewhere (perhaps around the canvas, no idea).

Presumably though, the code calling gdk_window_move_resize_internal to recompute_visible_regions_internal that calls:

    if (private->cairo_surface) {
        ...
        _gdk_windowing_set_cairo_surface_size (private->cairo_surface, width, height);

which calls:

    void
    _gdk_windowing_set_cairo_surface_size (cairo_surface_t *surface, int width, int height)
    {
        cairo_xlib_surface_set_size (surface, width, height);
    }

is the ultimate cause of the problem here. Presumably we are missing some sort of gdk_window_ensure_native (window); call at some level here (?).

trace:
+ Trace 221905
$2 = 0x85c1000 [GdkWindow] (gdb) p *$2 $3 = {parent_instance = {parent_instance = {g_type_instance = {g_class = 0x8052938}, ref_count = 2, qdata = 0x8ecf540}}, impl = 0x852ad90 [GdkWindowImplX11], parent = 0x85bef38 [GdkWindow], user_data = 0x84fd1f8, x = 0, y = -13, extension_events = 0, filters = 0x0, children = 0x0, bg_color = {pixel = 15658220, red = 61166, green = 60652, blue = 60652}, bg_pixmap = 0x2, paint_stack = 0x0, update_area = 0x0, update_freeze_count = 0, window_type = 2 '\002', depth = 24 '\030', resize_count = 0 '\000', state = 0, guffaw_gravity = 0, input_only = 0, modal_hint = 0, composited = 0, destroyed = 0, accept_focus = 1, focus_on_map = 1, shaped = 0, event_mask = 2162454, update_and_descendants_freeze_count = 0, redirect = 0x0, impl_window = 0x83c02a0 [GdkWindow], abs_x = 281, abs_y = 106, width = 714, height = 125183, clip_tag = 2084, clip_region = 0x8958220, clip_region_with_children = 0x8945460, cursor = 0x0, toplevel_window_type = -1 '\377', synthesize_crossing_event_queued = 0, effective_visibility = 1, visibility = 1, native_visibility = 0, viewable = 1, applied_shape = 0, num_offscreen_children = 0, implicit_paint = 0x0, input_window = 0x0, outstanding_moves = 0x0, shape = 0x0, input_shape = 0x0, cairo_surface = 0x4cebed80, outstanding_surfaces = 0} HTH.
Comment 8 outlines the problem pretty well already. What you will likely find is that GDK is creating a 0x0 window somewhere. As that's an invalid size, an error surface is returned. And then GDK continues to use this as if it were a real surface, even though it's broken. If you want to find the actual place where this happens, setting a breakpoint on _cairo_error() should help.
*** Bug 618658 has been marked as a duplicate of this bug. ***
Michael: You shouldn't really ever have to call gdk_window_ensure_native() to avoid a crash like that. As soon as you get the xid of the GdkWindow or otherwise do some native-only thing on it gdk will automatically convert it to native. gdk_window_ensure_native() is only needed when you rely on some native window semantics that gdk can't really know of (like some other app finding the window in the tree or something). And, anyway, even for non-native windows private->cairo_surface is an xlib surface, it just points to a pixmap drawable rather than a window drawable.
*** Bug 619327 has been marked as a duplicate of this bug. ***
The issue is that various cairo_*_create functions return pointers to static read-only surfaces with an error (that should be checked with cairo_surface_status()). The first patch indents gdkdrawable-x11.c incidentally. The second adds/corrects error checking for the cairo functions. This stops things crashing, but does not address the reason for the functions failing. (In my case it seems to be because a cairo surface is a large multiple of the screen height, and hits an internal limit.)
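For the size-limit case, a pre-check along these lines shows what "hits an internal limit" could mean in practice. It is a sketch only: the helper name demo_size_fits_x11 is hypothetical, and the 32767 ceiling is an assumption based on X11 carrying geometry in 16-bit fields, not a value taken from this report. The numbers in the usage note come from the GdkWindow dump earlier in this bug (width = 714, height = 125183).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed ceiling for a drawable dimension (16-bit signed maximum).
 * This is an illustration value, not one quoted from cairo or GDK. */
#define DEMO_X11_COORD_MAX INT16_MAX

/* Hypothetical helper: true if both dimensions are positive and do not
 * exceed the assumed ceiling. */
bool demo_size_fits_x11(int width, int height)
{
    return width > 0 && height > 0 &&
           width <= DEMO_X11_COORD_MAX && height <= DEMO_X11_COORD_MAX;
}
```

With these assumptions, demo_size_fits_x11(714, 125183) is false, so a window of that size would be rejected before any surface is created or resized, while a 0x0 request is rejected for the opposite reason.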
Created attachment 161753 [details] [review] [PATCH 1/2] GNU-style indent for gdk/x11/gdkdrawable-x11.c
Created attachment 161754 [details] [review] [PATCH 2/2] Convert Cairo error object convention to NULL on error.
That is, return pointers to static read-only surfaces instead of NULL on error.
That patch is completely unnecessary, because it works around the very well functioning error handling mechanism in Cairo - returning the error surfaces. And calling any Cairo function on an object in an error status is defined to be a no-op. The idea behind this API design is of course to let the developer check errors when it is convenient to him instead of forcing him to check for errors all the time - note that even functions like cairo_pattern_add_color_stop_rgb() can put the pattern into an error state.
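The "check errors when it is convenient" design described here can be sketched in plain C as a sticky status: once an object is in error, every later operation on it is a no-op, and the caller inspects the status once at the end. The demo_path_* names and the fixed capacity are hypothetical and only model the idea; none of this is cairo API.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PATH_CAPACITY 4

typedef enum {
    DEMO_PATH_OK = 0,
    DEMO_PATH_NO_SPACE
} demo_path_status_t;

typedef struct {
    demo_path_status_t status;   /* sticky: never goes back to OK */
    size_t count;
    double x[DEMO_PATH_CAPACITY];
    double y[DEMO_PATH_CAPACITY];
} demo_path_t;

void demo_path_init(demo_path_t *p)
{
    assert(p != NULL);
    p->status = DEMO_PATH_OK;
    p->count = 0;
}

/* Appends a point, or does nothing if the path is already in error. */
void demo_path_line_to(demo_path_t *p, double x, double y)
{
    if (p->status != DEMO_PATH_OK)
        return;
    if (p->count == DEMO_PATH_CAPACITY) {
        p->status = DEMO_PATH_NO_SPACE;
        return;
    }
    p->x[p->count] = x;
    p->y[p->count] = y;
    p->count++;
}

/* Builds a polyline without checking each call; the status is read once. */
demo_path_status_t demo_path_build(demo_path_t *p, const double *xs,
                                   const double *ys, size_t n)
{
    size_t i;

    demo_path_init(p);
    for (i = 0; i < n; i++)
        demo_path_line_to(p, xs[i], ys[i]);
    return p->status;
}
```

The trade-off is the one argued over in the following comments: the call sites stay short, but a failure is only noticed wherever the caller chooses to look at the status.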
Er, well, that's patently bollocks, because not checking the error and continuing to use the surface causes cairo to segv.
See comment 9: There was a bug in cairo that caused one single function to crash. That bug has been fixed since.
That bug is not related to what I am seeing.
On a slightly different note, as comment 9 does point out, even if cairo can carry on without the caller doing explicit error checking, is that even a sensible thing to do? Surely we'll want to know if things aren't getting rendered...
Yes, it is a sensible thing to do because it frees you of the burden to do error checking. You are free to do it whenever you want to. Which is the reason why this bug is still open: We need to figure out why this problem happens in the first place. Investigating where GTK tries to create a far too large window like you pointed out in comment 20 sounds like the approach we'd want to take. That said, if you know a good place to check the error status and do something useful in that case, that'd be a very useful addition, too.
*** Bug 619013 has been marked as a duplicate of this bug. ***
*** Bug 619876 has been marked as a duplicate of this bug. ***
*** Bug 620707 has been marked as a duplicate of this bug. ***
*** Bug 621231 has been marked as a duplicate of this bug. ***
*** Bug 621569 has been marked as a duplicate of this bug. ***
*** Bug 622258 has been marked as a duplicate of this bug. ***
Created attachment 166153 [details] [review] x11: Query size on real drawable The X11 drawable does not have a clue about the real size of the surface. This might also be the cause for:
*** Bug 617275 has been marked as a duplicate of this bug. ***
*** Bug 616667 has been marked as a duplicate of this bug. ***
*** Bug 624081 has been marked as a duplicate of this bug. ***
Comment on attachment 166153 [details] [review] x11: Query size on real drawable the code has changed entirely in the meantime.
Let's assume this is fixed.