GNOME Bugzilla – Bug 592100
Cheese crashes when taking a photo or recording a video in fullscreen
Last modified: 2010-03-28 08:03:52 UTC
Version: 2.26.3

What were you doing when the application crashed?
* I opened cheese.
* I made cheese full screen.
* It crashed.

I have a separate issue where cheese seems to display only the first image when it boots and doesn't update the preview after opening, but I can still take photos.

Does full screen have issues with dual monitors? That's what I'm using at least. I'll try to update this with a test of it without dual monitors.

Distribution: Fedora release 11 (Leonidas)
Gnome Release: 2.26.3 2009-07-07 (Red Hat, Inc)
BugBuddy Version: 2.26.0

System: Linux 2.6.29.6-217.2.7.fc11.i686.PAE #1 SMP Fri Aug 14 20:52:46 EDT 2009 i686
X Vendor: The X.Org Foundation
X Vendor Release: 10601901
Selinux: Enforcing
Accessibility: Enabled
GTK+ Theme: Glider
Icon Theme: gnome
GTK+ Modules: canberra-gtk-module, globalmenu-gnome, pk-gtk-module, gail:atk-bridge, gnomebreakpad

Memory status: size: 230653952 vsize: 230653952 resident: 29224960 share: 20140032 rss: 29224960 rss_rlim: 18446744073709551615
CPU usage: start_time: 1250522749 rtime: 653 utime: 623 stime: 30 cutime:0 cstime: 0 timeout: 0 it_real_value: 0 frequency: 100

Backtrace was generated from '/usr/bin/cheese'

[Thread debugging using libthread_db enabled]
[New Thread 0xb55feb70 (LWP 4794)]
[New Thread 0xb5fffb70 (LWP 4789)]
[New Thread 0xaeb8bb70 (LWP 4788)]
[New Thread 0xaf5fdb70 (LWP 4787)]
[New Thread 0xafffeb70 (LWP 4779)]
[New Thread 0xb61dbb70 (LWP 4775)]
0x00a29424 in __kernel_vsyscall ()
+ Trace 216969
Thread 1 (Thread 0xb7f2e770 (LWP 4768))
---- Critical and fatal warnings logged during execution ----

** Gdk **: gdk_x11_atom_to_xatom_for_display: assertion `atom != GDK_NONE' failed

Output of custom script "/usr/libexec/cheese/cheese-bugreport.sh":
Cheese log:

----------- .xsession-errors (1047 sec old) ---------------------
Nautilus-Share-Message: REFRESHING SHARES
Nautilus-Share-Message: ------------------------------------------
Nautilus-Share-Message: spawn arg "net"
Nautilus-Share-Message: spawn arg "usershare"
Nautilus-Share-Message: spawn arg "info"
Nautilus-Share-Message: end of spawn args; SPAWNING
Nautilus-Share-Message: returned from spawn: SUCCESS:
Nautilus-Share-Message: exit code 255
Nautilus-Share-Message: ------------------------------------------
Nautilus-Share-Message: Called "net usershare info" but it failed: 'net usershare' returned error 255: net usershare: usershares are currently disabled
[DEBUG]: EnableDisable Called: enabling... True
[DEBUG]: Binding key '<Alt>F12' for '/apps/tomboy/global_keybindings/show_note_menu'
[DEBUG]: Binding key '<Alt>F11' for '/apps/tomboy/global_keybindings/open_start_here'
--------------------------------------------------
Yah, disabling dual screen doesn't change anything. IBM Thinkpad x41, with 00:02.0 VGA compatible controller: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03)
Hi, I've tested it by running cheese on Ubuntu (Acer Aspire One) and Fedora 11 (Eee PC), and I couldn't reproduce it.
Definitely a PulseAudio-related issue. It happens whenever audio is played or recorded in fullscreen mode: both with the shutter sound event on "take photo" and with the audio recording in "start recording". I can reproduce it here, and it goes away if I disable the audio device from the audio capplet.
*** Bug 594108 has been marked as a duplicate of this bug. ***
Looks like a libcanberra/pulseaudio-libs bug
Which version of libcanberra is this? Could anyone get me a full bt? Is this rawhide?
FWIW, the trace in the duplicate bug (http://bugzilla-attachments.gnome.org/attachment.cgi?id=142456) comes from an abort in the same function (pa_tls_set) but, unlike this one, doesn't seem to contain anything related to libcanberra. I reproduced it on Fedora 11; I will switch to rawhide in the next few days and try to provide a better trace.
I'm not sure what exactly you need, but here's a backtrace from GDB with some checking of various values at the time it crashed, and a 'thread apply all bt'.

=================================================
Assertion 'pthread_setspecific(t->key, userdata) == 0' failed at pulsecore/thread-posix.c:200, function pa_tls_set(). Aborting.

Program received signal SIGABRT, Aborted.
0x00b0d424 in __kernel_vsyscall ()
(gdb) bt
+ Trace 217776
Thread 15 (Thread 0xaeba0b70 (LWP 24347))
Thread 14 (Thread 0xb6efab70 (LWP 24346))
Hmm, pthread_setspecific() failing is something that simply cannot happen. This is really weird, and I wonder if PA is actually the culprit here...
Indeed.

I tried again, breaking on pa_tls_set, and then stepped through pthread_setspecific(), and it gets to the "return 0;" at the end. (Perhaps my earlier gdb session that claimed a return value of 22 was due to trying to setspecific a second time after it was already set?)

In theory, pa_assert_se() is failing despite getting 0 == 0? I don't understand :(

Here's the excerpt from this latest gdb session, though I don't think it adds much:

Breakpoint 1, pa_tls_set (t=0x81c52b0, userdata=0x83eec98) at pulsecore/thread-posix.c:199
199 r = pthread_getspecific(t->key);
(gdb) next
200 pa_assert_se(pthread_setspecific(t->key, userdata) == 0);
(gdb) step
__pthread_setspecific (key=5, value=0x83eec98) at pthread_setspecific.c:36
36 self = THREAD_SELF;
(gdb) step
40 if (__builtin_expect (key < PTHREAD_KEY_2NDLEVEL_SIZE, 1))
(gdb)
29 {
(gdb)
40 if (__builtin_expect (key < PTHREAD_KEY_2NDLEVEL_SIZE, 1))
(gdb)
43 if (KEY_UNUSED ((seq = __pthread_keys[key].seq)))
(gdb)
93 return 0;
(gdb)
94 }
(gdb)
pa_log_level_meta (level=PA_LOG_ERROR, file=0x60998df "pulsecore/thread-posix.c", line=200, func=0x60999af "pa_tls_set", format=0x60998fc "Assertion '%s' failed at %s:%u, function %s(). Aborting.") at pulsecore/log.c:409
409 va_start(ap, format);
(gdb) l
Created attachment 143763 [details] Testcase testing output of macros

I wrote a small test checking the outputs of PA_UNLIKELY, __builtin_expect and pa_assert_se given that 0 should be being returned, and I get what we would want (sadly?)

-----------------------------------
PA_UNLIKELY (returns_zero() == 0) returned 1
__builtin_expect (!! (returns_zero() == 0), 0); returned 1
pa_assert_se (returns_zero() == 0); survived assertion
pa_assert_se (returns_zero() != 0); /* should die here */
in pa_log_levelv_meta
Aborted
-----------------------------------
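For readers without access to the attachment, a check of this kind might look roughly like the sketch below. This is not the attached testcase: `SKETCH_UNLIKELY` and `sketch_assert_se` are hypothetical stand-ins written for illustration (an "unlikely" branch hint and an assertion macro that always evaluates its expression), and `__builtin_expect` assumes GCC or Clang.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a branch-prediction hint macro.
 * The !! normalizes any non-zero value to 1 before the hint is applied. */
#define SKETCH_UNLIKELY(x) (__builtin_expect(!!(x), 0))

/* Hypothetical stand-in for an assertion that always evaluates its
 * expression (side effects included) and aborts if it is false. */
#define sketch_assert_se(expr)                                        \
    do {                                                              \
        if (SKETCH_UNLIKELY(!(expr))) {                               \
            fprintf(stderr, "Assertion '%s' failed at %s:%d\n",       \
                    #expr, __FILE__, __LINE__);                       \
            abort();                                                  \
        }                                                             \
    } while (0)

/* Plays the role of a call that reports success by returning 0. */
static int returns_zero(void)
{
    return 0;
}

/* Value of the hint macro applied to a true comparison. */
long unlikely_result(void)
{
    return SKETCH_UNLIKELY(returns_zero() == 0);
}

/* Value of the raw builtin applied to the same comparison. */
long builtin_expect_result(void)
{
    return __builtin_expect(!!(returns_zero() == 0), 0);
}

/* Returns 1 only if the assertion on a true expression did not abort. */
int assert_survives(void)
{
    sketch_assert_se(returns_zero() == 0);
    return 1;
}
```

The hint only affects code layout, not the value of the expression, so a macro of this shape should not turn a successful return into a failed assertion; that points the investigation back at the return value itself.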
(In reply to comment #10)
> Indeed.
>
> I tried again, breaking on pa_tls_set, and then stepped through
> pthread_setspecific(), and it gets to the "return 0;" at the end.

This might simply be misleading because you have optimizations enabled. When you gdb through this make sure you compiled your glibc with -O0.
*** Bug 613419 has been marked as a duplicate of this bug. ***
Lennart, looking at the similar Empathy bug on bugzilla.redhat.com, https://bugzilla.redhat.com/show_bug.cgi?id=532307, I see you managed to fix it by removing extra xmlCleanupParser calls. We don't make such calls, yet we have a quite similar and easily reproducible bug. Do you have any hint about the possible cause?
Just reproduced it with latest cheese, Fedora 12, pulseaudio 0.9.21.

Steps to reproduce:
- enable windows and buttons sounds from gnome-volume-control
- start cheese
- press F11

It happens *every time*.
Could be something similar to the xmlCleanup thing indeed. Might be worth gdb'ing through cheese and setting a breakpoint on pthread_setspecific() as well as pthread_key_delete() to see if something erroneously deletes PA's TLS var. That's how the empathy issue got tracked down.
(In reply to comment #16)
> Could be something similar to the xmlCleanup thing indeed. Might be worth
> gdb'ing through cheese and setting a breakpoint on pthread_setspecific() as
> well as pthread_key_delete() to see if something erroneously deletes PA's TLS
> var. That's how the empathy issue got tracked down.

BINGO! Breaking on pthread_key_delete was just the right hint!

There is indeed an xmlCleanup call that sneaks in when we call rsvg_term() in cheese-countdown.c (cheese_create_surface_from_svg); commenting that out makes the crash go away.

Now I just have to understand how the countdown widget works (honestly I never even opened cheese-countdown.c before...) and see if I can get rid of that call. It seems that it creates the surface on style-set, and style-set is for some reason emitted when going fullscreen.

Thank you for your precious hint.
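The failure mode being chased here (one piece of code deleting a TLS key that another library still relies on) can be sketched in isolation. This is a minimal illustration, not PulseAudio or libxml2 code: `lib_key`, `lib_init`, `lib_tls_set`, `rogue_cleanup` and `demo` are hypothetical names. POSIX leaves the use of a deleted key undefined; the assumption here is glibc, where pthread_setspecific() is expected to return EINVAL (22) for such a key, which would be consistent with the return value of 22 mentioned earlier in this thread.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical library that keeps per-thread state behind a TLS key. */
static pthread_key_t lib_key;

/* Create the key once; returns 0 on success. */
int lib_init(void)
{
    return pthread_key_create(&lib_key, NULL);
}

/* Store per-thread data; the library expects this to return 0. */
int lib_tls_set(void *userdata)
{
    return pthread_setspecific(lib_key, userdata);
}

/* Hypothetical unrelated cleanup code that deletes a key it does not own. */
int rogue_cleanup(void)
{
    return pthread_key_delete(lib_key);
}

/* Returns what pthread_setspecific() reports after the key was deleted. */
int demo(void)
{
    static int value = 42;
    int r;

    if (lib_init() != 0)
        return -1;

    r = lib_tls_set(&value);
    printf("before delete: pthread_setspecific returned %d\n", r);

    rogue_cleanup();

    /* Behaviour is undefined by POSIX; glibc is assumed to return EINVAL. */
    r = lib_tls_set(&value);
    printf("after delete:  pthread_setspecific returned %d%s\n",
           r, r == EINVAL ? " (EINVAL)" : "");
    return r;
}
```

Calling demo() from a small main() and building with `gcc -pthread` should be enough to try it. An assertion of the form "setspecific == 0" placed around the second call would abort, much like the one reported in pa_tls_set().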
you probably should just drop the invocation of rsvg_term(). (And of course prepare a patch for librsvg that documents that this function should not be called unless followed immediately by an exit())
Meh, a Google Code Search for rsvg_term() tells me there's a bunch of other projects misusing that call the same way as xmlCleanup().
(In reply to comment #18)
> you probably should just drop the invocation of rsvg_term().

Removing rsvg_init() and rsvg_term() completely doesn't seem to cause any harm. I wonder if they are really mandatory. I guess I'll move them into main() together with the other _init functions (g_thread_init, gst_init, etc.).
Created attachment 156945 [details] [review] countdown: move rsvg_init and rsvg_term into main

Initialize rsvg at startup and clean it up at exit. rsvg_term is particularly subtle as it calls xmlCleanupParser(), triggering nasty crashes (e.g. with PulseAudio) in multithreaded applications. See http://0pointer.de/blog/projects/beware-of-rsvg-term.html for more info.

Rsvg loading seems to work even without these functions, so I'm not sure it's worth keeping them.
Filippo, this needs to go in before 2.30.0. I requested a freeze break for it.
(In reply to comment #22)
> Filippo, this needs to go in before 2.30.0.
>
> I requested a freeze break for it.

Sure, I already requested a freeze break this morning ;) And also obtained approval. Will commit after lunch probably.
Attachment 156945 [details] pushed as f3d79dd - countdown: move rsvg_init and rsvg_term into main
This crash was reported by several people in Fedora 12. I (perhaps wrongly) blamed it on librsvg2 for calling xmlCleanupParser(): https://bugzilla.redhat.com/show_bug.cgi?id=542277#c5
(In reply to comment #25)
> This crash was reported by several people in Fedora 12. I (perhaps wrongly)
> blamed it on librsvg2 for calling xmlCleanupParser():
> https://bugzilla.redhat.com/show_bug.cgi?id=542277#c5

I guess it's legitimate for rsvg_term to call xmlCleanupParser; they should just put a big warning in the docs. I opened a bug for this, bug 614157, though I'm not sure librsvg is actively maintained.