GNOME Bugzilla – Bug 74311
crash on startup/background repaint
Last modified: 2006-12-26 18:25:49 UTC
Package: nautilus
Severity: normal
Version: 1.1.9
Synopsis: crash closing a window
Bugzilla-Product: nautilus
Bugzilla-Component: general
BugBuddy-GnomeVersion: 2.0 (1.112.1)

Description:
Description of Problem:
i logged in, and nautilus started to draw the desktop and popped up a window list. i closed the window list (it hadn't finished drawing / loading the directory) and got the X error.

Debugging Information:

[New Thread 1024 (LWP 11223)]
[New Thread 2049 (LWP 11241)]
[New Thread 1026 (LWP 11242)]
[New Thread 2051 (LWP 11243)]
0x40955e29 in __wait4 () from /lib/libc.so.6
Thread 1 (Thread 1024 (LWP 11223))
------- Bug moved to this database by unknown@bugzilla.gnome.org 2002-03-11 22:56 ------- Unknown version 1.1.x in product nautilus. Setting version to the default, "unspecified". Reassigning to the default owner of the component, nautilus-maint@bugzilla.gnome.org.
There are a couple more of these from tonight. Dups coming...
*** Bug 74449 has been marked as a duplicate of this bug. ***
Michael: cc'ing you because of the bonobo bits in the trace. Relevant?
Ok; so the bonobo bit in the trace is irrelevant - it happens with any X error, and we chain back to the gdk handler since this is not an error Bonobo is interested in trapping.

It _looks_ to me as if gdk_window_process_all_updates has a reentrancy hazard - not in that the 'update_windows' list can change, since this is NULL'd, but in that a gdk_window can be destroyed, leaving a dangling pointer in the copy of the update_windows list. This might happen via gdk_window_process_updates_internal -> _gdk_event_func -> gdk_main_do_event -> practically anywhere ...

So - when the gdk window is destroyed it would call _gdk_window_clear_update_area, which would remove it from update_windows, but not from the copy held in the stack frame of 'gdk_window_process_all_updates'. So - therein lies the rub.

Not a particularly pretty one to fix. About the only way to go is to hold a ref on each gdk window as we take the list in gdk_window_process_all_updates - since while we could walk back over and remove items from that list later, it's really not obvious how to inform the higher stack frames that this window is now dead.

Anyway - that's the bug I'll bet - and it probably explains a good number of these similar 'Window closed, everything died' bugs. ... Owen ?
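The hazard and the proposed fix can be sketched with a toy model. This is an illustration only, not GDK code: the `Window` class, the `destroyed` flag, and the `painted` list are hypothetical stand-ins, and Python's object references play the role of the extra ref that would keep a destroyed window's memory valid while the snapshot is walked.

```python
class Window:
    """Hypothetical stand-in for a GdkWindow; not the real GDK type."""

    def __init__(self, name):
        self.name = name
        self.destroyed = False
        self.on_expose = None  # optional handler run while repainting


update_windows = []  # models the global list of windows awaiting a repaint
painted = []         # records which windows were actually repainted


def destroy(window):
    # Models window destruction: the window is dropped from the *global*
    # list only. Any snapshot taken earlier still refers to it.
    window.destroyed = True
    if window in update_windows:
        update_windows.remove(window)


def process_all_updates():
    global update_windows
    # Take a private snapshot and reset the global list. The snapshot holds
    # a reference to every window, so a window destroyed by a handler stays
    # valid long enough for us to notice that it is dead.
    snapshot, update_windows = update_windows, []
    for window in snapshot:
        if window.destroyed:
            continue  # destroyed by an earlier handler; skip it
        if window.on_expose:
            window.on_expose()
        painted.append(window.name)


a, b = Window("a"), Window("b")
a.on_expose = lambda: destroy(b)  # a's handler tears down b mid-iteration
update_windows.extend([a, b])
process_all_updates()
print(painted)
```

Without the reference and the `destroyed` check, the loop would go on to touch `b` after it had been torn down, which in C means reading freed memory.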
Well, yes, this seems to be a problem, but could you _please_ track down:

- Why is closing windows causing process_all_updates() to be called?
- Who is calling gdk_window_destroy() out of an expose handler. [ CORBA traffic causing strange reentrancy? ]

Both seem to be asking for trouble without respect to this fix.
*** Bug 74526 has been marked as a duplicate of this bug. ***
*** Bug 74457 has been marked as a duplicate of this bug. ***
Looking at this some more, I see no reason at all to think that these crashes have anything to do with the process_all_updates() reentrancy problem.... the reason why you are getting X errors here is that GTK+ is really stingy about doing round trips to the server, so the gdk_flush() in the process_all_updates() is very frequently the first time GTK+ gets a chance to get anything back from the server.
i'll try to get a stack trace with --sync
*** Bug 74654 has been marked as a duplicate of this bug. ***
Note that an X error at this point also occurs when not closing windows... I frequently get it at Nautilus startup. (The first time I start Nautilus... seems to be some sort of bonobo-activation slowness race condition in this case.)
Yeah, I got this on startup today while lots of other things were going on. restarted later with lower load and no problems. Not sure if that is causative or just correlative. Moving this up to Urgent, mainly for michael's benefit, because it is (1) at startup (2) in a major component and (3) lots of people are seeing it.
i am getting more of these (but in gnome2-settings-daemon) when i change the background. that makes sense as these may be from the bg setting code. michael, is nautilus handling the desktop for you?
*** Bug 74668 has been marked as a duplicate of this bug. ***
Problem Michael noticed filed separately as bug 74708, and fixed in CVS. I'm pretty positive that it wasn't the problem here:

- This crash seems unrelated to closing windows
- It's an X error, not a segfault as you would get from Michael's reentrancy problem.
- It takes a *lot* of work to trigger the reentrancy problem.

Assigning back to nautilus.
indeed i think the closing windows was a bogus coincidence and the real bug is with the bg setting code.
Created attachment 7182 Backtrace with --sync
The backtrace above is one crash at this point (if perhaps not the only one); this was occurring because I had two copies of nautilus in my session file; apparently the sequence of events was something like:

- Nautilus 1 starts, sets the background on the root window and the desktop window, and stores it in XSETROOT_PMAP_ID and ESETROOT_PMAP (property names from memory)
- Nautilus 2 starts, calls XKillClient() on ESETROOT_PMAP (as you should when setting this property), and then sets its own pixmap
- Nautilus 1 allocates the desktop window, causing GDK to unset/set the background, which causes the bad pixmap error.

Fixing this "right" isn't going to be easy ... in fact, the only really right fix is probably not to use the root window pixmap as the background pixmap for another window. But since that would be a big efficiency loss, suggestions would be:

- Make sure that the nautilus single-nautilus locking occurs before stuff like fooling around with the background window.
- Don't set the background image as the root until the window is allocated to the right size.... in fact, this is in a sense a symptom of whatever bug is causing all the file manager windows to come up small and then get resized to the right size.
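The three-step sequence above can be replayed against a toy model of the server. This is a sketch under stated assumptions, not Xlib: `FakeServer`, `kill_owner_of`, and `set_root_background` are hypothetical names, and the property name is the one quoted from memory in the comment.

```python
class BadPixmap(Exception):
    """Models the X error for a pixmap id the server no longer knows."""


class FakeServer:
    """Toy model of the server state involved; this is not Xlib."""

    def __init__(self):
        self.pixmaps = {}     # pixmap id -> owning (retained) connection
        self.root_props = {}  # properties on the root window
        self._next_id = 1

    def create_pixmap(self, owner):
        pid = self._next_id
        self._next_id += 1
        self.pixmaps[pid] = owner
        return pid

    def kill_owner_of(self, pid):
        # Like killing the client that owns a resource: everything that
        # connection created is freed.
        owner = self.pixmaps.get(pid)
        self.pixmaps = {p: o for p, o in self.pixmaps.items() if o != owner}

    def set_window_background(self, pid):
        if pid not in self.pixmaps:
            raise BadPixmap(pid)


def set_root_background(server, owner):
    old = server.root_props.get("ESETROOT_PMAP")
    if old is not None:
        server.kill_owner_of(old)  # free the previous root pixmap
    pid = server.create_pixmap(owner)
    server.root_props["ESETROOT_PMAP"] = pid
    return pid


server = FakeServer()
first = set_root_background(server, "nautilus-1")
second = set_root_background(server, "nautilus-2")  # frees `first`

try:
    # Nautilus 1 later re-applies the pixmap id it cached at startup.
    server.set_window_background(first)
    outcome = "ok"
except BadPixmap:
    outcome = "BadPixmap"
print(outcome)
```

Each client behaves correctly in isolation; the failure comes from the first instance holding on to an id that the second instance was entitled to free.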
Re-assigning ownership as well.
*** Bug 74785 has been marked as a duplicate of this bug. ***
*** Bug 74979 has been marked as a duplicate of this bug. ***
*** Bug 75037 has been marked as a duplicate of this bug. ***
testing bz; ignore the spam.
Closing on michael's behalf; for some reason bz is borked for him.
---
Thanks so much for looking at this Owen, the solution is really quite trivial. When multiple nautiluses are started they ask the main shell for new windows. Whether or not to create the desktop is determined on new window creation, by whether a) it's enabled and b) we appear to already have a desktop window. The problem with this is that creating the desktop window is a slow and reentrancy-prone process (via GConf perhaps ?), and thus when we come to determine whether we should create a desktop window for the same view - since it is not yet constructed - we build another desktop, with the above problems.

Just committing the fix here - a trivial reentrancy guard around creating the (unique anyway) desktop window. I imagine the reason people who were not running multiple nautiluses saw this was a session issue, of nautilus trying to create multiple windows on startup with the same effect. Why closing a window should trigger it is unexplained though.
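A reentrancy guard of the kind described can be sketched as follows. This is not the committed Nautilus change; the `Shell` class, its fields, and the `pending` counter that simulates nested new-window requests are all hypothetical.

```python
class Shell:
    """Hypothetical model of the shell that owns the single desktop window."""

    def __init__(self, pending, guarded):
        self.pending = pending   # new-window requests that arrive mid-build
        self.guarded = guarded   # whether the reentrancy guard is enabled
        self.desktop = None
        self.creating_desktop = False
        self.desktops_built = 0
        self.windows = 0

    def _build_desktop(self):
        # Construction is slow and re-enters the main loop, so further
        # new-window requests are serviced before the desktop exists.
        self.desktops_built += 1
        while self.pending:
            self.pending -= 1
            self.new_window()
        return object()

    def ensure_desktop(self):
        if self.desktop is not None:
            return
        if self.guarded and self.creating_desktop:
            return  # a build is already in progress further up the stack
        self.creating_desktop = True
        try:
            self.desktop = self._build_desktop()
        finally:
            self.creating_desktop = False

    def new_window(self):
        self.ensure_desktop()
        self.windows += 1


unguarded = Shell(pending=2, guarded=False)
unguarded.new_window()

guarded = Shell(pending=2, guarded=True)
guarded.new_window()

print(unguarded.desktops_built, guarded.desktops_built)
```

In both runs all three windows are created; only the guarded shell ends up having built a single desktop.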
*** Bug 74734 has been marked as a duplicate of this bug. ***
(i only thought that closing the window triggered it because that's the only thing i did to it - normally it didn't crash on startup)
*** Bug 74938 has been marked as a duplicate of this bug. ***
after my X died, i logged in again and nautilus died with a bad pixmap. whenever i change my background the settings daemon dies with a bad pixmap. so i am guessing this is still the bg code? where should this go then?
*** Bug 76173 has been marked as a duplicate of this bug. ***
I just put a patch into libbackground that fixes the gnome2-settings-daemon crash. This might fix the nautilus problem too. Could you give this a try jacob?
*** Bug 75683 has been marked as a duplicate of this bug. ***
Jacob tells me he hasn't run into this lately, so I'm going to close it.
*** Bug 75966 has been marked as a duplicate of this bug. ***
*** Bug 78283 has been marked as a duplicate of this bug. ***
Just found one of these from Nautilus 1.1.13. Reopening, just to be safe.
*** Bug 80790 has been marked as a duplicate of this bug. ***
*** Bug 81051 has been marked as a duplicate of this bug. ***
*** Bug 81229 has been marked as a duplicate of this bug. ***
Adding dave to the cc: for this bug since he just saw this this afternoon.
*** Bug 82508 has been marked as a duplicate of this bug. ***
*** Bug 82437 has been marked as a duplicate of this bug. ***
*** Bug 82705 has been marked as a duplicate of this bug. ***
*** Bug 82788 has been marked as a duplicate of this bug. ***
Eck. Don't know how this one escaped being marked 2.0.0. We absolutely can't ship without this or we'll get thousands of reports.
Every time I log into GNOME2 now, Nautilus crashes (see bug 82508, marked as a duplicate, for the stack trace). So if there is any way my system can be of assistance in tracking this down, let me know.
*** Bug 83031 has been marked as a duplicate of this bug. ***
It doesn't look like a nautilus race condition, spawning several tens of these in quick succession results in only 1 instance, and umpteen windows, and no collision on the desktop / background of any sort.
Christian, the trace is useless unless we can run Nautilus with --sync, can you get gnome-session to pass --sync to nautilus in some way ?
Well, tell me how to get gnome-session to pass --sync and I will configure my system that way. Nautilus crashes this way for me at every login now, so if you provide instructions I should be able to give you a usable trace.
I think the problem is between gnome-settings-daemon (gsd) and Nautilus, i.e. gsd decides to set the background and destroys Nautilus' background pixmap. There is code in gsd and Nautilus to try to avoid this, but it isn't correct in either of them.

I added some g_print's to Nautilus and got this:

  In make_root_pixmap(): 1024 x 768
  Setting background pixmap of GdkWindow: 0x827dbe8
  Setting NAUTILUS_DESKTOP_WINDOW_ID
  In make_root_pixmap(): 1024 x 768
  Setting background pixmap of GdkWindow: 0x827dbe8

Note that Nautilus is supposed to set NAUTILUS_DESKTOP_WINDOW_ID *before* it sets the background, so gsd knows not to touch it. It isn't doing that here, and there is a long gap after the first 'Setting background pixmap' - several seconds on an 850MHz box. Plenty of time for gsd to set its own background, thinking that Nautilus isn't running.

Dead easy to reproduce for me:

  gnome-settings-daemon&
  nautilus
  ...
  BadPixmap

Also, the gsd code isn't too good either. It checks for NAUTILUS_DESKTOP_WINDOW_ID, then it nice's itself and creates the pixmap, then it sets it. So if Nautilus starts up while it is creating the pixmap, it will destroy Nautilus' pixmap.

Other minor notes:

 o why does bg_applier_apply_prefs call nice (20)? That means any process that uses that function gets niced. I don't think it should do that.
 o Nautilus recreates the background pixmap way too often, just by hitting 'Reload' or adding a file to the desktop directory.
 o If Nautilus is running gsd won't set the background. But what if Nautilus has 'Draw Desktop' switched off? The background won't get set? Is that intended?
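The check-then-act gap on the gsd side can be sketched as follows. This is a toy model, not the libbackground code: `Root`, `nautilus_start`, and `gsd_apply_background` are hypothetical, and the `render` callback stands in for whatever else runs while gsd is building its pixmap.

```python
class Root:
    """Hypothetical model of the root window state both programs share."""

    def __init__(self):
        self.props = set()
        self.background_owner = None


def nautilus_start(root):
    # Intended order: advertise the desktop window first, then set the
    # background, so other clients know to leave it alone.
    root.props.add("NAUTILUS_DESKTOP_WINDOW_ID")
    root.background_owner = "nautilus"


def gsd_apply_background(root, render, recheck):
    if "NAUTILUS_DESKTOP_WINDOW_ID" in root.props:
        return False  # Nautilus owns the desktop; do nothing
    render()  # slow pixmap creation; other clients may start meanwhile
    if recheck and "NAUTILUS_DESKTOP_WINDOW_ID" in root.props:
        return False  # Nautilus appeared while we were rendering
    root.background_owner = "gsd"
    return True


# Nautilus starts while gsd is still rendering, in both runs.
racy = Root()
racy_applied = gsd_apply_background(
    racy, lambda: nautilus_start(racy), recheck=False)

fixed = Root()
fixed_applied = gsd_apply_background(
    fixed, lambda: nautilus_start(fixed), recheck=True)

print(racy.background_owner, fixed.background_owner)
```

Re-checking immediately before the set narrows the window rather than closing it, so it only helps if Nautilus also advertises its desktop window as early as possible.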
*** Bug 83463 has been marked as a duplicate of this bug. ***
*** Bug 83471 has been marked as a duplicate of this bug. ***
*** Bug 83547 has been marked as a duplicate of this bug. ***
Created attachment 8892 libbackground patch for gnome-settings-daemon to check for nautilus just before setting the pixmap
Created attachment 8896 Nautilus patch to realize the desktop window ASAP, so the settings daemon knows it is running
the libbackground one looks ok - since nautilus doesn't nice itself (and that's supposed to be the default behaviour) then i don't see why g-s-d should...
Applied both patches. I'm a bit hesitant to close this, but I'm sure someone will remember to reopen it if we still see it :)
Yay
*** Bug 82078 has been marked as a duplicate of this bug. ***
*** Bug 103697 has been marked as a duplicate of this bug. ***
Dave Camp said, "I'm a bit hesitant to close this, but I'm sure someone will remember to reopen it if we still see it :)" We do, so I am.
Since this still isn't marked fixed, targeting 2.2.2
Well. We have had no new dupes of this in over three months. Time to finally close it?
We agreed to do just that. Reopen if someone sees it again.