GNOME Bugzilla – Bug 504600
Crash in handle_cached_dir_changed
Last modified: 2021-05-25 12:46:01 UTC
Version: 2.20.1
What were you doing when the application crashed?
composing a mail using thunderbird
Distribution: Solaris Express Community Edition snv_78 X86
Gnome Release: 2.21.2 2007-12-03 (Sun Microsystems, Inc.)
BugBuddy Version: 2.20.1
X Vendor: Sun Microsystems, Inc.
X Vendor Release: 6620
Selinux: No
Accessibility: Enabled
GTK+ Theme: nimbus
Icon Theme: nimbus
Memory status: size: 200871936 vsize: 200871936 resident: 22745088 share: 0 rss: 22745088 rss_rlim: 0
CPU usage: start_time: 0 rtime: 221 utime: 1726605 stime: 488640 cutime:0 cstime: 0 timeout: 0 it_real_value: 0 frequency: 0
Backtrace was generated from '/usr/bin/gnome-panel'
(no debugging symbols found)
sol-thread active.
Retry #1:
Retry #2:
Retry #3:
Retry #4:
[New LWP 1 ]
[New Thread 1 (LWP 1)]
+ Trace 182401
Thread 1 (LWP 1)
Thread 1 (LWP 1 ):
#-1 0xceaf4075 in _waitid () from /lib/libc.so.1
No symbol table info available.
#-1 0xceaf4075 in _waitid () from /lib/libc.so.1
----------- .xsession-errors (17204627 sec old) ---------------------
/etc/X11/gdm/PreSession/Default: Registering your session with wtmp and utmp
/etc/X11/gdm/PreSession/Default: running: /usr/openwin/bin/sessreg -a -w /var/log/wtmp -u /var/run/utmp -x "/var/lib/gdm/129.158.148.26:12.Xservers" -h "129.158.148.26" -l "129.158.148.26:12" "nf14561
/etc/X11/gdm/Xsession: Beginning session setup...
/etc/X11/gdm/Xsession: Setup done, will execute: /usr/bin/ctrun -l child -i none /usr/dt/config/Xsession.jds
--------------------------------------------------
Thanks for taking the time to report this bug. Unfortunately, that stack trace is missing some elements that will help a lot to solve the problem, so it will be hard for the developers to fix that crash. Can you get us a stack trace with debugging symbols? Please see http://live.gnome.org/GettingTraces for more information on how to do so and reopen this bug or report a new one. Thanks in advance!
*** Bug 513706 has been marked as a duplicate of this bug. ***
*** Bug 513773 has been marked as a duplicate of this bug. ***
*** Bug 513819 has been marked as a duplicate of this bug. ***
*** Bug 514482 has been marked as a duplicate of this bug. ***
*** Bug 514638 has been marked as a duplicate of this bug. ***
*** Bug 514650 has been marked as a duplicate of this bug. ***
*** Bug 514663 has been marked as a duplicate of this bug. ***
*** Bug 514737 has been marked as a duplicate of this bug. ***
*** Bug 514952 has been marked as a duplicate of this bug. ***
*** Bug 514941 has been marked as a duplicate of this bug. ***
*** Bug 514902 has been marked as a duplicate of this bug. ***
*** Bug 515001 has been marked as a duplicate of this bug. ***
*** Bug 515114 has been marked as a duplicate of this bug. ***
*** Bug 515316 has been marked as a duplicate of this bug. ***
*** Bug 515842 has been marked as a duplicate of this bug. ***
*** Bug 516457 has been marked as a duplicate of this bug. ***
*** Bug 516554 has been marked as a duplicate of this bug. ***
*** Bug 516765 has been marked as a duplicate of this bug. ***
*** Bug 516836 has been marked as a duplicate of this bug. ***
Marking as NEEDINFO due to all the dupes.
*** Bug 516881 has been marked as a duplicate of this bug. ***
*** Bug 516940 has been marked as a duplicate of this bug. ***
*** Bug 517011 has been marked as a duplicate of this bug. ***
*** Bug 517113 has been marked as a duplicate of this bug. ***
*** Bug 517126 has been marked as a duplicate of this bug. ***
*** Bug 517245 has been marked as a duplicate of this bug. ***
*** Bug 517557 has been marked as a duplicate of this bug. ***
Matt, Brian: this crash might be opensolaris-specific. Could you give it an eye?
*** Bug 517692 has been marked as a duplicate of this bug. ***
*** Bug 517638 has been marked as a duplicate of this bug. ***
*** Bug 518240 has been marked as a duplicate of this bug. ***
*** Bug 518339 has been marked as a duplicate of this bug. ***
*** Bug 518536 has been marked as a duplicate of this bug. ***
*** Bug 518532 has been marked as a duplicate of this bug. ***
*** Bug 518632 has been marked as a duplicate of this bug. ***
*** Bug 518717 has been marked as a duplicate of this bug. ***
*** Bug 518779 has been marked as a duplicate of this bug. ***
*** Bug 518951 has been marked as a duplicate of this bug. ***
*** Bug 519024 has been marked as a duplicate of this bug. ***
*** Bug 519177 has been marked as a duplicate of this bug. ***
*** Bug 519247 has been marked as a duplicate of this bug. ***
After looking briefly at all the duplicate bug reports (phew), here's the list of actions people were doing when the crash occurred:
- Installing Compiz (Three People)
- Installing Xfce 4.4.1
- Installed Opera 9.26
- Composing mail in thunderbird (Two People)
- Firefox Minimized with one terminal running
- Firefox
- Running ".exe" after installing Wine
- Installing a package via command line.
- Performing "pkgrm SUNWnmap" for installation of nmap-4.53 from source file
- Installing dev tools e.g. install_devtools.sh
- Right clicking calculator menu option.
- Installing SUNW0Punching (Three People)
- Editing Main Menu
- Installing Developer Tools
- Installing Netbeans
- Nothing (Six People)
- Using Terminal (Two People)
- pkg-get from Blastwave
- Installing citrix client
- Login after screensaver activation
- Generic application installation
- Opening network settings control panel
So from the above list, the majority of people were simply installing applications. Chances are that most of these installs would be installing new .desktop files for menu items. Looking at the stack trace, the crash appears to be occurring in "handle_cached_dir_changed()" from libgnome-menu.so.2. I need to have a quick look at this function to see what it does; maybe there is something Solaris-specific going on here that is causing this, as all of the reports are for OpenSolaris only.
handle_cached_dir_changed() is used to update cached menu directory entries. So when a new .desktop file is installed or removed (I think), a menu monitor event is triggered and this causes handle_cached_dir_changed() to be called. Unfortunately there isn't enough debug information in the stack traces to trace exactly where or what in handle_cached_dir_changed() is causing this problem. If any of the bug submitters can recreate this issue on a regular basis, I'd be happy to provide a debug version of libgnome-panel.so.2 for Solaris, which would help in narrowing this issue down further.
(In reply to comment #44)
> handle_cache_dir_changed() is used to update cached menu directory entries.
> So when a new .desktop file is installed or removed (i think), a menu monitor
> event is triggered and this caused handle_cache_dir_changed() to be called.
>
> Unfortunately there isn't enough debug information in the stack traces to
> be able to trace exactly where or what in handle_cache_dir_changed() is
> causing this problem.
>
> If any of the bug submitters can recreate this issue on a regular basis I'd be
> happy to provide a debug version of libgnome-panel.so.2 for solaris which
> would help in narrowing this issue down further.
>
Hi Matt, I don't have a specific way to trigger it again but I seem to be getting errors at least once a day. I'd be happy to install your debug and let it ride. Steve
Matt, This is fairly trivial to catch in dbx just by attaching the pid. I've reproduced it several times now, but can't seem to determine any kind of trigger behavior. It seems to happen as soon as I stop watching it. :-) I cannot find a ".desktop" file. This likely won't be useful without symbols, but here's what dbx gives me:
(dbx) threads
*> t@1 a l@1 ?() signal SIGSEGV in in <can't get PC>()
(dbx) where
current thread: t@1
=>[1] 0x0(0x82d5bc0, 0x82b3298), at 0x0
  [2] handle_cached_dir_changed(0x82a5688, 0x2, 0x832d1c8, 0x822d258), at 0xf6717212
  [3] emit_events_in_idle(0x0), at 0xf67201a7
  [4] g_idle_dispatch(0x832d640, 0xf67200fc, 0x0), at 0xfec7aedb
  [5] g_main_dispatch(0x80f4078), at 0xfec77c76
  [6] g_main_context_dispatch(0x80f4078), at 0xfec78d85
  [7] g_main_context_iterate(0x80f4078, 0x1, 0x1, 0x80d3208), at 0xfec791a2
  [8] g_main_loop_run(0x8291b38), at 0xfec797a4
  [9] gtk_main(0x8047300, 0xfeffb7cc, 0xfeffb7cc, 0x8047300, 0x8047414, 0x8047338), at 0xfb3ea9de
  [10] main(0x3, 0x8047344, 0x8047354), at 0x80776e2
(dbx) regs
current thread: t@1
current frame: [1]
gs 0x000001c3 0x00000000
fs 0x00000000 0x00000000
es 0x0000004b 0x00000000
ds 0x0000004b 0x00000000
ss 0x0000004b 0x00000000
cs 0x00000043 0x00000000
edi 0x0832f4e8
esi 0x0822d258
ebp 0x0804713c
esp 0x08047108
ebx 0x00000000
edx 0xfed08c00
ecx 0x00000000
eax 0x082b1a18
eip 0x00000000:0x00000000 <bad address 0x0>
trapno 0x0000000e 0x00000000
err 0x00000014 0x00000000
eflags 0x00010206 0x00000000
(dbx) up
0xf6717212: handle_cached_dir_changed+0x0272: addl $0x00000008,%esp
(dbx)
I can also run a debug libgnome-panel.so.2. Note: I have upgraded my system to build 84. Rob
*** Bug 519432 has been marked as a duplicate of this bug. ***
*** Bug 519430 has been marked as a duplicate of this bug. ***
*** Bug 519456 has been marked as a duplicate of this bug. ***
*** Bug 519504 has been marked as a duplicate of this bug. ***
*** Bug 519494 has been marked as a duplicate of this bug. ***
*** Bug 519585 has been marked as a duplicate of this bug. ***
*** Bug 519701 has been marked as a duplicate of this bug. ***
*** Bug 519850 has been marked as a duplicate of this bug. ***
*** Bug 519959 has been marked as a duplicate of this bug. ***
*** Bug 519996 has been marked as a duplicate of this bug. ***
*** Bug 520004 has been marked as a duplicate of this bug. ***
*** Bug 520081 has been marked as a duplicate of this bug. ***
OK, I've managed to recreate what I think is this bug with a recent OpenSolaris build. Bear in mind that this appears to happen only with GNOME 2.20. The file monitoring required for panel menu updates in 2.21/22 on Solaris is not working correctly, and I think this is because the panel was updated to use GIO, which relies on fam/gamin in the OS. I am investigating the status of this on Solaris to see what, if anything, needs to be done inside the panel to help. Probably nothing. In the meantime I will try to determine the cause of this crash for 2.20.
The crash for me is occurring because of a reference to a NULL monitor callback. The directory monitor callbacks are stored in a GList under dir->monitors. When an entry within a monitored directory is changed or deleted, the directory monitors are called: the GList dir->monitors is traversed and each (dir->monitors)->data->callback is called in turn. For some reason one of these monitor callbacks is NULL, and thus the core dump.
A quick fix is to ensure the NULL callback is not called, in function cached_dir_invoke_monitors(), e.g.
+ if (monitor->callback) monitor->callback (monitor->ed, monitor->user_data);
Once I add this line, menu monitoring appears to function correctly for me. I am concerned about this rogue list element: how is it being added, and why? I'm not a libgnome-menu expert so I can't answer this easily. Would any of the panel maintainers care to chime in here?
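The guard proposed above can be sketched in isolation. This is a minimal illustration in plain C, not the gnome-menus source: the `monitor_node` type, the hand-rolled list, and the helper names are assumptions made for the example, and only the `ed` / `callback` / `user_data` field names come from the comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the monitor record described in the comment;
 * the real structures in libgnome-menu may differ. */
typedef void (*changed_func)(void *ed, void *user_data);

typedef struct monitor_node {
    void *ed;
    changed_func callback;
    void *user_data;
    struct monitor_node *next;
} monitor_node;

/* Call every monitor that has a callback; return how many had none. */
int invoke_monitors_guarded(monitor_node *head)
{
    int skipped = 0;
    for (monitor_node *m = head; m != NULL; m = m->next) {
        if (m->callback != NULL)
            m->callback(m->ed, m->user_data);
        else
            skipped++;   /* unguarded, this would be a call through address 0 */
    }
    return skipped;
}

/* Example callback: bump the counter passed as user_data. */
void count_call(void *ed, void *user_data)
{
    (void)ed;
    ++*(int *)user_data;
}

/* Three monitors, the middle one with a NULL callback. */
int demo_guarded_dispatch(void)
{
    int calls = 0;
    monitor_node c = { NULL, count_call, &calls, NULL };
    monitor_node b = { NULL, NULL, NULL, &c };
    monitor_node a = { NULL, count_call, &calls, &b };

    int skipped = invoke_monitors_guarded(&a);
    assert(skipped == 1);
    return calls;
}
```

The guard only hides the symptom: the entry with the NULL callback is still on the list, which is the open question raised in the comment.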
*** Bug 520389 has been marked as a duplicate of this bug. ***
*** Bug 520592 has been marked as a duplicate of this bug. ***
*** Bug 520590 has been marked as a duplicate of this bug. ***
*** Bug 520579 has been marked as a duplicate of this bug. ***
*** Bug 520604 has been marked as a duplicate of this bug. ***
Created attachment 106693 crasher fix
From what I can see, this looks like a bug in entry-directories.c, and would appear to be caused by incorrect traversal of the monitor linked list. In function cached_dir_invoke_monitors() the list of monitors is traversed courtesy of a variable next, which is assigned at the start of each iteration. However, the list that tmp points into can actually change during the processing of the loop, by the callback function itself. Therefore the next item on the list could already have been removed by the time you get to the end of that loop iteration, but next still points to the memory of the deleted list item, and thus we end up with a NULL callback pointer which we then attempt to call. This is likely a bug on Linux as well, but because of the differences between the OSes it's somehow not manifesting itself as a crash. Within the source, all processing of GList monitors is done in this fashion, so this should be changed. The attached patch applies to both the 2.20 branch and trunk and should be applied to both.
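The stale-`next` hazard described above can be reproduced in a small, self-contained program. This is a sketch under stated assumptions rather than the code from entry-directories.c: the `node` type, `list_remove`, and the demo helpers are hypothetical, and a removed node is only flagged (not freed) so that visiting it stays well defined and can be counted.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node node;
typedef void (*changed_func)(node *self);

struct node {
    changed_func callback;
    int removed;   /* set instead of calling free(), to keep the demo defined */
    node *next;
};

node *g_head;      /* list being walked (demo-only global) */
node *g_victim;    /* node that the first callback removes */

/* Unlink victim and clear its callback, modelling a freed monitor. */
void list_remove(node **head, node *victim)
{
    for (node **p = head; *p != NULL; p = &(*p)->next) {
        if (*p == victim) {
            *p = victim->next;
            victim->removed = 1;
            victim->callback = NULL;
            return;
        }
    }
}

void remove_victim(node *self) { (void)self; list_remove(&g_head, g_victim); }
void do_nothing(node *self)    { (void)self; }

/* Walk with `next` saved before the callback runs; count the visits that
 * land on a node which was removed during the walk. */
int walk_with_saved_next(node *head)
{
    int stale = 0;
    node *tmp = head;
    while (tmp != NULL) {
        node *next = tmp->next;   /* captured before the callback */
        if (tmp->removed)
            stale++;              /* real code would call a dead callback here */
        else if (tmp->callback != NULL)
            tmp->callback(tmp);
        tmp = next;
    }
    return stale;
}

/* Build a -> b -> c; a's callback removes the node at victim_index. */
int demo_stale_next(int victim_index)
{
    node nodes[3];
    assert(victim_index >= 0 && victim_index < 3);
    nodes[2] = (node){ do_nothing, 0, NULL };
    nodes[1] = (node){ do_nothing, 0, &nodes[2] };
    nodes[0] = (node){ remove_victim, 0, &nodes[1] };
    g_head = &nodes[0];
    g_victim = &nodes[victim_index];
    return walk_with_saved_next(g_head);
}
```

With this setup, `demo_stale_next(0)` (the callback removes its own node) reports no stale visits, while `demo_stale_next(1)` (the callback removes the node already saved in `next`) reports one. Saving `next` protects against removal of the current node only.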
*** Bug 520733 has been marked as a duplicate of this bug. ***
*** Bug 520918 has been marked as a duplicate of this bug. ***
*** Bug 520992 has been marked as a duplicate of this bug. ***
*** Bug 520991 has been marked as a duplicate of this bug. ***
*** Bug 521117 has been marked as a duplicate of this bug. ***
*** Bug 521116 has been marked as a duplicate of this bug. ***
*** Bug 521113 has been marked as a duplicate of this bug. ***
*** Bug 521216 has been marked as a duplicate of this bug. ***
*** Bug 521217 has been marked as a duplicate of this bug. ***
*** Bug 521220 has been marked as a duplicate of this bug. ***
*** Bug 521256 has been marked as a duplicate of this bug. ***
*** Bug 521486 has been marked as a duplicate of this bug. ***
*** Bug 521514 has been marked as a duplicate of this bug. ***
*** Bug 521590 has been marked as a duplicate of this bug. ***
*** Bug 521538 has been marked as a duplicate of this bug. ***
*** Bug 521555 has been marked as a duplicate of this bug. ***
*** Bug 521635 has been marked as a duplicate of this bug. ***
*** Bug 521633 has been marked as a duplicate of this bug. ***
*** Bug 521745 has been marked as a duplicate of this bug. ***
*** Bug 521873 has been marked as a duplicate of this bug. ***
*** Bug 521866 has been marked as a duplicate of this bug. ***
*** Bug 521864 has been marked as a duplicate of this bug. ***
*** Bug 521855 has been marked as a duplicate of this bug. ***
*** Bug 521839 has been marked as a duplicate of this bug. ***
*** Bug 521907 has been marked as a duplicate of this bug. ***
*** Bug 521943 has been marked as a duplicate of this bug. ***
This bug was automatically submitted, most likely since I registered the OE. Sorry for the clutter if that's the case.
*** Bug 521984 has been marked as a duplicate of this bug. ***
*** Bug 522081 has been marked as a duplicate of this bug. ***
*** Bug 522251 has been marked as a duplicate of this bug. ***
*** Bug 522361 has been marked as a duplicate of this bug. ***
*** Bug 522385 has been marked as a duplicate of this bug. ***
*** Bug 522531 has been marked as a duplicate of this bug. ***
*** Bug 522990 has been marked as a duplicate of this bug. ***
*** Bug 523295 has been marked as a duplicate of this bug. ***
*** Bug 523371 has been marked as a duplicate of this bug. ***
*** Bug 523495 has been marked as a duplicate of this bug. ***
*** Bug 523673 has been marked as a duplicate of this bug. ***
*** Bug 523760 has been marked as a duplicate of this bug. ***
*** Bug 523765 has been marked as a duplicate of this bug. ***
*** Bug 523799 has been marked as a duplicate of this bug. ***
*** Bug 523906 has been marked as a duplicate of this bug. ***
*** Bug 523944 has been marked as a duplicate of this bug. ***
*** Bug 523945 has been marked as a duplicate of this bug. ***
Hi, Since this bug is a duplicate or is in the process of being solved, please close this case. Thanks! Regards, Johnny
*** Bug 524196 has been marked as a duplicate of this bug. ***
*** Bug 524225 has been marked as a duplicate of this bug. ***
*** Bug 524279 has been marked as a duplicate of this bug. ***
*** Bug 524444 has been marked as a duplicate of this bug. ***
*** Bug 524845 has been marked as a duplicate of this bug. ***
*** Bug 524906 has been marked as a duplicate of this bug. ***
*** Bug 524945 has been marked as a duplicate of this bug. ***
*** Bug 524972 has been marked as a duplicate of this bug. ***
*** Bug 525256 has been marked as a duplicate of this bug. ***
*** Bug 525430 has been marked as a duplicate of this bug. ***
*** Bug 525437 has been marked as a duplicate of this bug. ***
*** Bug 525658 has been marked as a duplicate of this bug. ***
*** Bug 525460 has been marked as a duplicate of this bug. ***
*** Bug 526104 has been marked as a duplicate of this bug. ***
*** Bug 526168 has been marked as a duplicate of this bug. ***
*** Bug 526169 has been marked as a duplicate of this bug. ***
*** Bug 526170 has been marked as a duplicate of this bug. ***
*** Bug 526221 has been marked as a duplicate of this bug. ***
*** Bug 526441 has been marked as a duplicate of this bug. ***
*** Bug 526766 has been marked as a duplicate of this bug. ***
*** Bug 526809 has been marked as a duplicate of this bug. ***
*** Bug 526823 has been marked as a duplicate of this bug. ***
*** Bug 526837 has been marked as a duplicate of this bug. ***
*** Bug 526833 has been marked as a duplicate of this bug. ***
*** Bug 527077 has been marked as a duplicate of this bug. ***
*** Bug 527086 has been marked as a duplicate of this bug. ***
*** Bug 527153 has been marked as a duplicate of this bug. ***
*** Bug 527180 has been marked as a duplicate of this bug. ***
*** Bug 527227 has been marked as a duplicate of this bug. ***
*** Bug 527226 has been marked as a duplicate of this bug. ***
*** Bug 527215 has been marked as a duplicate of this bug. ***
Matt: I'm not totally comfortable with this part of the code, but the patch looks weird to me. If the callback modifies the list, then even with your patch, things could go wrong. I need to take a sheet of paper and a pen to write how things work to be sure that the fix is right...
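One common way to make dispatch robust when callbacks modify the list is to defer removal: while a dispatch is running, removal only marks the entry, and marked entries are unlinked and freed once the walk has finished. The sketch below illustrates that general pattern in plain C. It is not the patch attached to this bug and not the rewrite discussed later in the thread; every type and function name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct monitor monitor;
typedef struct monitor_list monitor_list;
typedef void (*changed_func)(monitor_list *list, void *user_data);

struct monitor {
    changed_func callback;
    void *user_data;
    int dead;            /* marked for removal, not yet unlinked */
    monitor *next;
};

struct monitor_list {
    monitor *head;
    int dispatching;     /* nesting depth of monitor_list_invoke() */
};

/* Prepend a monitor; returns NULL if allocation fails. */
monitor *monitor_list_add(monitor_list *list, changed_func cb, void *user_data)
{
    monitor *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->callback = cb;
    m->user_data = user_data;
    m->dead = 0;
    m->next = list->head;
    list->head = m;
    return m;
}

/* Unlink and free every monitor marked dead. */
static void sweep(monitor_list *list)
{
    monitor **p = &list->head;
    while (*p != NULL) {
        if ((*p)->dead) {
            monitor *gone = *p;
            *p = gone->next;
            free(gone);
        } else {
            p = &(*p)->next;
        }
    }
}

/* Safe to call from inside a callback: the node stays linked until the
 * outermost dispatch has finished. */
void monitor_list_remove(monitor_list *list, monitor *m)
{
    m->dead = 1;
    if (list->dispatching == 0)
        sweep(list);
}

/* Call every live monitor; returns the number of callbacks invoked. */
int monitor_list_invoke(monitor_list *list)
{
    int called = 0;
    list->dispatching++;
    for (monitor *m = list->head; m != NULL; m = m->next) {
        if (!m->dead) {
            m->callback(list, m->user_data);
            called++;
        }
    }
    if (--list->dispatching == 0)
        sweep(list);
    return called;
}

static monitor *g_victim;   /* demo-only: monitor removed by a callback */

void remove_victim(monitor_list *list, void *user_data)
{
    (void)user_data;
    monitor_list_remove(list, g_victim);
}

void count_call(monitor_list *list, void *user_data)
{
    (void)list;
    ++*(int *)user_data;
}

/* The first monitor visited removes the second; the third must still run. */
int demo_deferred_removal(void)
{
    monitor_list list = { NULL, 0 };
    int hits = 0;
    monitor *last = monitor_list_add(&list, count_call, &hits);
    g_victim = monitor_list_add(&list, count_call, &hits);
    monitor *first = monitor_list_add(&list, remove_victim, NULL);
    assert(last != NULL && g_victim != NULL && first != NULL);

    int called = monitor_list_invoke(&list);
    assert(hits == 1);                 /* only the surviving counter ran */

    monitor_list_remove(&list, first); /* outside dispatch: freed at once */
    monitor_list_remove(&list, last);
    assert(list.head == NULL);
    return called;
}
```

In this sketch a monitor added during dispatch is prepended and therefore not visited in the current pass. Whether that is the desired behaviour for menu monitors is a design decision the sketch does not settle.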
*** Bug 527236 has been marked as a duplicate of this bug. ***
*** Bug 527246 has been marked as a duplicate of this bug. ***
Vincent, one of the bases for making the changes I did was the simple fact that all other iterations of the list were done in the manner that I've changed this function to use. When I checked the revision history in SVN, this code had not changed for a long, long time, whereas all newly written code appears to use the method of traversal that I am changing it to. But go ahead and write it down on paper, it's always the best route to take :) BTW, this does solve the crashes on OpenSolaris.
*** Bug 528350 has been marked as a duplicate of this bug. ***
*** Bug 529090 has been marked as a duplicate of this bug. ***
*** Bug 529945 has been marked as a duplicate of this bug. ***
*** Bug 529986 has been marked as a duplicate of this bug. ***
*** Bug 530147 has been marked as a duplicate of this bug. ***
*** Bug 530539 has been marked as a duplicate of this bug. ***
*** Bug 530811 has been marked as a duplicate of this bug. ***
How can this be unconfirmed with that many duplicates? confirming report.
*** Bug 530838 has been marked as a duplicate of this bug. ***
How are you doing with the pen and the paper Vincent? :-)
(In reply to comment #156)
> How are you doing with the pen and the paper Vincent? :-)
Matt's fix fixes the crash but is kind of wrong (since it can remove some events, though unlikely to happen in real life). And a good fix means rewriting the way we handle the events. This is more or less planned for 2.23 :-)
Glad to hear the Pen & Paper method has completed. You mention that the fix I proposed will "possibly but unlikely in real life" miss out on some events. I'll not argue, but would it make sense to get this integrated into GNOME 2.22? Especially since, as you point out, the event handling is being rewritten for 2.23, so the potentially event-missing fix won't be there for that long :) BTW, what is the new event mechanism being done for 2.23/24? Any URL pointers to some more info?
*** Bug 533551 has been marked as a duplicate of this bug. ***
*** Bug 534184 has been marked as a duplicate of this bug. ***
*** Bug 535037 has been marked as a duplicate of this bug. ***
*** Bug 537429 has been marked as a duplicate of this bug. ***
*** Bug 537952 has been marked as a duplicate of this bug. ***
There has not been a single report coming from GNOME 2.22 so far. Also couldn't find a ticket in Launchpad. Is this really still a blocker?
The reason for no 2.22 reports is:
- All reports are for OpenSolaris.
- The attached patch fixes the bug on OpenSolaris, and the panel is distributed on OpenSolaris with this patch applied, and thus there are no 2.22 reports coming in :)
That is why I'd love to see this patch applied to community SVN :) so the patch could be dropped from OpenSolaris builds.
The reason I didn't apply the patch is that it's wrong (see comment #157).
(In reply to comment #157)
> Matt's fix fixes the crash but is kind of wrong (since it can remove some
> events, though unlikely to happen in real life). And a good fix means rewriting
> the way we handle the events. This is more or less planned for 2.23 :-)
This sounds like vuntz wants me to set the target milestone to 2.24 instead of 2.22. I'm fine with that as long as somebody really(TM) works on it. ;-)
*** Bug 540482 has been marked as a duplicate of this bug. ***
*** Bug 541744 has been marked as a duplicate of this bug. ***
*** Bug 543011 has been marked as a duplicate of this bug. ***
*** Bug 544929 has been marked as a duplicate of this bug. ***
Vuntz, any news here with regard to rewriting the way we handle the events, so that Matt could update the patch?
Unfortunately, no news. So this patch is still needed in OpenSolaris for now...
*** Bug 546543 has been marked as a duplicate of this bug. ***
*** Bug 546717 has been marked as a duplicate of this bug. ***
*** Bug 547375 has been marked as a duplicate of this bug. ***
*** Bug 549376 has been marked as a duplicate of this bug. ***
This will not be fixed for 2.24 according to vuntz.
*** Bug 550366 has been marked as a duplicate of this bug. ***
*** Bug 550960 has been marked as a duplicate of this bug. ***
*** Bug 551760 has been marked as a duplicate of this bug. ***
*** Bug 551920 has been marked as a duplicate of this bug. ***
*** Bug 552803 has been marked as a duplicate of this bug. ***
*** Bug 553879 has been marked as a duplicate of this bug. ***
*** Bug 558637 has been marked as a duplicate of this bug. ***
*** Bug 560978 has been marked as a duplicate of this bug. ***
*** Bug 561709 has been marked as a duplicate of this bug. ***
3 dups in the last 4 months => not a blocker anymore.
This problem can still be reproduced on opensolaris. But there is a bug in the attached patch. I will provide a new patch soon.
Created attachment 132325 Updated patch
Details about the patch can be found at http://defect.opensolaris.org/bz/show_bug.cgi?id=7677.
ping - patch available.
(In reply to comment #166)
> The reason I didn't apply the patch is that it's wrong (see comment #157).
Still valid :-)
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new enhancement request ticket at https://gitlab.gnome.org/GNOME/gnome-menus/-/issues/ Thank you for your understanding and your help.