GNOME Bugzilla – Bug 153331
Crash in gweather when trying to open prefs
Last modified: 2004-12-22 21:47:04 UTC
Distribution: Solaris 8 2/02 s28s_u7wos_08a SPARC
Package: gnome-applets
Severity: normal
Version: GNOME2.8.0 unspecified
Gnome-Distributor: GNOME.Org
Synopsis: Crash in gweather when trying to open prefs
Bugzilla-Product: gnome-applets
Bugzilla-Component: gweather
Bugzilla-Version: unspecified
BugBuddy-GnomeVersion: 2.0 (2.8.0)
Description:
Description of the crash: Gweather crashes when trying to open its preferences.
Steps to reproduce the crash:
1. Right-click on a running gweather.
2. Select "Preferences".
3. Crash.
Expected Results: Should get prefs dialog.
How often does this happen? Every time.
Additional Information:
Debugging Information:
Backtrace was generated from '/home/poshea/usr/libexec/gweather-applet-2'
(no debugging symbols found)...
0xfe21f598 in _waitid ()
+ Trace 50354
------- Bug moved to this database by unknown@bugzilla.gnome.org 2004-09-21 17:01 ------- Unknown platform unknown. Setting to default platform "Other". Unknown milestone "unknown" in product "gnome-applets". Setting to default milestone for this product, '---' Setting to default status "UNCONFIRMED". Setting qa contact to the default for this product. This bug either had no qa contact or an invalid one.
This seems like a fairly good stack trace. We could be passing NULL to strcmp(); however, looking at weather_location_compare(), this would only be possible if one of the untranslated names is NULL. I suspect this is some sort of wacky old Solaris bug, but I think it should be easy enough to fix.
Created attachment 31804
quick fix
Try this.
Thanks for the patch. I applied it, recompiled, reinstalled, killed the panel and let it restart, added a new weather applet, tried prefs...still the crash. The stack trace is the same.
Interesting. I suppose we should start with all the obvious questions. You are using Solaris 8, correct? Did you build with gcc, or the Sun Studio compiler, or something else? I assume you're using the Solaris libc.
If I am not missing something here, I think weather_location_equal() should return FALSE (i.e. 0) if the pointers are NULL, but we seem to be returning 1 when the code/untrans pointers are NULL. Does the patch below work?
Created attachment 31871
Proposed patch.
Frank,
From the strcmp manual:
    The strcmp() and strncmp() functions return an integer less than, equal to, or greater than zero if s1 (or the first n bytes thereof) is found, respectively, to be less than, to match, or be greater than s2.
So we return 1 to say it's not equal (sane enough). The crash is in strcmp(), which will still be executed, because the condition you impose is the same as the one we've already tried. I suspect we're seeing some strange Solaris-specific bug, either in their implementation of strcmp() or somewhere in the compiler. Otherwise more people would have noticed this bug.
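For anyone following along, here is a minimal sketch of the kind of NULL guard being discussed. It is not the gweather code or either attached patch: str_equal_safe() and its self-check are hypothetical names made up for illustration. It only assumes what the manual says, that strcmp() returns zero on a match, plus the fact that strcmp() must never be handed NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from gweather): returns 1 only when both
 * strings are non-NULL and identical, 0 in every other case. */
int str_equal_safe(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return 0;               /* never pass NULL to strcmp() */
    return strcmp(a, b) == 0;   /* strcmp() returns 0 on a match */
}

/* Small self-check covering the cases raised in this thread. */
void str_equal_safe_selfcheck(void)
{
    assert(str_equal_safe("12345", "12345"));
    assert(!str_equal_safe("12345", "123456"));
    assert(!str_equal_safe("12345", NULL));
    assert(!str_equal_safe(NULL, NULL));
}
```

Note that this treats two NULL strings as "not equal"; whether that is the right answer for the applet is a design decision this sketch does not settle.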
I tried Frank's patch, and unfortunately, no change. Yes, I'm on sparc solaris 8, compiling with gcc-3.4.2 and solaris libc. Just in case there's a difference, here's the relevant bits from the Solaris 8 man pages for strcmp:
    int strcmp(const char *s1, const char *s2);
    int strncmp(const char *s1, const char *s2, size_t n);
    strcmp(), strncmp()
    The strcmp() function compares two strings byte-by-byte, according to the ordering of your machine's character set. The function returns an integer greater than, equal to, or less than 0, if the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2 respectively. The sign of a non-zero return value is determined by the sign of the difference between the values of the first pair of bytes that differ in the strings being compared. The strncmp() function makes the same comparison but looks at a maximum of n bytes. Bytes following a null byte are not compared.
Good, at least that's consistent with the results I had predicted (I dislike non-deterministic errors). I'm really unsure what the problem could be. I'm going to write up a little program just to test some things. Unfortunately I only have access to Solaris 9. Stay tuned...
Created attachment 31882
Quick and Dirty Test Suite
Assuming we're not barking up the wrong tree with where the crash really is. Here is a quick and dirty test suite to run. It should produce results similar to:
    [madeld01@sun23 solaris_tests]$ uname -a
    SunOS sun23.ee.uwa.edu.au 5.9 Generic_112233-11 sun4u sparc SUNW,Ultra-5_10
    [madeld01@sun23 solaris_tests]$ ./test
    String 1: 12345
    String 2: Undefined
    String 3: 12345
    String 4: 123456
    String1 == String2: Uncomparable, NULL string
    String1 == String3: True
    String1 == String4: False
    String2 == String4: Uncomparable, NULL string
I also confirmed that passing NULL to strcmp will make it core dump. Assuming this works like I think it should, we're probably thinking about this all wrong.
Ok, tried out the test suite on Solaris 8. It needed an additional #include <strings.h> to compile. The output was identical to yours above.
(My system is a perpetually broken box but the symptoms here looked familiar) Don't know if it's related but I was running into a similar problem on a rickety old RH7.3 box. Turned out to be a problem with my perl modules. Updating perl-HTML-Parser from the fedora development SRPM got gnome-weather preferences working again.
Interesting. Since I didn't think we used any perl, I'm not sure how this will help, perhaps it's something to do with merging the translations at build time. It's worth looking into.
This was really obvious in Fedora Core 3 test2 in en_US, for some bizarre reason. See:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=134572
The problem is that we pass the current WeatherLocation to the XML parsing thing, and later set the current selection in the treeview, which causes us to free that structure. Then we try to use the original structure, and boom. Attaching a patch and a stack trace which shows the problem more clearly.
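To make the failure mode concrete, here is a rough sketch of the pattern described above and of the "work on a copy" way around it. This is not the gweather source or the attached patch: Location, Applet, applet_select(), sketch_load_locations() and open_prefs() are all hypothetical stand-ins, and the real WeatherLocation has more fields than a name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for WeatherLocation. */
typedef struct {
    char *name;
} Location;

Location *location_new(const char *name)
{
    Location *loc = malloc(sizeof *loc);
    if (loc == NULL)
        return NULL;
    loc->name = malloc(strlen(name) + 1);
    if (loc->name == NULL) {
        free(loc);
        return NULL;
    }
    strcpy(loc->name, name);
    return loc;
}

Location *location_clone(const Location *src)
{
    return src != NULL ? location_new(src->name) : NULL;
}

void location_free(Location *loc)
{
    if (loc != NULL) {
        free(loc->name);
        free(loc);
    }
}

/* Hypothetical applet state that owns the current location. */
typedef struct {
    Location *current;
} Applet;

/* Stand-in for the selection handler: it recomputes the current
 * location, freeing the old one. */
void applet_select(Applet *app, const char *name)
{
    Location *fresh = location_new(name);
    location_free(app->current);
    app->current = fresh;
}

/* Stand-in for loading the list: selects the wanted row, then keeps
 * using `wanted`. If `wanted` were app->current itself, it would be
 * freed by applet_select() and the strcmp() below would read freed
 * memory. Returns 1 when the selection matches. */
int sketch_load_locations(Applet *app, const Location *wanted)
{
    applet_select(app, wanted->name);
    return app->current != NULL &&
           strcmp(wanted->name, app->current->name) == 0;
}

/* The safe pattern: hand the loader a private copy, free it after. */
int open_prefs(Applet *app)
{
    Location *copy = location_clone(app->current);
    int ok;

    if (copy == NULL)
        return 0;
    ok = sketch_load_locations(app, copy);
    location_free(copy);
    return ok;
}
```

The copy costs one extra allocation per dialog open, which seems a small price for not having to reason about who frees the current location while the list is being built.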
Created attachment 32353
patch
Created attachment 32354
stack trace showing the problem more clearly
Davyd: what do you think?
Very tired... answers may not make sense. Mark, you're definitely sure this is the same crash? Of course it will be, how many crashes are we going to have in the preferences... it's just the traces seem different (although he doesn't have full symbols, so anything could happen I guess). I feel like there is something I'm missing, but if the patch works, commit it and close this bug. Peter, if this bug still persists, even with what Mark's done, please reopen.
I tried Mark's patch, and it works. Thanks! I can look at my own weather, and not Pittsburgh's, again.
Comment on attachment 32353
patch
Mark, you can commit this.
Davyd, the stack trace is supposed to illustrate where we re-read the location configuration, causing us to free the WeatherLocation we're passing around the place.
Committed to HEAD and gnome-2-8
2004-10-12  Mark McLoughlin  <mark@skynet.ie>
	Fixes crash when opening preferences dialog - bug #153331
	* gweather-pref.c: (load_locations): pass in a copy of the current location since we re-compute the current location when we select it in the treeview.