GNOME Bugzilla – Bug 346588
segfault displaying full group list for newsgroups.comcast.net
Last modified: 2006-08-27 04:27:46 UTC
This is pan 0.102 although I get the same behavior in earlier versions, specifically 0.99 and 0.101. The system is SUSE Linux 10.1. The crash appears with pan compiled on that system, and also on a binary 0.99 linked from pan's web page.

To reproduce:
1. Start with fresh build of pan and empty ~/.pan2
2. Start pan, and set up news server newsgroups.comcast.net (with comcast id and password).
3. Wait for pan to finish fetching the group list. At this point "Other Groups" acquires a twisty.
4. Click on the twisty. The first screenful of the list appears and less than a second later, pan has crashed with a segfault.

The gdb backtrace goes on for ever with the same thing repeated thousands of times (looks like infinite recursion). I'm only showing the beginning below.

I will try to attach the list of groups as suggested by someone in the pan-users mailing list. The list of groups was obtained by running pan 0.14.2 which doesn't have this bug (the group names seem to be null-delimited). I can't get the list from the current pan since it crashes before writing the list to disk.

david@linux:~> gdb /usr/local/bin/pan
GNU gdb 6.4
Copyright 2005 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-suse-linux"...Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) run
Starting program: /usr/local/bin/pan
[Thread debugging using libthread_db enabled]
[New Thread -1219594576 (LWP 22997)]
GTK Accessibility Module initialized

Program received signal SIGSEGV, Segmentation fault.
[Trace 69192, thread LWP 22997: backtrace frames not preserved]
The above repeats "forever" (i.e. I gave up after getting up to #8000).
Created attachment 68366
Full list of newsgroups from newsgroups.comcast.net

The file is default_unsub.dat written by pan 0.14.2 looking at the same news server. It appears to have the group names separated by nulls. I'm guessing there is something "strange" somewhere in that list which is triggering the crash in e.g. pan 0.102.

Oops. bugzilla says the file's too big. Trying to gzip it...
Is there any way you could set your comcast password to a temporary so that I could use it for a day or two to try to duplicate the crash?
(In reply to comment #2)
> Is there any way you could set your comcast password to a temporary
> so that I could use it for a day or two to try to duplicate the crash?

Wouldn't work. The news server is only accessible from a comcast IP address. I'm in Eastern Mass. if you want to make a house call. The server is really operated by giganews so if you can get a hold of a giganews account it might have the same group list. I could also massage the group list to be newline-separated if that would help. I'd be glad to test any patch of course.
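The newline conversion offered above could be done in a few lines of Python. This is only a minimal sketch: it assumes default_unsub.dat is nothing more than group names separated by NUL bytes (the reporter's own description, not a confirmed file format), and the helper names and the output file name are hypothetical. It also flags names containing bytes outside printable ASCII, on the guess that something "strange" in the list might be involved.

```python
def split_null_delimited(data):
    """Split NUL-separated bytes into a list of non-empty group names (as bytes)."""
    return [chunk for chunk in data.split(b"\x00") if chunk]


def suspicious_names(names):
    """Return names containing any byte outside printable ASCII (0x21-0x7e)."""
    return [n for n in names if any(b < 0x21 or b > 0x7E for b in n)]


def convert_file(src_path, dst_path):
    """Rewrite a NUL-separated file as one name per line; return the name count."""
    with open(src_path, "rb") as src:
        names = split_null_delimited(src.read())
    with open(dst_path, "wb") as dst:
        for name in names:
            dst.write(name + b"\n")
    return len(names)
```

For example, `convert_file("default_unsub.dat", "groups.txt")` would write the newline-separated list to a (hypothetical) groups.txt. Working on raw bytes avoids guessing the encoding of the group names.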
Really strange -- I use giganews on a regular basis. Also, the gail_tree_view_new calls in the backtrace are red herrings for some other corruption, since pan doesn't use gail at all. Let me think about this. Maybe I should work up a patch to sniff out more clues. Also, what version of gtk+ do you have installed?
Well, doing an rpm -qa | grep gtk I see

gtk-1.2.10-907
gtk2-2.8.10-36

(also same versions of the -devel packages).
Darn, I was hoping it was the gtk2-2.9.1 tree bug. :) Wild guess: does it work if you remove the gtk-qt-engine package?
gtk-qt-engine was/is not installed.
To try to eliminate some items, have you tried running Pan on another system, maybe via a LiveCD? I asked a friend who has Comcast to try the Windows version and he had no issues, so if we can confirm that the problem happens on multiple PCs, then that eliminates a local problem.
The only other computer I have is an (Intel) iMac so I'm investigating the possibility of building pan there. DarwinPorts and Fink both have pan (0.14.2), so it seems like this should be possible.
I just built pan-0.103 on the iMac (via DarwinPorts etc. using a mod to glib-2.0/glib/gi18n.h as described in pan-users). This does NOT get the segfault reading the comcast group list. And I just checked again and the SUSE 10.1 version still does.
(In reply to comment #10)
> I just built pan-0.103 on the iMac (via DarwinPorts etc. using a mod to
> glib-2.0/glib/gi18n.h as described in pan-users). This does NOT get the
> segfault reading the comcast group list. And I just checked again and the SUSE
> 10.1 version still does.

Any chance you could try compiling Pan from source on SUSE to verify that it isn't a package problem?
That's what I've been doing all along (except that I also got the segfault when I tried a binary of pan-0.99 that I got by following a link on the Pan website).
I hate to keep asking you to try various things but could you snag a livecd (Possibly the UBUNTU one) and install Pan to that and try it? That would help narrow the problem down a little further.
Pardon my ignorance. I thought that a Live CD was an inherently read-only situation. So how could I install anything when running one? Also, only my iMac has a drive that can write a CD. Can I download and burn a live CD using that and then boot my PC from it? Obviously, the download route would allow me to respond more quickly to your request. Does the standard Ubuntu CD that you can download from their site have the Live CD capability?
(In reply to comment #14)
> Pardon my ignorance. I thought that a Live CD was an inherently read-only
> situation. So how could I install anything when running one? Also, only
> my iMac has a drive that can write a CD. Can I download and burn a live CD
> using that and then boot my PC from it? Obviously, the download route would
> allow me to respond more quickly to your request. Does the standard Ubuntu CD
> that you can download from their site have the Live CD capability?

The LiveCD creates a RAM-Drive for your Home directory so you should be able to install it. It will be gone when you reboot but it should be enough to get everything installed for testing. I haven't tried PAN specifically but I have installed other apps to the LiveCD (Specifically Network-Manager) and had no problems. The Ubuntu "Desktop" CD should be the live CD.
what qt theme are you using?
(in reply to comment #15): Ok, I'll look into acquiring the Ubuntu CD by some means.

(in reply to comment #16): None. I use the gnome desktop. I do have qt libraries, so I can run qt apps (e.g. amarok), however.
(in reply to comment #13) Per your request, I have made an Ubuntu 6.06 Desktop CD and booted from it (running that system right now). I fetched and installed a package for 0.103 from: http://darrenalbers.com/Pan/ (which was linked from the Pan web site). This was able to load the Comcast group list without any segfault, same as on the MacOS X system.
I discovered something interesting while running under gdb (using ddd). If I set a breakpoint at group-pane.cc:320 (inside on_button_pressed()), the bp will first be hit when I click the "twisty" for Other Groups. I hit Cont and come back to the same bp. Hit Cont. again, and the list is populated and I can go on, select a group etc. without any crash. Now as a control, I rerun with that bp cleared, and I get the same old segfault. So whatever is going on here, simply stopping execution via that breakpoint and then letting it go causes the bug not to happen. Does that tell you anything?
David: am I reading comment #10 and comment #18 right that this is working as of 0.103? 0.103 added stronger checks to make sure the data it gets from news servers is converted into UTF-8 (which is GTK's native encoding). I don't know if it's related -- because I've never seen it cause a crash before -- but it was very common pre-0.103 to get a lot of console warnings about invalid text when clicking the "other groups" expander.
I still get the segfault with 0.103 on my SUSE 10.1 setup. The only console output is:

david@linux:/work/david> /usr/local/bin/pan
GTK Accessibility Module initialized
Segmentation fault

Comment #10 was that there is no problem on MacOS X Tiger. Comment #18 was that there is no problem with the Ubuntu "live CD" install. So the problem is confined to my SUSE 10.1. It seems somehow timing-related based on Comment #19.
Still getting segfault with pan 0.104, SUSE 10.1, newsgroups.comcast.net.
Is there any kind of SuSE-on-a-disc that I can use to try Pan out on SuSE without having to go through the headache of installing another distro?
There is a LiveDVD available from: http://en.opensuse.org/Released_Version It seems to be VERY slow for me right now but maybe Suse just doesn't like me today! ;-)
Well it looks like you can't install anything to the SUSE LiveDVD so I guess the only option is using VMWARE player. I can give this a try sometime this week if Charles does not have the time.
Darren: that would be great.
I just tried this and it seems to work fine for me in VMWARE. I tried the 0.99 rpm. This was with Newscene and Highwinds. I don't have a giganews account to test with.
I think what I need to do is get a standard giganews account and see if it happens for me with that (and SUSE 10.1). I should reiterate that I do NOT get the segfault unless BOTH of the following are true:

1. I'm using any beta build of pan on SUSE 10.1.
2. I'm using newsgroups.comcast.net

Change either of those factors, and no segfault. I'm away from home now so can't do the giganews experiment until 8/6.
Update: segfault under conditions stated previously is still there in 0.106 and 0.107.
As I promised to do in comment #28, I just signed up for a standard giganews account (the cheapest option with the 2Gb limit). The result is that I DO get the segfault with this server (news.giganews.com) also, not just with newsgroups.comcast.net. My hope, obviously, is that I can find someone who is also running suse 10.1 and who can try to repeat this experiment. Giganews says that if you cancel after 3 days there is no charge. That should be enough time to do the experiment.
Still getting the segfault as before with 0.108.
I don't know what to do about this. It would be nice if someone else could confirm the crash.
Reading all the history again I think we have narrowed the issue down to something related to OpenSuse or OpenSuse's relation to PAN. I have this Friday off so if I get time between mowing the lawn and all the other misc items I will load OpenSuse on VMWARE again and get an account with giganews and see if I get the same thing.
I forgot that I had OpenSuse already set up in VMWARE. I signed up for an account with giganews and was able to successfully download all groups. Just to make sure it is not a specific server issue: news.giganews.com resolves for me as 216.196.97.131. Charles/David, any ideas on something I can try?
Eureka. Today, I set up the "smart" package manager to try to get my system updated (since the SUSE Zen Updater seems to be broken). This upgraded over 100 packages including gtk2 (which I suspected of being somehow involved in this problem). Didn't help.

Then I tried something which in retrospect should have been obvious: I logged in as a recently-created (different) user, launched pan (0.109) and added newsgroups.comcast.net with my usual account. No segfault. I tried deleting everything in /tmp owned by my usual userid and logged in as myself. Still got the segfault.

I figure it must be some corrupt dot file (or subdir) in my home directory, so I renamed my home directory, created a new, empty home directory for myself and moved in a few things from the old one (mail folders, .emacs, etc.) Then I logged in and, of course, did not get the segfault. I knew by this time that it had to be "something nasty in the home directory" based on the earlier experiment.

It will take me a while to restore my preferred desktop configuration, move more things in from the old home dir, etc. but meanwhile... I think... you can finally CLOSE this sucker! I hope I've brought a bit of sunshine into your day.

-- David
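If anyone wants to pin down which dotfile is responsible, bisecting the old home directory's entries would be faster than moving them back one at a time. The following is only a sketch under two assumptions that were not verified in this bug: that a single entry is the culprit, and that the tester supplies a hypothetical `crashes(entries)` check (for instance: copy those entries into a clean home directory, start pan, expand "Other Groups", and report whether it segfaults).

```python
def find_culprit(entries, crashes):
    """Bisect to the single entry for which crashes(subset) is True.

    entries: list of dotfile/dotdir names from the old home directory.
    crashes: hypothetical callable, True if the crash reproduces with
             exactly the given entries present.
    Returns the culprit, or None if the full set does not crash.
    """
    candidates = list(entries)
    if not candidates or not crashes(candidates):
        return None
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Keep whichever half still reproduces the crash.
        if crashes(first_half):
            candidates = first_half
        else:
            candidates = candidates[mid:]
    return candidates[0]
```

With N entries this needs about log2(N) manual test runs. If the crash depends on a combination of files rather than one, the single-culprit assumption fails and the result would be misleading.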
Fixed the problem by going to a brand new, empty, home directory. I think the many dotfiles and "dotdirs" created by gnome, gtk and friends must have a way of going bad with time. I would guess that it happened when I upgraded from SUSE 10.0 to 10.1. I updated bug 346588 and unless this is all a dream, I suppose it can now be closed. It is still strange that I only got the segfault with Comcast and giganews, but that will most likely never be explained. -- David