GNOME Bugzilla – Bug 326362
panel crash with a11y enabled
Last modified: 2007-01-14 23:59:19 UTC
0x0081f402 in __kernel_vsyscall ()
+ Trace 65075
Thread 1 (Thread -1208236368 (LWP 20597))
Attaching the full trace since it's huge.
Created attachment 57074 [details] backtrace
Looks like a gail bug.
Kjartan; we need (at least!)
 * symbol table info for signal handler, which is where the problem probably arises
 * steps to reproduce
Thanks.
Bill, the crash is happening in gtktreemodelfilter.c which in turn causes the signal handler to be called with signum=11, the signal handler is not crashing. The crash happened when using the run dialog IIRC.
Can't reproduce this any more. Closing.
Got it again without trying so hard :-/
*** Bug 331920 has been marked as a duplicate of this bug. ***
Should be fixed in cvs, and in gail 1.8.10.
*** Bug 331463 has been marked as a duplicate of this bug. ***
Upon inspiration from the gail module maintainer, we did some debugging today. We're positive this fix introduces a very bad regression so as to make trees inaccessible. Here's the rundown.

The fix in question makes these changes to gailtreeview.c:count_rows:

    *** gailtreeview.c.orig Wed Mar 8 13:48:01 2006
    --- gailtreeview.c Wed Mar 8 13:48:13 2006
    ***************
    *** 4315,4321 ****
      {
        GtkTreeIter child_iter;

    !   if (!model || !iter)
          return;

        level++;
    --- 4315,4321 ----
      {
        GtkTreeIter child_iter;

    !   if (!model)
          return;

        level++;

The signature of count_rows looks like this (note that iter is the 2nd parameter):

    count_rows (GtkTreeModel *model,
                GtkTreeIter  *iter,
                GtkTreePath  *end_path,
                gint         *count,
                gint          level,
                gint          depth)

Let's now take a look at get_row_count from this same module. get_row_count calls count_rows:

    static gint
    get_row_count (GtkTreeModel *model)
    {
      gint n_rows = 1;

      count_rows (model, NULL, NULL, &n_rows, 0, G_MAXINT);

      return n_rows;
    }

Notice that it is passing NULL in for iter. This implies that either it is legal to have iter==NULL or this code path is broken, too. Let's assume that it is actually OK to have iter==NULL and dig into what gtk_tree_model_iter_n_children (what count_rows calls first) will do in the presence of iter==NULL.

The first stop on our journey is gtk+/gtk/gtktreemodel.c:gtk_tree_model_iter_n_children:

    gint
    gtk_tree_model_iter_n_children (GtkTreeModel *tree_model,
                                    GtkTreeIter  *iter)
    {
      GtkTreeModelIface *iface;

      g_return_val_if_fail (GTK_IS_TREE_MODEL (tree_model), 0);

      iface = GTK_TREE_MODEL_GET_IFACE (tree_model);
      g_return_val_if_fail (iface->iter_n_children != NULL, 0);

      return (* iface->iter_n_children) (tree_model, iter);
    }

Say "XYZZY" three times, and iface->iter_n_children can end up in gtk+/gtk/gtktreestore.c:gtk_tree_store_iter_n_children. Look at the lines I've prefixed with "-->". It looks as though iter=NULL may indeed be legal, and includes the "node = G_NODE (GTK_TREE_STORE (tree_model)->root)->children;" logic to handle this case:

    static gint
    gtk_tree_store_iter_n_children (GtkTreeModel *tree_model,
                                    GtkTreeIter  *iter)
    {
      GNode *node;
      gint i = 0;

    -->  g_return_val_if_fail (iter == NULL || iter->user_data != NULL, 0);
    -->
    -->  if (iter == NULL)
    -->    node = G_NODE (GTK_TREE_STORE (tree_model)->root)->children;
      else
        node = G_NODE (iter->user_data)->children;

      while (node)
        {
          i++;
          node = node->next;
        }

      return i;
    }

Thus, when one goes all the way back to the gailtreeview.c:count_rows, it looks as though preventing this logic from happening is a very very bad thing. In fact, with this change pulled out, the unusable behavior we were seeing in Orca and Gnopernicus with respect to tree tables is gone.
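To make the effect of the extra "!iter" test concrete, here is a minimal sketch in plain C. It does not use GTK+ or gail at all: ToyNode, ToyModel and every toy_* function are made-up stand-ins, and toy_count_rows is a deliberately simplified counter (no end_path, level or depth), so treat it as an illustration of the argument above rather than the real code. It assumes, as the gtktreestore.c excerpt suggests, that iter == NULL means "start from the invisible root".

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the GTK+ or gail types. */
typedef struct ToyNode {
    struct ToyNode *children;  /* first child, or NULL */
    struct ToyNode *next;      /* next sibling, or NULL */
} ToyNode;

typedef struct {
    ToyNode root;  /* invisible root; its children are the top-level rows */
} ToyModel;

/* Same convention as the tree store excerpt: iter == NULL means the root. */
static int toy_iter_n_children(const ToyModel *model, const ToyNode *iter)
{
    const ToyNode *node;
    int n = 0;

    assert(model != NULL);
    node = (iter == NULL) ? model->root.children : iter->children;
    for (; node != NULL; node = node->next)
        n++;
    return n;
}

/* Simplified recursive counter. reject_null_iter != 0 imitates the
 * "if (!model || !iter) return;" guard; 0 imitates "if (!model) return;". */
static void toy_count_rows(const ToyModel *model, const ToyNode *iter,
                           int reject_null_iter, int *count)
{
    const ToyNode *child;

    if (model == NULL || (reject_null_iter && iter == NULL))
        return;

    *count += toy_iter_n_children(model, iter);
    child = (iter == NULL) ? model->root.children : iter->children;
    for (; child != NULL; child = child->next)
        toy_count_rows(model, child, reject_null_iter, count);
}

/* Starts at 1 and passes NULL for iter, like get_row_count quoted above. */
int toy_get_row_count(const ToyModel *model, int reject_null_iter)
{
    int n_rows = 1;

    toy_count_rows(model, NULL, reject_null_iter, &n_rows);
    return n_rows;
}

/* Demo tree: two top-level rows, the first of which has two children. */
const ToyModel *toy_demo_model(void)
{
    static ToyNode a, b, a1, a2;
    static ToyModel model;

    a1.children = NULL; a1.next = &a2;
    a2.children = NULL; a2.next = NULL;
    a.children = &a1;   a.next = &b;
    b.children = NULL;  b.next = NULL;
    model.root.children = &a;
    model.root.next = NULL;
    return &model;
}
```

With the guard that rejects a NULL iter, toy_get_row_count never gets past its initial value, because the very first call is the one that passes NULL. Without the guard, the four rows of the demo tree are added to the initial value.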
OK, so you've gotten rid of your crash, but regressed the one in this bug :-(
Since you've gone this far, could we have a more appropriate patch than the one in question?
We were only able to analyze why the fix in question was not an appropriate fix in that it caused a more general failure as opposed to an isolated crash. I'm going to assume from your comments, however, that the maintainer of this module will still be unable to investigate this problem further?
As a user of Gnome 2.14, I have serious concerns that any delays in the investigation of this issue could have a long term impact on the overall accessibility of the platform. I understand that Gnome 2.14 is still prerelease software and as such, a certain level of instability is to be expected; however, the problem in this instance is that the issue currently being discussed is so severe as to make any further testing related to other issues involving accessibility virtually impossible. I use Orca as well as Gnopernicus, and find that I am unable to perform even the most basic of functions: In Gedit, I can't bring up the file open dialog and browse to a specific file. Under the login screen accessibility tab, I am unable to effectively select sounds to assign to system events. In Fedora Core 5, I am unable to utilize the interface in the Pup package manager to select and deselect updates. The buddy list in Gaim is inaccessible. And the list goes on. At this point, I have completely given up on the thought of doing further testing, let alone any attempts to use Gnome 2.14 as an effective productivity tool, until this issue has been resolved.
Hi Al; Your comment will be useful in convincing the Gnome release team to allow us to revert the patch (since we are in code freeze), if that seems to be the lesser of two evils here. However since the original problem was a crasher, it may not be acceptable. I'll take a look at the issue tomorrow, given the seriousness of the issue.
Will and Rich: Reading gtktreemodelfilter.c, one sees that it's not legal to call gtk_tree_model_filter_iter_has_child with a NULL iter. This is why your suggestion that count_rows allow a NULL iter is problematic, as count_rows calls gtk_tree_model_filter_iter_has_child.
Looks like a gtk+ bug in gtktreemodelfilter.c, line 2070, right after the FIXME comment, where gtktreemodelfilter.c assumes that if iter->user_data2 can be cast successfully to FILTER_ELT, the resulting elt->children->array will be non-NULL if elt->children is non-null. This is the line of code where the SEGV is occurring, and it's well past the point where legal calls have been handed over to gtk+ from gail.
Thanks for looking into this further. I'm not sure from your last two comments, but I'm assuming that you agree the gail patch for iter==NULL should be removed: from the code path in my earlier post, I believe I've shown that iter==NULL is indeed legal and required for proper behavior for some calls - it just depends upon what GTK_TREE_MODEL_GET_IFACE (tree_model) gives you (e.g., an iface from gtktreemodelfilter, an iface from gtktreestore, etc.). But, thanks to your digging, it looks like gtktreemodelfilter.c also needs to be fixed to solve all these bugs in question.
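The "it depends on which iface you get" point can be sketched with a tiny function-pointer table in plain C. Everything here is hypothetical: ToyIface, ToyModel, ToyIter and the two implementations are invented for illustration and are not modeled on the actual gtktreestore.c or gtktreemodelfilter.c code. The only assumption is that one implementation treats a NULL iter as "the root" while another treats it as a caller error and returns a fallback value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; none of this is the GTK+ API. */
typedef struct { int n_children; } ToyIter;
typedef struct ToyModel ToyModel;

typedef struct {
    int (*iter_n_children)(const ToyModel *model, const ToyIter *iter);
} ToyIface;

struct ToyModel {
    const ToyIface *iface;
    int n_top_level;   /* number of rows directly under the root */
};

/* Lenient implementation: a NULL iter means "count the top-level rows". */
static int lenient_n_children(const ToyModel *model, const ToyIter *iter)
{
    return (iter == NULL) ? model->n_top_level : iter->n_children;
}

/* Strict implementation: a NULL iter is a caller error, so return a
 * fallback of 0 instead of dereferencing it. */
static int strict_n_children(const ToyModel *model, const ToyIter *iter)
{
    (void) model;
    if (iter == NULL)
        return 0;
    return iter->n_children;
}

static const ToyIface lenient_iface = { lenient_n_children };
static const ToyIface strict_iface  = { strict_n_children };

/* Generic entry point: the caller cannot tell which implementation runs. */
int toy_model_iter_n_children(const ToyModel *model, const ToyIter *iter)
{
    assert(model != NULL && model->iface != NULL);
    if (model->iface->iter_n_children == NULL)
        return 0;
    return model->iface->iter_n_children(model, iter);
}

ToyModel toy_lenient_model(int n_top_level)
{
    ToyModel m = { &lenient_iface, n_top_level };
    return m;
}

ToyModel toy_strict_model(int n_top_level)
{
    ToyModel m = { &strict_iface, n_top_level };
    return m;
}
```

The same call with a NULL iter yields the top-level count on the lenient model and 0 on the strict one, which is why a blanket "reject NULL" guard in the caller can silently break the models that rely on the NULL convention.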
Yes, ignore comment #16 from me as irrelevant. I think the right thing to do is revert part of the gailtreeview patch (specifically, the part that rejects iter==NULL, as you suggest) and apply the attached gtk+ patch for gtktreemodelfilter.c. This should both reinstate accessibility of trees, and prevent the SEGV that is the subject of this bug.
Created attachment 61040 [details] [review] one-line change to gtktreemodelfilter.c which prevents the SEGV
Created attachment 61041 [details] [review] change to gailtreeview (also one line) which restores treeview accessibility
Kjartan, if you could apply the patches and confirm that the original problem is resolved, we would be very grateful. This would make it feasible to apply both patches to gnome 2.14.0 and in so doing make it useful to our visually impaired users again.
The fix in comment #20 looks harmless and obviously correct. If that fixes the a11y problem, I'll do a gtk release with it.
with the gail+ gtk patches, I don't see a crash from the "Add to panel" dialog (if that is where the original crash occurred). gnopernicus does not read the content of the treeview cells to me either, though (not sure if I should expect that...)
I just verified that the gail patch <http://bugzilla.gnome.org/attachment.cgi?id=61041> fixes the global accessibility problem, including being able to read the "Add to panel" box, the GEdit open box, the GAIM buddy list, selecting a system sound to be played, etc. Note that in the "Add to panel" table, you may need to press Ctrl+<right arrow> to get the appropriate speech output. I'm off to now include/test the gtk patch.
I just verified that the gtk patch <http://bugzilla.gnome.org/attachment.cgi?id=61040> didn't seem to cause any problems. That is, I've not been able to reproduce the SEGV. Furthermore, with both the gail and gtk patches applied, tables work as expected with Gnopernicus and Orca. Mattias, can you please verify that your gail/gailtreeview.c module at line 4318 looks like: if (!model) return; and that you're indeed running/testing with the gail fix in place? That is, after you've built this, you should make sure accessibility support is enabled and that you log out and back in to be extra sure that the new gail module is being used. This can be an easy thing to forget to do.
hmm, gnopernicus does not want to speak to me at the moment. Thus I cannot verify that Ctrl-right works to read the treeview contents.
Does test-speech work? If not, you might try killing all *-synthesis-driver processes and trying again.
I was able to apply the gailtreeview patch with no problem; however, when I attempt to apply the patch to gtktreemodelfilter from a fresh source tree as checked out last evening, I get the following:

[root@systemax gtk]# patch -p1 <treemodelfilter-segv-fix.diff
patching file gtktreemodelfilter.c
Hunk #1 FAILED at 2067.
1 out of 1 hunk FAILED -- saving rejects to file gtktreemodelfilter.c.rej
Al, the patch is for the 2.8 branch of gtk+ (for gnome 2.14).
The GTK+ patch is in 2.8.15, just released
Thanks so much everyone! This definitely helped avert an accessibility disaster for GNOME 2.14.
I finally got a chance to build a smoketesting environment for 2.14.0 now and I'm sad to say I still get this crash with gtk 2.8.15 and gail 1.8.11. Am I missing some fix here? Would be very nice to get this fixed for the release.
Kjartan, is the SEGV still happening here:

0x01016528 in gtk_tree_model_filter_iter_has_child (model=0x8152670, iter=0xbffe2d54) at gtktreemodelfilter.c:2070

or is it somewhere else?
Here's the backtrace I got: [kmaraas@localhost shared-mime-info-0.17]$ cat /tmp/panel-a11y-crash.txt (gdb) bt
+ Trace 66958
Just to make it clear: this is most likely a gail bug, not something in gtktreemodelfilter.c (according to kris)
Let me just add that this happens *every* time I do alt+f2 and type in an app name to start an app (crash happens after running the app)
Hum, it appears I messed up and did not actually commit the gtktreemodelfilter.c workaround in the tree where I did 2.8.15. Too bad
Spinning 2.8.16 now
Matthias: I noted there was one other occurrence of elt->children->array->len without checking if elt->children->array was non-NULL in gtktreemodelfilter.c (in gtk_tree_model_filter_iter_n_children); should it be checked more carefully too?
The patch adds the extra check to both of these. That's the only change between 2.8.15 and 2.8.16.
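For readers without the attachment at hand, the kind of extra check being discussed can be sketched as follows. This is not the committed patch: ToyArray, ToyLevel and ToyElt are hypothetical stand-ins for the filter model's internal structures, and the sketch only assumes what the comments above describe, namely that elt->children can be non-NULL while elt->children->array is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the filter model's internals. */
typedef struct { size_t len; } ToyArray;
typedef struct { ToyArray *array; } ToyLevel;
typedef struct { ToyLevel *children; } ToyElt;

/* The unguarded form would read elt->children->array->len as soon as
 * elt->children is non-NULL, which dereferences NULL when array is NULL.
 * The guarded form checks each pointer before following it. */
int toy_elt_n_children(const ToyElt *elt)
{
    assert(elt != NULL);
    if (elt->children == NULL || elt->children->array == NULL)
        return 0;
    return (int) elt->children->array->len;
}

/* has_child is built on the same guarded lookup, so both call sites
 * share one set of checks. */
int toy_elt_has_child(const ToyElt *elt)
{
    return toy_elt_n_children(elt) > 0;
}
```

Routing both queries through one guarded helper is a design choice of this sketch only; it keeps the two occurrences from drifting apart, which is the situation the question above was pointing at.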
>Just to make it clear: this is most likely a gail bug, not something in
>gtktreemodelfilter.c (according to kris)

Hmm, when I talked to kris he didn't say that, only that it _might_ be gail, but was just as likely to be a problem _elsewhere in gtktreemodelfilter_.
maybe I overstated it; yes, the problem could be somewhere else in gtktreemodelfilter.c
Is this still happening with gail+gtk+ from 2.8.HEAD and gail-HEAD?
No, the patch that Matthias committed to gtk+ fixed it for me AFAICS.
I left the bug open, since the committed patch is just a workaround for a problem in some other place, according to kris.
Yep. Would be good if somebody could test if this bug occurs with GTK+ HEAD.
I think we should close this bug until we have a way of reproducing it again. Looking at the code in HEAD, I wonder if this will cause a critical warning to be thrown, since in this case gtk_tree_model_filter_iter_has_child is being called inside the gtk_tree_view_destroy emission...
BTW, the above call should be technically legal, but only a11y is likely to do it (because when we emit the object:state-changed:defunct event on the GtkTreeView ATK peer, we need to gather a little info on the treeview object).
Apologies for spam... ensuring Sun a11y folks are cc'ed on all current accessibility bugs.
Apologies for spam... marking as AP1 to reflect accessibility impact (assuming still reproducible)
I can still reproduce this with gtk 2.8.17, gnome-panel 2.14.1 and gail 1.8.11 - backtrace attached:

Backtrace was generated from '/usr/bin/gnome-panel'

Using host libthread_db library "/lib/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
[New Thread 46912584629552 (LWP 23833)]
[New Thread 1074006368 (LWP 23836)]
0x00002aaaace1f0ca in waitpid () from /lib/libpthread.so.0
+ Trace 68135
Thread 1 (Thread 46912584629552 (LWP 23833))
*** Bug 336205 has been marked as a duplicate of this bug. ***
*** Bug 346530 has been marked as a duplicate of this bug. ***
Crasher with duplicates, Severity critical.
Did *anyone* test this with GTK+ 2.10 yet? A few months back I asked if somebody could test with GTK+ HEAD, but I didn't get any responses. Since quite a lot of bug fixes for the filter model went into GTK+ 2.10, it would be really interesting to know whether the crash still occurs with 2.10.
I've just tested this and can no longer reproduce the crash in the run dialog with Orca running.
*** Bug 336896 has been marked as a duplicate of this bug. ***
*** Bug 362105 has been marked as a duplicate of this bug. ***