GNOME Bugzilla – Bug 124837
ALL_WINDOWS keylisteners can hang desktop under some circumstances
Last modified: 2004-12-22 21:47:04 UTC
The following user actions cause the desktop to appear to hang; the user can only regain control of the desktop by restarting the X server:
* from the 'applications' menu, launch volume control with gnopernicus running;
* select 'about' from the volume control menus;
* show the 'credits' dialog;
* press numpad-0 numpad-3 (layer 3), then numpad-7;
* you will hear the contents of the credits window read twice, followed by what seems to be feedback from a focus/context switch;
* then the system will become unresponsive. Disk activity may ensue for some time, then slow or cease. (at-spi-registryd remains active during this period, as evidenced by running at-spi-registryd in an xterm window.)
The only way to do anything useful from then on is to kill X (or perhaps killing gnopernicus from a remote terminal will work; investigating that now).
The cause of this bug is bug 108664. If, in file kbd_mouse/libke/libke.c, line 39 #define USE_ALL_WINDOWS is replaced with #undef USE_ALL_WINDOWS, then this bug is no longer present. I mark this as NOTABUG.
This is indeed a bug! If a series of user actions hangs the system, that is a bug no matter what the underlying cause.
It does not make sense to have a bug depending on a bug which is closed. I am not suggesting that bug 108664 be opened, but either the issues should be discussed in this bug or a new bug should be opened.
Created attachment 20995 [details] [review] a modified file from gnopernicus to prove that gnopernicus not hangs
After starting gnopernicus, opening the volume control About/Credits dialog and changing to layer 3, pressing key 7 on the numpad makes both applications behave strangely. After the first press (only one press), gnopernicus generates the output below:
  START LAYER KEY "L03K07"
  END LAYER KEY "L03K07"
  START LAYER KEY "L03K07"
  END LAYER KEY "L03K07"
  START LAYER KEY "L03K07"
  END LAYER KEY "L03K07"
So, if the user presses the key once, 3 events arrive at srcore. At this point the focus indicator for the credits dialog is no longer present. After the second press, the focus indicator shows again, but srcore does not receive any layer event, only a window switch event and a focus event. Paul, can you check why 3 events reach srcore for the first press and none for the second?
I tested it on my system and it has the same behavior. The KeyPad7 events are generated 3 times. It seems that the issue arises in at-spi or in the X Window keyboard module (I don't know which) when the ALL_WINDOWS flag is used to notify numpad key events. The output generated by the keyboard echo shows that the key events are generated by at-spi:
  (Layer)Received key event: sym 65456 () mods 10 code 90 time -2096851205 keystring "0" type 1 (press = 1, release = 2)
  (Layer)Received key event: sym 65456 () mods 10 code 90 time -2096851097 keystring "0" type 2 (press = 1, release = 2)
  (Layer)Received key event: sym 65459 () mods 10 code 89 time -2096850646 keystring "3" type 1 (press = 1, release = 2)
  (Layer)Received key event: sym 65459 () mods 10 code 89 time -2096850551 keystring "3" type 2 (press = 1, release = 2)
  (Layer)Received key event: sym 65463 () mods 10 code 79 time -2096839211 keystring "7" type 1 (press = 1, release = 2)
  (Layer)Received key event: sym 65463 () mods 10 code 79 time -2096839109 keystring "7" type 1 (press = 1, release = 2)
  (Layer)Received key event: sym 65463 () mods 10 code 79 time -2096839109 keystring "7" type 2 (press = 1, release = 2)
  (Layer)Received key event: sym 65463 () mods 10 code 79 time -2096838709 keystring "7" type 1 (press = 1, release = 2)
  (Layer)Received key event: sym 65463 () mods 10 code 79 time -2096797910 keystring "7" type 2 (press = 1, release = 2)
This output is generated if you apply the test patch for libke.c that I will attach. The information shown comes from the structure that is passed when the keyboard callback is called. Along with this patch I will attach the debug output generated by at-spi-registryd. Can somebody check what is happening in at-spi? Why are the events generated 3 times in this case? (Padraig, Bill)
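For reading the log above: the syms 65456, 65459 and 65463 are 0xFFB0, 0xFFB3 and 0xFFB7, which X11's keysymdef.h assigns to XK_KP_0, XK_KP_3 and XK_KP_7. The negative `time` values look like an unsigned 32-bit millisecond timestamp printed as a signed number, but that part is an assumption. A minimal C sketch of both conversions follows; the helper names `keypad_digit` and `elapsed_ms` are made up for illustration and are not part of libke or at-spi.

```c
#include <assert.h>
#include <stdint.h>

/* X11 keypad digit keysyms run from XK_KP_0 (0xFFB0) to XK_KP_9 (0xFFB9). */
#define KP_0_SYM 0xFFB0L
#define KP_9_SYM 0xFFB9L

/* Hypothetical helper: return the keypad digit for a keysym, or -1 if the
 * keysym is not a keypad digit. */
int keypad_digit(long sym)
{
    if (sym < KP_0_SYM || sym > KP_9_SYM)
        return -1;
    return (int)(sym - KP_0_SYM);
}

/* Hypothetical helper: milliseconds between two logged timestamps.
 * Assumes the log printed an unsigned 32-bit millisecond counter as a
 * signed value; unsigned subtraction also survives counter wraparound. */
uint32_t elapsed_ms(int32_t earlier, int32_t later)
{
    return (uint32_t)later - (uint32_t)earlier;
}
```

With the first two log lines, `elapsed_ms(-2096851205, -2096851097)` gives 108, i.e. roughly a tenth of a second between the press and the release of numpad-0.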
Created attachment 21002 [details] [review] Patch for generate the output by the libke.c.
Created attachment 21004 [details] Log generated by the at-spi-registryd
The first attachment above seems to be the file main.c, so I don't understand its purpose. gnopernicus may not actually 'hang' but it seems to stop responding to user actions (either at-spi events or number pad events). So from the user's perspective it is hung from that point on. Is there some key sequence which the user can press at this point to regain control of gnopernicus and the desktop?
Yes, the first attachment is the srmain.c file. Its purpose is to prove that a gnopernicus callback is called by at-spi and that the callback returns, so the problem is not in gnopernicus but somewhere else, perhaps in at-spi. So it is not gnopernicus that hangs; the other application hangs. It stops responding to user actions.
There is another interesting thing in this dialog (the credits dialog of GNOME Volume Control): there are three items in the tab sequence: the tab pane, the Close button, and a scroll pane. The scroll pane seems to be a bug.
To reproduce this bug, the user must change to layer 3 (by pressing the 0 and then the 3 key on the numpad). After that, if key 7 is pressed, 3 key events come from at-spi to gnopernicus (the same result as 3 presses by the user). The focus indicator for the tested app (Volume Control/Help/About/Credits) disappears, and key navigation for the tested app is broken. When the user presses key 7 on the numpad again, no key event arrives (as if the user had not pressed the key), but a window activate event and a focus event come from at-spi. The focus indicator is again where it should be and key navigation works again. From this point, the scenario repeats. NOTE: After the first press of key 7, no keys work except the numpad keys (the user can change layer, ask gnopernicus to do different actions -- goto parent, etc.), but there is still no focus indicator until the second press of key 7.
I will look at this. I think I can reproduce it.
I did some investigation and I succeeded in reproducing the bug outside of gnopernicus, too. I am attaching a test program that reproduces the same scenario that triggers the bug in gnopernicus.
How to compile it:
- the program is based on the "simple-at.c" test program from at-spi/test
- in order to compile it, copy it into the at-spi/test directory
How to use the test program:
* launch it
* launch gnome-cd or volume-control (any application that has a UI), so you can see the graphical indication of focus
* press CTRL+ALT+m
  - you will see that the focus disappears and appears again after a very short period of time
  - note that the callback executes a "sleep" (to simulate a long execution)
  - note that there are 2 events coming for a single "press"
* activate "NumLock" and then press "m" (this is the scenario from gnopernicus)
  - you will see that the focus disappears
  - note that the callback executes a "sleep" (on my machine I can reproduce the bug if the number of seconds is greater than 1)
  - note that there are 3 events coming for a single "press"
  - even after the debug messages were displayed, the focus does not reappear (graphically)
* press "m" again (NumLock is still activated) and the focus will be visible again
* press CTRL+ALT+q or NumLock (active) + q and the program will exit
Some clarifications for the initial bug:
- the bug is reproducible with any application read by gnopernicus, but only with layer 3 key 7, because the function that is executed takes a long time (in the test program this is simulated through a sleep (1))
Conclusion:
- the bug appears for a key that was registered with the SPI_KEYLISTENER_ALL_WINDOWS flag and SPI_KEYMASK_MOD2, and for which the callback takes a long time to execute
- in my opinion this is a bug in at-spi or below, but not a gnopernicus bug.
Created attachment 21057 [details] Test program (based on simple-at.c from at-spi/test)
I wonder, are the multiple key events due to registration for both press and release?
Additional information:
- If the key registered with the SPI_KEYLISTENER_ALL_WINDOWS flag and SPI_KEYMASK_MOD2 is not consumed, the focus is regained immediately.
- In the case of the test program (same scenario as in gnopernicus: key registered with the SPI_KEYLISTENER_ALL_WINDOWS flag and SPI_KEYMASK_MOD2, and CONSUMED), the focus is regained only after a second press of 'm' (NumLock being active).
Please see the following code in the test program:
<snip>
case 'M':
case 'm':
  {
    int sec = 1;
    fprintf (stderr,"\nI will sleep %d seconds", sec);
    sleep (sec);
    return TRUE; /* I want to consume this key, like gnopernicus does */
  }
</snip>
I have not yet looked at the test program. I found that after I updated gnopernicus to CVS HEAD I could no longer reproduce this problem. It seems that the fix for bug #124108 fixes the problem that Bill reported. Is it a bug that an ancestor of an AccessibleObject is not an Application? I think that the priority of this bug (the problem shown by the test program) should be reduced.
Padraig, are you sure about the bug number you quoted above? That looks to have been a Java bug, which doesn't impact this issue. Adi: thanks for your investigations, they have been very helpful. Padraig: why do you think we should reduce the priority? Do we have reason to think that the problem is really solved, or has the symptom just gone away?
The fix applied for bug #124108 prevents srcore from looping when, traversing up the tree from an object, it does not find an Accessible for which Accessible_isApplication is TRUE. This was what I found to be happening with mixer_applet. I logged bug #125834 about this. The reason I suggest reducing the priority is that I believe that the problem you reported no longer occurs with gnopernicus from HEAD. What Adi and Remus are demonstrating, I believe, is that there will be a problem if a callback takes an inordinate amount of time.
Padraig: I think you are missing the point here.
First of all: the bugs that you are mentioning have nothing to do with this situation. The function that is executed in gnopernicus for layer 3 key 7 is (src_hierarchy_flat), a function that performs screen review and presents all the lines at once, not on demand. So this is an expensive operation, but one that works (unlike the infinite loop that was happening in #124108). If you map the same command, "window hierachy flat", to CTRL+ALT+U (for instance), the function is executed properly and the focus is regained immediately. Note: in order to do that, please apply the patch http://bugzilla.gnome.org/show_bug.cgi?id=125764
Second: "callback takes an inordinate amount of time". What do you mean by that? What are the usual time bounds that we should respect in a callback? There was no specification that callbacks should complete within a certain amount of time, and gnopernicus is not designed to do that. What would happen if the user wants flat review for the entire desktop (in the future)? As we know, this will be a very expensive operation. So in my opinion the callbacks should allow us to execute time-consuming operations.
Third: There are lots of inconsistencies between the behaviors of keys registered differently (PLEASE see the test program), so at-spi clients (especially gnopernicus) will expose inconsistent behaviors: the same function mapped on layers (default) will behave differently than if it is mapped on "user defined" keys.
Conclusion:
1. This is a serious bug and I think that its priority is the proper one.
2. I think that it was proven that this is an at-spi bug, so it is appropriate for me to reassign it.
It's still not proven that this is an at-spi bug. We need to determine some things:
1) is the gnopernicus callback returning, or is it blocking for some reason? (sounds as though it's returning, but...)
2) is at-spi-registryd blocking on the gnopernicus callback, or at all?
3) if not, why is gnopernicus not receiving further at-spi-registryd events?
If the gnopernicus callback is returning normally, then we need to determine why gnopernicus is failing to respond to further user keystrokes until the 'triggering' key is pressed again. I still think there is a gnopernicus bug in here even if there is also a serious at-spi bug at work.
Adi: Padraig is noting the srcore bug which can cause srcore to loop in callbacks, which would indeed cause a hang. However, I am doubtful that this is the problem here. I also don't think you and Remus quite understand my observations. What I am seeing:
* neither gnopernicus nor at-spi-registryd is actually dying
* at-spi-registryd keeps running, but no longer forwards key events once the problem occurs (other than to gnopernicus). This suggests that gnopernicus' callbacks are failing to return.
I am not using gnopernicus HEAD; I will see if the problem has been fixed there in the meantime. I don't think the observed behavior with the modified simple-at is the same as what I am seeing and which led to the filing of this bug.
I do not believe that this bug depends on 108664.
OK, I am thinking Padraig was right. I think srcore on my system is not returning from one of its callbacks (at all, it's not just slow) until the '7' is pressed again or something similarly causes gnopernicus to break out of its screen review mode. You can see something similar with ./simple-at, but ONLY if key-repeat is on (in which case simple-at gets flooded with key events whose callbacks each take a long time to return). There may be some weirdness with keypress-without-keyrelease which can happen during key-repeat, which bears independent investigation in at-spi. But the core bug seems to be in gnopernicus.
The reason gnopernicus did not seem to be "stuck" is that the calls it makes in the callback are reentrant; therefore gnopernicus can continue to run and service other calls while waiting to return from this callback, but because at-spi-registryd is waiting for the callback to return, it cannot release the key events to the rest of the desktop. The bottom line is that the desktop will be 'locked' while an at-spi client's ALL_WINDOWS callback is being processed. In the case of this bug, pressing '7' a second time seems to unlock/interrupt the gnopernicus callback, releasing the keyboard from the user's perspective.
I am pretty sure Padraig is right: the root cause is bug #124108 in gnopernicus, which was fixed in HEAD. I will try to reproduce this with gnopernicus HEAD; bet I can't!
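The locking described above can be illustrated with a toy model in plain C. It makes no at-spi calls, and `toy_registry`, `toy_dispatch` and `consume_one_key` are hypothetical names. It assumes, as the comment says, that the registry calls the listener synchronously and only decides what to do with a key once the callback has returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a key listener callback: it returns true to
 * consume the key and false to let it through. */
typedef bool (*listener_fn)(int keycode, void *user_data);

typedef struct {
    listener_fn listener;
    void *user_data;
    int delivered_to_desktop;  /* keys passed through to applications */
    int consumed;              /* keys swallowed by the listener */
} toy_registry;

/* Synchronous dispatch: nothing reaches the desktop until the listener
 * has returned, so a slow (or looping) listener stalls every key. */
void toy_dispatch(toy_registry *reg, int keycode)
{
    assert(reg != NULL && reg->listener != NULL);
    bool consume = reg->listener(keycode, reg->user_data);  /* blocks here */
    if (consume)
        reg->consumed++;
    else
        reg->delivered_to_desktop++;
}

/* Example listener: consumes only the keycode stored in user_data. */
bool consume_one_key(int keycode, void *user_data)
{
    return keycode == *(const int *)user_data;
}
```

In this model the time spent inside `reg->listener(...)` is exactly the time during which the desktop sees no keyboard input, which is why a callback that never returns looks like a hung desktop even though both processes are alive.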
Yes, this _was_ a gnopernicus bug, in fact a duplicate of #124108 as Padraig surmised. Thanks Padraig for your assistance. *** This bug has been marked as a duplicate of 124108 ***
Bug is still reproducible in the following 3 cases:
1. in gnopernicus, under the following conditions:
=============================================
* at-spi, atk, gail from HEAD
* gnopernicus from HEAD
* erase the gconf settings for gnopernicus from your home (the "window hierachy flat" command was missing from the schemas and you won't find it in the UI with old gconf settings)
* build && install gnopernicus from HEAD
* check "Key pressed repeat when key is held down" (on my machine Delay is set to 497 and Speed has the value 30)
  Note: in order to reach this setting, open the "Keyboard Preferences" application:
  ** launch gnome-keyboard-properties in a terminal OR
  ** follow the GUI path from the desktop: Applications->Desktop Preferences->Keyboard
* enable speech in gnopernicus
  Note: Without speech enabled, the function that is executed for L03K07 returns immediately. The feature is speech-oriented:
<snip gnopernicus/srcore/srctrl.c:2629 (src_hierarchy_flat)>
...
SRObject *window;
if (!src_use_speech) //if there is no speech enabled there is nothing to do
    return FALSE;
if (!src_crt_sro)
    return FALSE;
...
</snip gnopernicus/srcore/srctrl.c:2629 (src_hierarchy_flat)>
* launch gnopernicus
* map the "window hierachy flat" command to CTRL+ALT+A
  ** Gnopernicus->2Preferences->6Command Mapping->User Defined->Add
  ** select Ctrl and Alt; choose A
  ** choose "window hierachy flat" from "Command List"
* put the focus on gnopernicus' main menu OR put the focus on the "Credits" dialog of the gnome-volume-control application (initial scenario)
* select layer 3
* press 7 on the numpad
Actual results:
---------------
- graphical focus disappears
- the notification for key 7 comes 3 times
- pressing key 7 again makes the graphical focus reappear
Expected results:
-----------------
- the notification for pressing key 7 from layer 3 appears 1 time (as when releasing CTRL+ALT+A). Note: gnopernicus listens for the "release" of a key, not for "press"
- graphical focus reappears without pressing the 7 key again (ideally it would NOT disappear at all, but we know that this is not possible at the current stage)
Additional information:
----------------------
- the results are the expected ones if one of the following conditions holds:
  * "Key pressed repeat when key is held down" (from gnome-keyboard-properties) is NOT checked
  * speech is NOT enabled in gnopernicus (the function returns immediately in this case)
  * CTRL+ALT+A is pressed to call the (src_hierarchy_flat) function
    ** note that CTRL+ALT+A is NOT registered with the SPI_KEYLISTENER_ALL_WINDOWS flag and SPI_KEYMASK_MOD2 and CONSUMED
    ** note that only one notification arrives
    ** note that graphical focus NEVER disappears
- if you apply the patch bug124837_1.diff (attached to this bug) you will see that:
  * the current bug has nothing to do with Accessible_isApplication (in the case of gnopernicus and gnome-volume-control/Credits this function returns TRUE as expected)
  * the function (src_hierarchy_flat) that is called for L03K07 returns (callbacks succeed in returning): see the messages that gnopernicus displays 3 times if you press key 7 from layer 3:
<snip from log obtained with gnopernicus patched with bug124837_1.diff>
(srctrl.c,2641) src_hierarchy_flat INIT
...
(srctrl.c,2704) src_hierarchy_flat TERMINATE
</snip from log obtained with gnopernicus patched with bug124837_1.diff>
2. in gnopernicus patched (bug124837_2.diff) for test purposes:
===============================================================
* at-spi, atk, gail from HEAD
* get gnopernicus from HEAD
* apply patch bug124837_2.diff (attached to this bug) to gnopernicus
* erase the gconf settings for gnopernicus from your home (the "window hierachy flat" command was missing from the schemas and you won't find it in the UI with old gconf settings)
* build && install the previously patched gnopernicus
* check "Key pressed repeat when key is held down" (on my machine Delay is set to 497 and Speed has the value 30)
  Note: in order to reach this setting, open the "Keyboard Preferences" application:
  ** launch gnome-keyboard-properties in a terminal OR
  ** follow the GUI path from the desktop: Applications->Desktop Preferences->Keyboard
* enable speech in gnopernicus
  Note: Without speech enabled, the function that is executed for L03K07 returns immediately. The feature is speech-oriented.
* launch gnopernicus
* map the "window hierachy flat" command to CTRL+ALT+A
  ** Gnopernicus->2Preferences->6Command Mapping->User Defined->Add
  ** select Ctrl and Alt; choose A
  ** choose "window hierachy flat" from "Command List"
* put the focus on gnopernicus' main menu OR put the focus on the "Credits" dialog of the gnome-volume-control application (initial scenario)
* select layer 3
* press 7 on the numpad
Actual results:
---------------
- graphical focus disappears
- the notification for key 7 comes 3 times
- pressing key 7 again makes the graphical focus reappear
Expected results:
-----------------
- the notification for pressing key 7 from layer 3 appears 1 time (as when releasing CTRL+ALT+A). Note: gnopernicus listens for the "release" of a key, not for "press"
- graphical focus reappears without pressing the 7 key again (ideally it would not disappear at all, but we know that this is not possible at the current stage)
Additional information:
----------------------
- the results are the expected ones if one of the following conditions holds:
  * "Key pressed repeat when key is held down" (from gnome-keyboard-properties) is not checked
  * speech is not enabled in gnopernicus (the function returns immediately)
  * CTRL+ALT+A is pressed to call the (src_hierarchy_flat) function
    ** note that CTRL+ALT+A is NOT registered with the SPI_KEYLISTENER_ALL_WINDOWS flag and SPI_KEYMASK_MOD2 and CONSUMED
    ** note that only one notification arrives
    ** note that graphical focus NEVER disappears
- the function (src_hierarchy_flat) that is called for L03K07, in the case of patch bug124837_2.diff, consists of:
<snip gnopernicus/srcore/srctrl.c:2631>
static gboolean
src_hierarchy_flat ()
{
    fprintf (stderr, "\n(%s,%d) src_hierarchy_flat INIT", __FILE__, __LINE__);
    sleep (1);
    fprintf (stderr, "\n(%s,%d) src_hierarchy_flat TERMINATE", __FILE__, __LINE__);
    return TRUE;
}
</snip gnopernicus/srcore/srctrl.c:2631>
- the function (src_hierarchy_flat) that is called for L03K07 returns (callbacks succeed in returning): see the messages that gnopernicus displays 3 times if you press key 7 from layer 3:
<snip from log obtained with gnopernicus patched with bug124837_2.diff>
(srctrl.c,2641) src_hierarchy_flat INIT
(srctrl.c,2704) src_hierarchy_flat TERMINATE
</snip from log obtained with gnopernicus patched with bug124837_2.diff>
3. in the simple-at test program (already attached to this bug: http://bugzilla.gnome.org/showattachment.cgi?attach_id=21057)
==============================================================
- please see "Additional Comments From Adi Dascal 2003-10-30 07:40" in this bug AND what the test program does.
I think that I have proven with these last comments that the issues I reported on 2003-10-30 (in this bug report) are the cause of this bug AND that the scenario from the test program is the same as the actual scenario from gnopernicus. So this bug can NOT be closed, and the issues are generated not by gnopernicus, but by at-spi.
Created attachment 21249 [details] [review] patch to apply for gnopernicus (for the first scenario )
Created attachment 21250 [details] [review] patch to apply to gnopernicus (second scenario)
The bug you are describing is not the bug I originally filed. The bug I originally filed is a gnopernicus bug that was fixed as reported by Padraig. The issue you are now reporting is worth investigating but I do not believe that it is reasonable to do so in this bug report, since the summary is not accurate. For reasons of preserving the history of the original bug I do not think we should change this bug's summary. I suggest filing a new bug, when a test program can be produced.
"the bug you are describing is not the bug I originally filed."
It is the same bug, as I've proven with "Additional Comments From Adi Dascal 2003-11-06 10:34".
"The bug I originally filed is a gnopernicus bug that was fixed as reported by Padraig."
That bug has nothing to do with this one. Please read "Additional Comments From Adi Dascal 2003-11-06 10:34".
"The issue you are now reporting is worth investigating but I do not believe that it is reasonable to do so in this bug report, since the summary is not accurate. For reasons of preserving the history of the original bug I do not think we should change this bug's summary. I suggest filing a new bug, when a test program can be produced."
More than one test program is attached to this bug. Please take a look at them.
Padraig: could you take a look and confirm that the problem originally filed is still reproducible, as I reported in "Additional Comments From Adi Dascal 2003-11-06 10:34"? Thank you!
Adi: The bug I originally filed concerned a hang which was caused by a gnopernicus loop. I do not agree that your report is the same issue, though the two problems may present similar symptoms in some cases. We really should be taking this to a new bug report. I have read your comments and the test program, thanks - but this should be a new bug report, not the original one.
Some comments: 1) we expect that if key-repeat is on, gnopernicus will get multiple key-pressed notifications. I think gnopernicus should be able to handle this case gracefully. 2) are you saying that multiple key press notifications are being received even if the key is not held for the specified repeat period (i.e. less than the key-repeat delay?). If this is the case please clarify, that would indeed be something we need to look at - as a separate at-spi bug, however.
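Point 2 can be made concrete with a rough model of auto-repeat. This is only a sketch: it assumes the first repeat fires exactly at the repeat delay and further repeats follow at a fixed interval, and `expected_presses` is a hypothetical helper, not an X or at-spi function. Real servers may behave differently.

```c
#include <assert.h>

/* Toy model: number of key-press events produced by one physical hold.
 *   hold_ms     - how long the key is held down
 *   delay_ms    - auto-repeat delay before the first repeat
 *   interval_ms - time between repeats once repeating has started */
int expected_presses(int hold_ms, int delay_ms, int interval_ms)
{
    assert(hold_ms >= 0 && delay_ms >= 0 && interval_ms > 0);
    if (hold_ms < delay_ms)
        return 1;                       /* released before repeat starts */
    /* the initial press, the first repeat at delay_ms, then one more
     * repeat per full interval that elapses afterwards */
    return 2 + (hold_ms - delay_ms) / interval_ms;
}
```

With the delay of 497 reported earlier in this thread and an assumed interval of 33 ms (a guess at what a Speed of 30 means), a 100 ms tap gives `expected_presses(100, 497, 33) == 1`. Under this model, three notifications for a short tap would not be explained by ordinary key repeat.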
since turning off key repeat suppresses this bug (as currently reported), the severity should be reduced.
*** Bug 127883 has been marked as a duplicate of this bug. ***
this is still IMO a gnopernicus bug, with a probably less-severe at-spi issue.
I confirm that with Adi's patch from 2003-10-30 the desktop has the same behaviour as described in the bug. After compiling the test program, launch simple-at and gedit. Press NumLock if it is off (it SHOULD be ON). Press the m key. For the first press of the key, 3 events are received by the test program. After that the desktop seems to freeze. The user can now press any key but with no result. If the m key is pressed again, no event is received, but the desktop can be used again. So, for every odd press the desktop freezes and for every even press the desktop can be used again.
revised summary to reflect the generality of this bug. Please see 127883 and the above discussion for specific cases that trigger this bug.
Note that this test program: "simple-at test-program (already attached to this bug" "http://bugzilla.gnome.org/showattachment.cgi?attach_id=21057)" won't work as expected on Solaris since it is listening to MOD2MASK instead of NUMLOCK. It should attach numlock listeners to SPI_KEYMASK_NUMLOCK
I can now reproduce this on Solaris. I will see what I can see.
The behaviour I am observing on Solaris is slightly different from that reported by Remus on 4th December, but I have simplified simple-at so that the only event it registers for is NumLock+m. When I press the key a second time the desktop is not unfrozen. When the key is pressed, the function global_filter_fn is called for the KeyPress event. This event is consumed, so XAllowEvents is called with an event_mode of AsyncKeyboard. The next event is also a KeyPress event for the key. This event is also consumed and XAllowEvents is called with an event_mode of AsyncKeyboard. There follows a KeyRelease event. This causes XAllowEvents to be called with an event_mode of ReplayKeyboard. There then follows another KeyPress event for the key. This explains why the event is reported three times. I have not yet figured out why the KeyPress and KeyRelease events are occurring.
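The sequence described above can be restated as a small self-contained C sketch. It does not include X11 headers: `TOY_ASYNC_KEYBOARD` and `TOY_REPLAY_KEYBOARD` are local stand-ins for the AsyncKeyboard and ReplayKeyboard event modes of XAllowEvents, and `mode_for_event` and `count_presses` are hypothetical helpers, not the actual global_filter_fn logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the X11 event_mode names mentioned above; the real
 * constants come from <X11/X.h> and are passed to XAllowEvents(). */
typedef enum { TOY_ASYNC_KEYBOARD, TOY_REPLAY_KEYBOARD } toy_event_mode;
typedef enum { TOY_KEY_PRESS, TOY_KEY_RELEASE } toy_event_type;

/* As described in the comment: a consumed event leads to AsyncKeyboard,
 * an event that is not consumed leads to ReplayKeyboard. */
toy_event_mode mode_for_event(bool consumed)
{
    return consumed ? TOY_ASYNC_KEYBOARD : TOY_REPLAY_KEYBOARD;
}

/* Count the KeyPress events in an observed trace. */
int count_presses(const toy_event_type *trace, size_t n)
{
    assert(trace != NULL || n == 0);
    int presses = 0;
    for (size_t i = 0; i < n; i++)
        if (trace[i] == TOY_KEY_PRESS)
            presses++;
    return presses;
}
```

Feeding in the observed trace (press, press, release, press) gives three presses for one physical keystroke, which matches the three notifications reported earlier in the thread.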
I have an X based test program which works correctly. If I insert a call to sleep(1) before the call to XAllowEvents the program behaves in a similar way to at-spi-registryd.
It seems that the problem occurs when AutoRepeatMode is on. When dealing with the keystroke takes a long time, I am seeing extra events being added to the X event queue. I will attach a patch which seems to fix the problem.
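Why a slow handler matters here can be sketched with a toy queue model. It assumes the server keeps adding one repeat event every `interval_ms` for as long as the handler is busy; that assumption, the function name `backlog_after` and its parameters are illustrative only and are not taken from the at-spi patch.

```c
#include <assert.h>

/* Toy model: number of events still pending after `handled` handler
 * calls, when each call takes handler_ms and one repeat event is queued
 * every interval_ms while a handler is running. */
int backlog_after(int handled, int handler_ms, int interval_ms)
{
    assert(handled >= 0 && handler_ms >= 0 && interval_ms > 0);
    int queued = 1;          /* the initial key press */
    int carry_ms = 0;        /* time accumulated toward the next repeat */
    for (int i = 0; i < handled && queued > 0; i++) {
        queued--;                          /* take one event off the queue */
        carry_ms += handler_ms;            /* handler runs, time passes */
        queued += carry_ms / interval_ms;  /* repeats that arrived meanwhile */
        carry_ms %= interval_ms;
    }
    return queued;
}
```

In this model a 1000 ms handler with a 33 ms repeat interval leaves 30 events pending after the first call and 59 after the second, so the queue never drains, whereas a 10 ms handler empties it at once.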
Created attachment 22337 [details] [review] Patch for at-spi
It's surprising that key repeat events happen even after the physical key is released, when an active grab is held. Thanks Padraig for discovering this peculiar fact. If the patch appears to fix the problem please apply it. If you prefer to wait for me to apply and test on Linux/JDS then I will.
padraig: can you post your X test program?
Created attachment 22461 [details] [review] Test program
It seems that the reason this bug was causing the desktop to 'freeze' is because the repeated key events were getting caught by the passive grab, again and again.
Patch committed to CVS HEAD.