Bug 425899 - Wild clock shifts cause mouse clicks to stay as "move" (implement _NET_WM_MOVERESIZE_CANCEL?)
Status: RESOLVED FIXED
Product: metacity
Classification: Other
Component: general
Version: 2.16.x
Hardware: Other Linux
Importance: High normal
Target Milestone: ---
Assigned To: Metacity maintainers list
Reported: 2007-04-03 16:19 UTC by James Cape
Modified: 2010-05-19 13:32 UTC (History)

See Also:
GNOME target: ---
GNOME version: 2.15/2.16


Attachments
script to jump the clock back and forth (356 bytes, text/plain)
2007-04-16 12:22 UTC, Robert Williams
Add support for _NET_WM_MOVERESIZE_CANCEL (1.15 KB, patch)
2007-04-24 21:32 UTC, Elijah Newren
committed

Description James Cape 2007-04-03 16:19:02 UTC
Using:

RHEL 5 Client (Xorg 7.1, GNOME 2.16) + metacity 2.16.5 from GNOME.org.

Behavior:

Clicking on a titlebar makes the move cursor appear and stay on (even if it's a single click, not a drag) but the window does not move (even if it's a legitimate drag instead of a single-click).

This may be related to a badly configured NTP, which can cause repeated clock swings of up to 800+ ms once a minute. If this happens in proximity to (or possibly during) a window being moved via DnD, Metacity seems to lose its ability to properly recognize what the user is trying to do from then on.

I haven't done the best-practice thing and tested it in 2.18, nor have I tried to reproduce it with a script that shifts the clock around.
Comment 1 James Cape 2007-04-13 13:01:09 UTC
This issue occurred again after NTP was fixed, and this message appeared in .xsession-errors (possibly unrelated):

"Window manager warning: Received a _NET_WM_MOVERESIZE message for 0x3400003 (Configurat); these messages lack timestamps and therefore suck."

The only applications running at the time were the GNOME panel and a custom Java application running under JPackage RPMS of JRE 1.5.0_11 on RHEL5.
Comment 2 Elijah Newren 2007-04-13 19:02:47 UTC
Does this Java application draw its own window frame/decorations/whatever-you-want-to-call-it?  (You can usually tell because the window's frame doesn't match the metacity theme on other windows, such as you see with XMMS or Xine)

_NET_WM_MOVERESIZE is used by clients who try to draw their own decorations (i.e. crappy, sucky apps) but want the window manager to still control moving their windows around when the user clicks on those decorations.  That warning message was added by me, because I was trying to fix race conditions with timestamps and simply couldn't do so in this case.  But timestamps aren't the only race condition possible with those messages.  Check out this other comment in the code that I think was written by Havoc, who from context was thinking about something other than timestamps:

              /* The race conditions in this _NET_WM_MOVERESIZE thing
               * are mind-boggling
               */

So, if you could verify whether this is happening when clicking on such window frames (or whether you can duplicate it when clicking on a normal window, such as gedit or gnome-terminal), that would help.

The only other bug report we have in here that I can think of being similar is bug 304430, which we recently may have fixed.  It'd be nice if you could try cvs head (should build under GNOME 2.16.5 just fine if you have e.g. gtk+ development libraries from RH installed).
Comment 3 Robert Williams 2007-04-13 19:30:37 UTC
This may be related: quoting wm-spec 1.4draft2
http://standards.freedesktop.org/wm-spec/latest/ar01s04.html#id2526932

        #define _NET_WM_MOVERESIZE_CANCEL           11   /* cancel operation */
        
        The Client MUST release all grabs prior to sending such message 
        (except for the _NET_WM_MOVERESIZE_CANCEL message).

        The Window Manager can use the button field to determine the 
        events on which it terminates the operation initiated by the
        _NET_WM_MOVERESIZE message. Since there is a race condition 
        between a client sending the _NET_WM_MOVERESIZE message and the
        user releasing the button, Window Managers are advised to offer
        some other means to terminate the operation, e.g. by pressing the
        ESC key. The special value _NET_WM_MOVERESIZE_CANCEL also allows 
        clients to cancel the operation by sending such message if they
        detect the release themselves (clients should send it if they get 
        the button release after sending the move resize message,
        indicating that the WM did not get a grab in time to get the 
        release).
Comment 4 Robert Williams 2007-04-13 19:32:17 UTC
Also potentially related http://bugs.kde.org/show_bug.cgi?id=101468

I tried to reproduce the bug by jittering the clock back and forth and wasn't able to do so. It may be more related to high load than to the clock jitter.
Comment 5 Robert Williams 2007-04-13 19:37:14 UTC
Sorry to spam, but here's more, _NET_WM_MOVERESIZE_CANCEL comments by Havoc Pennington http://osdir.com/ml/gnome.wm-spec/2005-12/msg00003.html

metacity doesn't have any mention of _NET_WM_MOVERESIZE_CANCEL; it's not defined in src/window.c up to 2.19.2, where the rest of them are defined:

#define _NET_WM_MOVERESIZE_SIZE_TOPLEFT      0
#define _NET_WM_MOVERESIZE_SIZE_TOP          1
#define _NET_WM_MOVERESIZE_SIZE_TOPRIGHT     2
#define _NET_WM_MOVERESIZE_SIZE_RIGHT        3
#define _NET_WM_MOVERESIZE_SIZE_BOTTOMRIGHT  4
#define _NET_WM_MOVERESIZE_SIZE_BOTTOM       5
#define _NET_WM_MOVERESIZE_SIZE_BOTTOMLEFT   6
#define _NET_WM_MOVERESIZE_SIZE_LEFT         7
#define _NET_WM_MOVERESIZE_MOVE              8
#define _NET_WM_MOVERESIZE_SIZE_KEYBOARD     9
#define _NET_WM_MOVERESIZE_MOVE_KEYBOARD    10
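
For illustration only, here is a minimal sketch of how a window manager might treat the missing value 11 alongside the existing ones. This is not the metacity code and not the patch attached to this bug; the MoveResizeAction enum and the moveresize_action() function are hypothetical names, and only the numeric values come from the wm-spec text quoted in comment 3 and the list above.

```c
#include <stdbool.h>

/* Values from the wm-spec; 0..10 are the ones already listed above. */
#define _NET_WM_MOVERESIZE_MOVE_KEYBOARD    10
#define _NET_WM_MOVERESIZE_CANCEL           11   /* cancel operation */

/* Hypothetical result type: what the WM should do with the message. */
typedef enum
{
  ACTION_IGNORE,      /* unknown value, or nothing to cancel */
  ACTION_BEGIN_GRAB,  /* start a user-driven move/resize */
  ACTION_END_GRAB     /* abort the move/resize in progress */
} MoveResizeAction;

/* Decide how to react to the direction field of a _NET_WM_MOVERESIZE
 * message, given whether a move/resize grab is currently active.
 */
MoveResizeAction
moveresize_action (long direction, bool grab_in_progress)
{
  if (direction == _NET_WM_MOVERESIZE_CANCEL)
    return grab_in_progress ? ACTION_END_GRAB : ACTION_IGNORE;

  if (direction >= 0 && direction <= _NET_WM_MOVERESIZE_MOVE_KEYBOARD)
    return ACTION_BEGIN_GRAB;

  return ACTION_IGNORE;
}
```

The point of the sketch is only that a cancel arriving while no grab is active is harmless, while a cancel during a grab should release it, which is the escape hatch the spec describes for the missed button release.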
Comment 6 Elijah Newren 2007-04-13 20:30:44 UTC
Oh wow, I somehow missed or forgot those wm-spec-list discussions about _NET_WM_MOVERESIZE_CANCEL.  If fixing this just means implementing _NET_WM_MOVERESIZE_CANCEL, it shouldn't be too hard at all.  Of course, we're assuming that it is a _NET_WM_MOVERESIZE related problem.  While that certainly sounds likely right now, it'd be nice to verify.  James?
Comment 7 Robert Williams 2007-04-13 20:47:21 UTC
The custom Java application that we run on these workstations does not draw its own decorations, but it does create multiple windows using JFC/Swing and then resizes them at startup.
Comment 8 Elijah Newren 2007-04-13 21:13:26 UTC
That sounds different.  _NET_WM_MOVERESIZE is used by an app to ask the window manager to start a _user-involved_ moving/resizing action (usually because the user clicked on the application's pseudo-frame); the action won't end until the user releases the mouse button (or a _NET_WM_MOVERESIZE_CANCEL is sent, but that isn't supported in metacity yet).

If an app is going to resize a bunch of windows at startup, it'd be more likely to use ConfigureRequest events or _NET_MOVERESIZE_WINDOW (both of which specify to move/resize a window to a given final configuration without user involvement).
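
To make the difference concrete, here is a rough sketch of the five data slots a client fills in for _NET_WM_MOVERESIZE, following the field order given in the wm-spec (x_root, y_root, direction, button, source indication). The fill_moveresize_data() helper is a hypothetical name and deliberately avoids Xlib so it stands alone; a real client would place these values in the data.l[] array of an X client message.

```c
/* Fill the five data slots of a _NET_WM_MOVERESIZE client message.
 * Slot order follows the wm-spec:
 *   [0] x_root, [1] y_root  pointer position when the button was pressed
 *   [2] direction           one of the _NET_WM_MOVERESIZE_* values
 *   [3] button              the button that was pressed
 *   [4] source indication   1 = normal application, 2 = pager
 * Hypothetical helper for illustration, not taken from any toolkit.
 */
void
fill_moveresize_data (long data[5],
                      long x_root,
                      long y_root,
                      long direction,
                      long button,
                      long source)
{
  data[0] = x_root;
  data[1] = y_root;
  data[2] = direction;
  data[3] = button;
  data[4] = source;
}
```

Note that the message carries a pointer position and a button but no target geometry: it starts an interactive operation rather than describing a final size. _NET_MOVERESIZE_WINDOW, by contrast, carries gravity/flags, x, y, width and height.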


So, just so I understand, are you two coworkers who are both working on this same problem, or are you two just independent users who have run into issues that look the same?  (If the latter, perhaps there are two different problems involved...)
Comment 9 Robert Williams 2007-04-13 21:15:53 UTC
We are working on the same problem.
Comment 10 Elijah Newren 2007-04-13 21:29:49 UTC
Okay, can you run

  metacity --replace

in a terminal and then try to duplicate?  If you see the warning

  Window manager warning: Received a _NET_WM_MOVERESIZE message for
  0x<some-number> (<some-window-name>); these messages lack timestamps
  and therefore suck.

at the same time you trigger the bug, then we can conclude it's _NET_WM_MOVERESIZE related.  Otherwise, we'll need to look elsewhere.
Comment 11 Robert Williams 2007-04-16 12:21:27 UTC
I was able to reproduce the problem, or at least a portion of it, on our dual-core 64-bit RHEL5 workstation with metacity upgraded to 2.16.5. However, I do not see any _NET_WM_MOVERESIZE warnings in .xsession-errors.

I ran 2 instances of "cat /dev/urandom >/dev/null" as well as a shell script to jump the time back and forth (see attached jitter.sh) and left the workstation running overnight.

This morning, no window had focus, no mouse events were having any effect.

There were a lot of warnings regarding inaccurate timestamps, as expected, such as:

  Window manager warning: last_user_time (3950369695) is greater
  than comparison timestamp (3950358644).  This most likely represents
  a buggy client sending inaccurate timestamps in messages such as 
  _NET_ACTIVE_WINDOW.  Trying to work around...

  Window manager warning: 0x321bdff (username00) appears to be one
  of the offending windows with a timestamp of 3950369695.
  Working around...
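
As background for the warning above: X server timestamps are 32-bit millisecond counters that wrap around, so "greater than" has to be evaluated modulo 2^32. The following is a minimal sketch of such a wrap-aware comparison; the function name xserver_time_is_before() is an assumption for this example and the code is not copied from metacity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if t1 is earlier than t2, treating both as 32-bit
 * millisecond counters that wrap around. Unsigned subtraction gives
 * the forward distance from t1 to t2 modulo 2^32; t1 counts as
 * "before" t2 when that distance is non-zero and less than half the
 * counter range.
 */
bool
xserver_time_is_before (uint32_t t1, uint32_t t2)
{
  uint32_t forward = t2 - t1;

  return forward != 0 && forward < UINT32_MAX / 2;
}
```

With the two values from the log, the comparison timestamp 3950358644 is before last_user_time 3950369695 by 11051 ms, which is the condition the warning reports.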

The keyboard still worked, as I have the <Alt>X keyboard shortcut mapped to open gnome-terminal. The new window came up with focus, but the cursor was hollowed out and no mouse events or key presses were working.

I could not switch focus with the mouse, however, I did switch focus with keyboard (<Alt>Tab) and everything went back to normal.

Comment 12 Robert Williams 2007-04-16 12:22:13 UTC
Created attachment 86424
script to jump the clock back and forth
Comment 13 Robert Williams 2007-04-16 12:25:03 UTC
I should point out that this may not be exactly the same problem as initially reported, but perhaps it's related.
Comment 14 James Cape 2007-04-16 14:03:37 UTC
Note: the cursor from comment 11 was the character-position block drawn by g-t, not the mouse cursor.
Comment 15 Robert Williams 2007-04-23 15:15:30 UTC
The original problem happened twice this morning. Since the application that's running on this workstation is critical and we have to give the user control of it ASAP, we generally cannot sit around and get a lot of information out of the situation.

I did collect a brief strace of metacity after it occurred, and can email it on request.

Any suggestions on what information to collect when this occurs?  There is a lot of pressure to switch the window manager, so we may not be able to help troubleshoot this in the future.
Comment 16 Elijah Newren 2007-04-24 21:32:03 UTC
Created attachment 86952
Add support for _NET_WM_MOVERESIZE_CANCEL

While I'm not sure if this is _NET_WM_MOVERESIZE_CANCEL related or not, it could be and it was pretty easy to add support for this, so I cooked up a patch.  Don't have a good program around to test with.  I guess I could cook one up, but I'm starting to run short on time...
Comment 17 James Cape 2007-05-03 18:39:19 UTC
The issue is still occurring on 2.18.2 with the patch from attachment 86952.

We experienced the issue again with meta_topic debugging enabled. I'll mail the end of the log privately after the window titles have been renamed.

Comment 18 Thomas Thurman 2009-01-25 23:00:46 UTC
Patch looks sane; I'll write a veracity test for it later, but for now it's small enough I'm just putting it into trunk.
http://svn.gnome.org/viewvc/metacity?rev=4088&view=rev

James: is this still a problem for you?
Comment 19 James Cape 2009-01-26 14:33:57 UTC
We switched x86_64 to i386 shortly after the last message, and the issue has only repeated itself once more since then (have not updated since original report). So the bug is still in there, but masked nearly perfectly when running i386.
Comment 20 Thomas Thurman 2010-05-19 13:32:09 UTC
James: sorry to reply after so long, but did the problem re-occur on i386 before you used a version of Metacity with the above patch included?

I'm closing this as FIXED for now, but if it did re-occur, please re-open the bug.
