GNOME Bugzilla – Bug 751414
File descriptor leak in gdk_wayland_selection_request_target()
Last modified: 2015-06-24 21:54:41 UTC
I discovered that gdk_wayland_selection_request_target() does not close() wayland_selection->stored_selection.fd before assigning a new fd to it. A malicious Wayland client can trick a user into dragging data to it from a GTK+ app, and then cause the GTK+ app to leak an arbitrary number of file descriptors up to its limit by calling wl_data_offer_receive() in a loop. This probably also work against any GTK+ app that has placed data in the clipboard, though I didn't test that. I'll attach the trivial fix.
Created attachment 305963 [details] [review] GdkSelectionWayland: Fix file descriptor leak I discovered that gdk_wayland_selection_request_target() does not close() wayland_selection->stored_selection.fd before assigning a new fd to it. A malicious Wayland client can trick a user into dragging data to it from a GTK+ app, and then cause the GTK+ app to leak an arbitrary number of file descriptors up to its limit by calling wl_data_offer_receive() in a loop. This probably also works against any GTK+ app that has placed data in the clipboard, though I didn't test that.
Comment on attachment 305963 [details] [review] GdkSelectionWayland: Fix file descriptor leak This fix is wrong; the fd is supposed to be closed by a call to async_write_data_free(). I need to study this more.
Doh, this is something I noticed on wip/wayland-dnd-actions, but forgot to take into a separate commit. Please let me know if the following patch fixes it for you.
Created attachment 305991 [details] [review] wayland: Properly initialize/finalize where we store the data_source.send fd The fd must be closed on async_write_data_free(), but we should also initialize it to -1 so gdk_wayland_selection_check_write() doesn't wrongly pick the stdin fd.
Attachment 305991 [details] pushed as 25885ca - wayland: Properly initialize/finalize where we store the data_source.send fd
Unfortunately your patch is wrong too; in my tests, GTK+ with your patch (which went into master this morning) leaks roughly half as many fds as it did previously, whereas my patch plugs the leak completely. (But don't apply my patch, it probably breaks things worse, I need to study this more still. :) We should not close() the fd manually because we pass TRUE as the second argument to g_unix_output_stream_new(). So actually, this may not be a leak per se, but a protocol flaw. Perhaps GTK+ will eventually free the fds if the other client ever reads the data from the pipe (which my attacking client does not ever do).
By the way, I discovered this when attempting to debug why GTK+ never sends me drag data nor closes the write end of its pipe when I call wl_data_offer_receive(), unlike the weston-dnd demo app I had been developing against, so this probably isn't the only problem here.
A testcase would do wonders
Created attachment 306028 [details] Weston patch, for testing My test case was a heavily-hacked up Chrome web browser, hence I didn't post it, but here is a simpler one (for this specific issue only) as a patch to Weston. Configure Weston with --enable-demo-clients-install and rebuild, then launch Weston, open gedit and weston-dnd, type some text into gedit, and drag it to weston-dnd. (Will probably work in GNOME on Wayland as well, but I didn't test that.) Easiest way to observe the open descriptors is to use gnome-system-monitor, right-click on the gedit process, and select Open Files.
We should probably try to define the expected behavior as well. I propose: "GTK+ should not create more pipes to the destination client than the number of MIME types it has offered."
(In reply to Michael Catanzaro from comment #10) > We should probably try to define the expected behavior as well. I propose: > "GTK+ should not create more pipes to the destination client than the number > of MIME types it has offered." The behavior is actually imposed by X-isms in GDK/GTK+, not more than one target per selection can be on the fly, selection requests on the low-level work by 1) emitting a GdkEventSelection and 2) the affected widget calling gdk_property_change for the selection atom. Newer requests should be cancelling previous ones, which I realize we don't always do on all paths where gdk_wayland_selection_check_write() is called. More patches coming, testing appreciated, although I'll try too with your weston hack.
Created attachment 306033 [details] [review] wayland: Do not close the descriptor on async_write_data_free() At the moment we create the AsyncWriteData, the ownership of the fd is granted to the GOutputStream, and the fd set to -1, so at this moment we're just silently getting EBADFD. This partially reverts 25885ca600fff, the initialization of .fd to -1 is valid and stays though.
Created attachment 306034 [details] [review] wayland: Ensure we cancel previous selection writes before starting one We weren't catching all the places where the AsyncWriteData operation should be cancelled, which could happen if we repeatedly request the same target on different fds.
I will test your patches now. Anyway, so now it's clear to me that GTK+ is not intended to handle the case where another client calls wl_data_offer_receive() multiple times in a row without reading the data. I was doing this to handle drops in Chrome (works with drags from weston-dnd, fails with drags from GTK+): * foreach offered MIME type Chrome could ever possibly accept * call wl_data_offer_receive() * Pass all received fds to another process via internal IPC (UNIX domain socket) * In the other process, read() the data from each fd The read() currently never completes (without your patches applied; I will test them now), presumably due to the missing cancel. (Note: I have to receive data in each possible MIME type even if it won't be used in order to hook up to Chrome's existing cross-platform drag-and-drop abstraction.) To work properly with GTK+, I will need to modify my code to do the following: * foreach offered MIME type Chrome could ever possibly accept * call wl_data_offer_receive() * Pass the received fd to the other process over internal IPC * In the other process, read() the data from each fd * Close the fd * Tell the original process it's safe to use wl_data_offer_receive() again Correct? Do you consider this a deficiency in GTK+ worth changing eventually? (But if so, we'd need a plan to prevent the drag destination from causing clients to use arbitrarily many fds.) Or would it be better to change the Wayland protocol documentation to indicate that wl_data_offer_receive() should not be used multiple times in a row without first closing the previously-received pipe?
Attachment #306033 [details] looks correct to me. Attachment #306034 [details] also looks correct, and causes my read() in Chrome to complete (although only one read() actually receives data, of course). However, the fd leak still occurs.
(In reply to Michael Catanzaro from comment #14) > > The read() currently never completes (without your patches applied; I will > test them now), presumably due to the missing cancel. (Note: I have to > receive data in each possible MIME type even if it won't be used in order to > hook up to Chrome's existing cross-platform drag-and-drop abstraction.) > > To work properly with GTK+, I will need to modify my code to do the > following: > > * foreach offered MIME type Chrome could ever possibly accept > * call wl_data_offer_receive() > * Pass the received fd to the other process over internal IPC > * In the other process, read() the data from each fd > * Close the fd > * Tell the original process it's safe to use wl_data_offer_receive() again > > Correct? Yeah, that'd be one way to work around the "1 request on the fly" limitation, is it too much of a change to?: * forward all mimetypes to the other process over internal IPC * In the other process, pick one and tell about it to the original process * call wl_data_offer_receive on that one, pass the fd. * In the other process, read() and close TBH I wonder how often does the web process end up using more than one mimetype, probably never? > > Do you consider this a deficiency in GTK+ worth changing eventually? (But if > so, we'd need a plan to prevent the drag destination from causing clients to > use arbitrarily many fds.) Or would it be better to change the Wayland > protocol documentation to indicate that wl_data_offer_receive() should not > be used multiple times in a row without first closing the > previously-received pipe? Selections and DnD APIs are indeed calling for a redesign, probably lowering everything to GDK and making it all GAsyncResult driven (sounds like a 4.x target though). However, on X11 the limitation is at the guts of the selections mechanism, so it's unclear to me whether we can safely support this across backends. (In reply to Michael Catanzaro from comment #15) > Attachment #306033 [details] looks correct to me. > > Attachment #306034 [details] also looks correct, and causes my read() in > Chrome to complete (although only one read() actually receives data, of > course). > > However, the fd leak still occurs. Ok, saw what's going on :). because the emission of the GdkEventSelection is handled on the main loop, requesting data through fds like in your weston hack makes a bunch of calls to gdk_wayland_selection_request_target(), but no final call to gdk_wayland_selection_check_write(), so we just replace the previous fd and leak it, a third patch coming.
Created attachment 306048 [details] [review] wayland: close() the selection fd if we didn't start writing yet If the other peer requests data too fast (too rare/unlikely though), we might receive multiple gdk_wayland_selection_request_target() calls with no ending gdk_wayland_selection_check_write(), in which case the fd is leaked as no GOutputStream was created to take over it.
Created attachment 306049 [details] [review] wayland: Ensure we close the fd on all error paths in data_source.send
Created attachment 306050 [details] [review] wayland: protect against unknown mimetype requests This oddly can be reproduced with weston+weston-dnd, when dragging anything from GTK+ into weston-dnd, it will insist on picking its custom application/x-wayland-dnd-flower mimetype, and this request forwarded by the compositor, even if GTK+ didn't announce it on its wl_data_source mimetype list. (What should probably happen here is that the request is silenced, and/or weston-dnd picks (null)) This should be harmless, we are leaking though the fd in that case, because the emission of GdkEventSelection on an unhandled mimetype results in NOP. In order to avoid this, we should check whether the mimetype is supported at all on the backend code and possibly close the fd, this involves storing these in the first place.
That was more patches than expected... With all these, I don't see any further fd leaks under valgrind.
I confirm this bug is fixed when I apply these five patches. Great, thanks! (In reply to Carlos Garnacho from comment #16) > Yeah, that'd be one way to work around the "1 request on the fly" > limitation, is it too much of a change to?: > > * forward all mimetypes to the other process over internal IPC > * In the other process, pick one and tell about it to the original process > * call wl_data_offer_receive on that one, pass the fd. > * In the other process, read() and close If that would work it would have been the first thing to try, but Chrome expects the data to already be present before it deciding which type of data to pick, unfortunately. > TBH I wonder how often does the web process end up using more than one > mimetype, probably never? Probably. > > Do you consider this a deficiency in GTK+ worth changing eventually? (But if > > so, we'd need a plan to prevent the drag destination from causing clients to > > use arbitrarily many fds.) Or would it be better to change the Wayland > > protocol documentation to indicate that wl_data_offer_receive() should not > > be used multiple times in a row without first closing the > > previously-received pipe? > > Selections and DnD APIs are indeed calling for a redesign, probably lowering > everything to GDK and making it all GAsyncResult driven (sounds like a 4.x > target though). > > However, on X11 the limitation is at the guts of the selections mechanism, > so it's unclear to me whether we can safely support this across backends. To be clear, the problems I see are: * If GTK+ were to allow handling multiple data transfers at once in the future, there would need to be policy either in GTK+ or in the compositor to ensure that the drop target cannot initiate an arbitrary number of simultaneous transfers. * If GTK+ continues to not handle multiple data transfers at once, it's arguably not implementing its end of the protocol properly. Since it works with the reference Weston client, I would expect it to work with GTK+ as well. But if there was upstream documentation to say "don't expect this to work" (the simplest solution, possibly the best solution) then there would be no problem. Whatever; I know now not to expect it to work. :)
(In reply to Michael Catanzaro from comment #21) > I confirm this bug is fixed when I apply these five patches. Great, thanks! Thanks for verifying! Will push now. > > (In reply to Carlos Garnacho from comment #16) > > Yeah, that'd be one way to work around the "1 request on the fly" > > limitation, is it too much of a change to?: > > > > * forward all mimetypes to the other process over internal IPC > > * In the other process, pick one and tell about it to the original process > > * call wl_data_offer_receive on that one, pass the fd. > > * In the other process, read() and close > > If that would work it would have been the first thing to try, but Chrome > expects the data to already be present before it deciding which type of data > to pick, unfortunately. I feared as much... Then there's no way around one-by-one requests and/or selection data caching ATM. > > To be clear, the problems I see are: > > * If GTK+ were to allow handling multiple data transfers at once in the > future, there would need to be policy either in GTK+ or in the compositor to > ensure that the drop target cannot initiate an arbitrary number of > simultaneous transfers. Agreed. > > * If GTK+ continues to not handle multiple data transfers at once, it's > arguably not implementing its end of the protocol properly. Since it works > with the reference Weston client, I would expect it to work with GTK+ as > well. But if there was upstream documentation to say "don't expect this to > work" (the simplest solution, possibly the best solution) then there would > be no problem. Oh, if it were the only unclear point in the DnD protocol. I found a few in the doing of the proposal for wayland DnD actions :/
Attachment 306033 [details] pushed as 69b5955 - wayland: Do not close the descriptor on async_write_data_free() Attachment 306034 [details] pushed as 5e71594 - wayland: Ensure we cancel previous selection writes before starting one Attachment 306048 [details] pushed as 3bd7b2a - wayland: close() the selection fd if we didn't start writing yet Attachment 306049 [details] pushed as fb266a8 - wayland: Ensure we close the fd on all error paths in data_source.send Attachment 306050 [details] pushed as 342db27 - wayland: protect against unknown mimetype requests