GNOME Bugzilla – Bug 608901
[gnomevfssrc] Problem using gnomevfssrc with ssh uri
Last modified: 2010-02-18 05:44:32 UTC
Created attachment 152933 [details] source code that reproduces the error.

I'm developing an application that makes heavy use of the gnomevfssrc plugin to load URLs from different protocols; most of the URLs are ssh. Everything was going fine until we started to notice odd behaviour in our tests: when the application receives a request to open the same ssh URL multiple times it gives a lot of warnings, and sometimes it even crashes.

Because of that odd behaviour I wrote a simple application that reproduces the problem. You pass a URL and the number of pipelines that should read that URL at the same time; each pipeline is just gnomevfssrc ! fakesink (with more complex pipelines the error is the same). With this test code, opening the same http address even a hundred times works fine, but opening the same ssh URL 3 or more times produces the error below. It is the same error I get in the real application; sometimes the output is longer, but it is the same warnings repeated multiple times before the crash:

(gnomevfs_ssh:5316): gnome-vfs-modules-CRITICAL **: Message length too long: 1761607680
(gnomevfs_ssh:5316): gnome-vfs-modules-WARNING **: ID mismatch (15 != 5)
(gnomevfs_ssh:5316): gnome-vfs-modules-CRITICAL **: Message length too long: 1728053248
(gnomevfs_ssh:5316): gnome-vfs-modules-CRITICAL **: Message length too long: 100663312
(gnomevfs_ssh:5316): gnome-vfs-modules-CRITICAL **: Could not read 1 bytes
(gnomevfs_ssh:5316): gnome-vfs-modules-CRITICAL **: Could not read 4 bytes
(gnomevfs_ssh:5316): gnome-vfs-modules-CRITICAL **: ID mismatch (4286513152 != 8)
(gnomevfs_ssh:5316): gnome-vfs-modules-CRITICAL **: Expected SSH2_FXP_HANDLE(102) packet, got 0
(gnomevfs_ssh:5316): gnome-vfs-modules-CRITICAL **: Could not read 4 bytes
GLib-ERROR **: /build/buildd/glib2.0-2.22.3/glib/gmem.c:136: failed to allocate 18446744073701097472 bytes
aborting...
Aborted

I don't know if I am doing something wrong, but I assume I am allowed to open the same URL multiple times to do different processing on it, so this seems like a bug in gnomevfssrc. The fact that the file and http protocols work fine indicates that the problem is specific to gnomevfssrc with ssh (I didn't test other protocols such as smb or ftp, but the test code can be used with any of them). When the test is run with only one pipeline reading the ssh URL it works just fine, with no warnings or errors; with 2 or more, things go ugly.

The tests were run on Ubuntu 9.10, 32-bit and 64-bit, with gstreamer 0.10.25 (same error on both). The source code of the test can be found at https://svn.inf.ufsc.br/katcipis/c/gstreamer/gnomevfs_ssh_bug/ or in the attachment.
Created attachment 152959 [details] pure gnomevfs crash test

I did some debugging (with some help at work) and we found out that the problem is in the function "gst_gnome_vfs_src_get_size", which uses the file info facilities of gnomevfs. We started to test gnomevfs alone and were able to determine that just opening and reading the same ssh URI multiple times from multiple threads works fine. But if you read and also query the info of the same ssh URI from multiple threads, it crashes. The main difference from the gstreamer case is that sometimes it crashes like gstreamer does and sometimes it deadlocks. If you start something like 30 worker threads it always gives "GLib-ERROR **: /build/buildd/glib2.0-2.22.3/glib/gmem.c:175: failed to allocate [lot of] bytes", and a lot of protocol errors and I/O errors get printed.

It looks like gnomevfs has a thread-safety problem when you request info about the same ssh URI from multiple threads (other protocols seem to work fine).

Source code without info that works fine: https://svn.inf.ufsc.br/katcipis/c/gnomevfs_bug/ok_test.c
Source code with info that crashes: https://svn.inf.ufsc.br/katcipis/c/gnomevfs_bug/crash_test.c
Source code with info and with mutex that works fine: https://svn.inf.ufsc.br/katcipis/c/gnomevfs_bug/ok_mutex_test.c
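For readers without access to the linked test files, the mutex workaround can be sketched roughly as follows. This is not the code from ok_mutex_test.c: it uses plain POSIX threads rather than GLib, and fake_get_info, shared_scratch and run_workers are hypothetical names invented for illustration. fake_get_info only stands in for a non-thread-safe info query; it does not call gnomevfs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_WORKERS 64

/* Hypothetical shared state touched by the info query. It stands in for
 * whatever per-connection state makes the real query unsafe. */
static long shared_scratch;
static long consistent_queries;

/* One lock that serializes every info query. */
static pthread_mutex_t info_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for a non-thread-safe info query (NOT a gnomevfs call). */
static void fake_get_info(long id)
{
    shared_scratch = id;
    /* Window in which an unlocked concurrent caller could overwrite
     * shared_scratch. */
    for (volatile int i = 0; i < 1000; i++) { }
    if (shared_scratch == id)
        consistent_queries++;
}

static void *worker(void *arg)
{
    long id = (long)(intptr_t)arg;

    pthread_mutex_lock(&info_lock);
    fake_get_info(id);
    pthread_mutex_unlock(&info_lock);
    return NULL;
}

/* Starts n worker threads that all query info at the same time and
 * returns how many queries saw consistent state. */
static long run_workers(int n)
{
    pthread_t threads[MAX_WORKERS];
    int started = 0;

    assert(n >= 0 && n <= MAX_WORKERS);
    consistent_queries = 0;

    for (int i = 0; i < n; i++) {
        if (pthread_create(&threads[started], NULL, worker,
                           (void *)(intptr_t)(i + 1)) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return consistent_queries;
}
```

Calling run_workers(30) mirrors the 30-worker case mentioned above: because every query holds info_lock, each one should see its own value in shared_scratch. The cost is that all info queries are serialized, whatever protocol they use.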
Maybe adding a lock around the gnomevfs calls would work then.
Is there any reason you are not using giosrc for this? gnome-vfs is known to misbehave in multiple ways and basically unmaintained, as far as I know (not that we won't apply patches if there's any easy fix on our side, but I think there are just a lot of issues that are not fixable easily or at all).
We modified gnomevfssrc and now it works fine, but it uses locks around the gnomevfs calls, as Wim said. The problem with this solution is that it adds overhead for all protocols because of a bug that only affects ssh. If you think using locks to solve this could be useful, I can make a patch of our modifications and send it here. I know gnomevfs is deprecated, but we had several problems installing gio support on our development machines, whereas gnomevfs was pretty easy to compile and install. And even though it is deprecated, gnomevfs is part of the gstreamer base plugins, so it is expected to work fine.
Created attachment 153023 [details] source code of the fixed gnomevfssrc

This is the modified gnomevfssrc.c that works fine. Since the problem ONLY happens when you get info about the same ssh URI concurrently, we used a read/write lock (http://library.gnome.org/devel/glib/stable/glib-Threads.html#GStaticRWLock): all operations can run in parallel, except while someone is getting info about the URI. This is good enough for us. We would like to fix it directly in gnomevfs, but at least now it works :-).
Could you attach your changes as a diff? Using a GStaticRWLock is probably not a good idea, the overhead of RW locks is not worth the lower contention unless you really have a lot of contention. I think in this case a normal GMutex would be better.
Created attachment 153093 [details] [review] Changes diff

We thought that the RW lock would be a good idea to avoid contention, but I'm not very experienced with GLib locks. If the mutex is faster, this can be changed easily.
Thanks, not sure if we should use that patch though. Better fix this in gnomevfs, especially because it's only with the SSH module. Using your patch will add a global lock around all gnomevfs uses... which could lead to noticeable problems: Think of two gnomevfssrc, one is already running and all that. The other one is now started and blocks because of networking (e.g. it's a HTTP resource). The first gnomevfssrc will then stop working until the second one has finished the read.
I agree, Sebastian. When Wim mentioned using locks I showed our solution using locks, but I didn't think that you guys would actually consider using it :-). I know it is not a good solution and it would be better to fix it in gnomevfs instead of "forcing" it to work using locks. And since the problem only happens with ssh, it makes even less sense to solve it this way. The product we are developing only uses ssh, so we are fixing it that way until this gets fixed in gnomevfs (I reported a bug against gnomevfs too, but I don't have time to work on gnomevfs and try to fix it myself :-(, so I'm depending on the good will of the gnomevfs maintainers). At least everyone now knows this is an issue and a "way" of fixing it :-). And sorry for my bad English ;-)

Best regards,
Katcipis
I fixed this bug in gnome-vfs. For details, see the bug: https://bugzilla.gnome.org/show_bug.cgi?id=609007 Now, let's wait for the next gnome-vfs release.
Ok, then let's close this as duplicate of the gnomevfs bug :) *** This bug has been marked as a duplicate of bug 609007 ***