GNOME Bugzilla – Bug 131959
tar method won't open root directory of archive
Last modified: 2008-09-06 19:17:38 UTC
I haven't seen any documentation in gnome-vfs that says how to encode the root directory of an archive in a URI, but empirically either "foo.tar#tar:" or "foo.tar#tar:/" should work (gnome-vfs expands the former to the latter before giving the URI to the module's method). The tar method handles neither of these cases.
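To make the two URI forms concrete, here is a minimal sketch of splitting a chained URI into its parent, method, and inner path. It is not gnome-vfs's actual parser: `split_chained_uri` and `struct chained_uri` are hypothetical names, the buffer sizes are arbitrary, and splitting at the last '#' is an assumption. The only behavior taken from the report is that an empty inner path is treated like "/".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the three parts of "parent#method:path". */
struct chained_uri {
    char parent[256];
    char method[32];
    char path[256];
};

/* Hypothetical splitter (not the gnome-vfs parser). Splits at the last '#'
 * and the first ':' after it. An empty inner path is expanded to "/".
 * Returns 0 on success, -1 if the URI is malformed or a part is too long. */
static int
split_chained_uri (const char *uri, struct chained_uri *out)
{
    const char *hash, *colon, *inner;
    size_t parent_len, method_len;

    if (uri == NULL || out == NULL)
        return -1;

    hash = strrchr (uri, '#');
    if (hash == NULL)
        return -1;
    colon = strchr (hash + 1, ':');
    if (colon == NULL)
        return -1;

    parent_len = (size_t) (hash - uri);
    method_len = (size_t) (colon - (hash + 1));
    inner = (colon[1] != '\0') ? colon + 1 : "/";

    if (parent_len >= sizeof out->parent
        || method_len == 0
        || method_len >= sizeof out->method
        || strlen (inner) >= sizeof out->path)
        return -1;

    memcpy (out->parent, uri, parent_len);
    out->parent[parent_len] = '\0';
    memcpy (out->method, hash + 1, method_len);
    out->method[method_len] = '\0';
    strcpy (out->path, inner);
    return 0;
}
```

With this sketch, "/foo.tar#tar:" and "/foo.tar#tar:/" both yield the inner path "/", which is the value the tar method would then have to recognize as the archive root.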
Created attachment 23538: fixes problem - test with "gnomevfs-ls /foo.tar#tar:", etc., before & after
Hmm, the tar-method seems to use some strange indentation. How well does it work anyway? I bet it would be better if converted to a daemon-side method. That way you'd know you have a glib mainloop and could use e.g. timeouts for cache handling, plus all apps would share the same cached data. We do need a way for the method cache to keep the daemon alive a bit longer after the last client quits when there is a cache, though.

The tar method seems to think uri->text being NULL means that's the root, but I think that may have changed at some point. This makes me doubt this part of the patch:

  if (uri->text)
-     node = tree_lookup_entry (tar->info_tree, uri->text);
+     tree_lookup_entry (tar->info_tree, uri->text, &node);
  else

Shouldn't the else case handle is_root == TRUE?

Anyway, we're currently in a feature freeze for GNOME 2.6, so I'd like to focus on bugs in stuff that is supposed to work. It's very nice that someone is interested in URI chaining, but I think it'll have to wait for 2.7 for integration in CVS. That shouldn't stop you from working on it if you want, though.
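One way to remove the ambiguity about NULL versus "/" is to route every root check through a single predicate. The sketch below is only an illustration: `is_archive_root` is a hypothetical helper, not part of tar-method.c, and treating NULL, the empty string, and any run of slashes as the root is an assumption about what the method should accept.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the path inside the archive
 * refers to the archive root. NULL, "", "/", "//", ... all count as root,
 * so callers do not depend on which form gnome-vfs happens to pass. */
static bool
is_archive_root (const char *text)
{
    if (text == NULL)
        return true;

    /* Skip leading slashes; the root has nothing after them. */
    while (*text == '/')
        text++;

    return *text == '\0';
}
```

Both do_get_file_info() and do_open_directory() could then branch on this one check instead of testing uri->text directly.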
Thanks for catching the do_get_file_info() inconsistency. I'll get to the bottom of whether or not NULL is still a valid value for gnome-vfs to generate, and fix it the right way in the next patch, along with the case in do_open_directory(). Leaving that ambiguity in is only going to cause more mistakes.

Regarding the general design, things I've noticed:
- Caching the entire tarfile to RAM on open() is a scalability problem.
- Synchronous read of the entire file on open() isn't expected by clients and can take too long (i.e., an asynchronous main loop is nice, like you said).
- Doesn't implement write() yet, which is a problem for general use.

I don't know much about the daemon - haven't really looked at HEAD, I've been using tarballs - I'll remedy that soon. Thinking off the top of my head, it'd be interesting to treat the daemon's cache as a garbage-collected pool. Maybe that's overkill, but if enough apps use caching gnome-vfs methods, you want to be really careful about keeping such stuff in memory, even when the files are opened. Seems like there needs to be a layer to cache files or fragments of files in memory, sitting on top of local files or tmpfile() to cache remote files.

P.S. Yeah - the indentation looks like you need to use a negative hanging brace offset to get the contents of the file to match the editor magic at the top. I suppose it's not GNOME standard either way. Also, I'm in no rush to get changes in, I'm just biting off problems to have a practical reason to learn the GNOME toolkit.
I found at least one more problem in the original code - it makes some assumptions that lead to a segfault in do_get_info(). I'm going to think about the requirements for a bit before doing any more hacking. tar has many different variants, and I'm not sure it really makes sense to try to parse them all. The latest alpha GNU tar appears to support them all, but it appears that the default GNU tar format itself changes over time, as part of a long-term effort to generate POSIX tar files. I think maybe the right thing to do is to fork a tar process to extract an archive to a temp directory. That would let tar keep all the knowledge, and let gnome-vfs adjust its encoding/decoding behavior as the tar binary associated with it changes. I'll look into this in the context of the new daemon this next week.
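As a small illustration of the variant problem, the header formats can at least be told apart by the magic field at byte offset 257 of a 512-byte header block: POSIX ustar stores "ustar" followed by a NUL and the version "00", while the old GNU format stores "ustar" followed by two spaces and a NUL. The sketch below only classifies a header this way; `classify_tar_header` and the enum are hypothetical names, and this is not how tar-method.c works.

```c
#include <string.h>

enum tar_flavor {
    TAR_V7_OR_UNKNOWN,  /* no recognized magic (e.g. old V7 headers) */
    TAR_POSIX_USTAR,    /* magic "ustar\0", version "00" */
    TAR_OLD_GNU         /* magic and version together: "ustar  \0" */
};

/* Hypothetical classifier: inspects the 8 bytes at offset 257, which hold
 * the magic and version fields of a 512-byte tar header block. */
static enum tar_flavor
classify_tar_header (const unsigned char header[512])
{
    const unsigned char *magic = header + 257;

    /* The literal is split so that "\0" is not parsed as the octal
     * escape "\000". */
    if (memcmp (magic, "ustar\0" "00", 8) == 0)
        return TAR_POSIX_USTAR;
    if (memcmp (magic, "ustar  \0", 8) == 0)
        return TAR_OLD_GNU;
    return TAR_V7_OR_UNKNOWN;
}
```

Telling the variants apart is the easy part; handling each variant's extensions correctly is the work that delegating to a real tar would avoid.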
I've made some progress on this. Before I start writing real code, I've got a question. If I end up writing something capable of running in the daemon, am I obligated to support it with and without the daemon? I worry that some of the assumptions I'm making w.r.t. fork/SIGCHLD can't be safely made in the process of the caller, and can only be made if all of the code running in the daemon is willing to work in terms of gnome-vfs-process.c APIs and other common abstractions.

FWIW, putting the control for this in the modules' .conf files seems weird. A sysadmin could conceivably add or remove the [daemon] qualifier for any module as he changes the modules used on his system. If the general idea is that daemon capability is determined by the module implementation, maybe it makes more sense to put this information in each module's GnomeVFSMethod object, or in some other extra object that the module's vfs_module_init function would be obligated to set if it supports the daemon (perhaps with no ABI breakage)?
The [daemon] modifier is up to the module itself to specify. Of course, the sysadmin can fuck up his system in any way he wants, but we don't support that. Sysadmins aren't really supposed to mess with the .conf files (except possibly to disable things or something like that). The files are meant to let 3rd party modules install new vfs methods. And we don't really want to look in the GnomeVFSMethod object, because then we have to dlopen the daemon module in the client too. That is quite unnecessary.

So, you can rely on your module being run in the daemon (if you specify it as such). However, what sort of requirements do you have on the other modules in the daemon? I'm not sure that will fly. You mention gnome-vfs-process.c, but that is a non-threadsafe API (per the comment at the top), and vfs methods are quite threaded.
Thanks for the clarification on [daemon]. That makes plenty of sense. One down, "n" to go...

Regarding the tar stuff, what I want to do is get the parsing of the tar file out of the daemon, and rely on a more tested parser. There are a lot of tar formats out there. I could write up a lot of test cases to ensure that the existing tar-method.c parser works with all of them (it doesn't right now, apparently), but I think it's better to rely on a real tar.

My initial idea was to write a patch to split GNU tar into a libtar (not to be confused with the preexisting and, AFAIK, unrelated libtar), with a tar wrapper as its first user. Licensing issues (GPL/LGPL) and maintenance questions (if the tar maintainers cared, they'd have done it by now - do I want to commit to keeping it API/ABI safe?) made me think that I should look at using the tar binary instead, but I'm not sold on the idea yet.

The other idea is uglier, but avoids the license question, and may be easier to maintain. To use the tar binary, I'd fork tar to dump each file into a temporary directory, and handle SIGCHLD on completion. The handler needs to use some IPC that's safe from a signal handler in a threaded program to wake up the thread that's blocked in do_open(). I'm not sold on gnome-vfs-process's implementation. I haven't looked at it to see if I can lock around it or not. If I can, I can use it; if not, I just need something to let me do the forking and blocking that I need. My pthreads-foo isn't great (yet), but there's got to be a way to pull this off.

What's even worse about the forking idea is that I end up still wanting to patch GNU tar to add an option to dump out correct per-file information (some of it is missing with the "t" option to tar). Also, I need to do this at the same time as the "x" option to avoid two passes - I don't think tar permits that, so yet another patch would be needed.
Finally, I don't know if tar "t" output keeps the same format over time - if it ever changes, then my code breaks. So, I'm open to ideas on the implementation. I hope I've missed something obvious that simplifies this. I'm going to be looking at the "tar" source to figure out how to implement what I need in either case, and see if either approach is OK by the maintainers.
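The fork-and-wake-up part of this idea can be sketched with the classic self-pipe trick: the SIGCHLD handler does nothing but write() one byte to a pipe (write() is async-signal-safe), and the waiting thread blocks in read() on the other end. This is an illustration only, not gnome-vfs code and not a replacement for gnome-vfs-process.c. `run_and_wait`, `on_sigchld`, and `wake_pipe` are hypothetical names, and the single global pipe means the sketch supports only one call at a time per process.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical global wake-up pipe: [0] is the read end, [1] the write end.
 * One pipe per process, so concurrent calls are NOT supported here. */
static int wake_pipe[2] = { -1, -1 };

/* SIGCHLD handler: only calls write(), which is async-signal-safe. */
static void
on_sigchld (int signo)
{
    int saved_errno = errno;
    char byte = 1;

    (void) signo;
    if (write (wake_pipe[1], &byte, 1) < 0) {
        /* Nothing safe to do here; the waiter still reaps via waitpid(). */
    }
    errno = saved_errno;
}

/* Forks, execs argv[0] (searched in PATH), blocks until the child exits.
 * Returns the child's exit status, or -1 on error or abnormal exit. */
static int
run_and_wait (char *const argv[])
{
    struct sigaction sa, old_sa;
    pid_t pid, reaped;
    ssize_t n;
    char byte;
    int status = 0;
    int rc = -1;

    if (pipe (wake_pipe) < 0)
        return -1;

    /* Install the handler before forking so no exit can be missed. */
    sa.sa_handler = on_sigchld;
    sigemptyset (&sa.sa_mask);
    sa.sa_flags = SA_NOCLDSTOP;
    if (sigaction (SIGCHLD, &sa, &old_sa) < 0) {
        close (wake_pipe[0]);
        close (wake_pipe[1]);
        return -1;
    }

    pid = fork ();
    if (pid == 0) {
        execvp (argv[0], argv);
        _exit (127);            /* exec failed */
    }

    if (pid > 0) {
        /* Block until the handler wakes us; retry if interrupted. */
        do {
            n = read (wake_pipe[0], &byte, 1);
        } while (n < 0 && errno == EINTR);

        /* Reap our own child. If another child's exit woke us early,
         * this simply blocks until ours is done. */
        do {
            reaped = waitpid (pid, &status, 0);
        } while (reaped < 0 && errno == EINTR);

        if (reaped == pid && WIFEXITED (status))
            rc = WEXITSTATUS (status);
    }

    sigaction (SIGCHLD, &old_sa, NULL);
    close (wake_pipe[0]);
    close (wake_pipe[1]);
    return rc;
}
```

For the extraction case, the argument vector would be something like { "tar", "-xf", archive_path, "-C", temp_dir, NULL }, where archive_path and temp_dir are placeholders. A daemon with several threads extracting at once would need a per-request wake-up mechanism rather than one global pipe, and this sketch does nothing about the per-file metadata problem described above.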
gnome-vfs modules which run in the daemon context can be GPL, since the daemon itself is GPL. Maybe that solves your libtar license concern?
Thanks. I'll check this out later this year. I'll largely be away from computers for a month or two, so it may be a while.
*** Bug 140563 has been marked as a duplicate of this bug. ***
*** Bug 137072 has been marked as a duplicate of this bug. ***
Any news on your investigation, George? Would be cool to have this fixed :)
The short answer is no. I haven't had access to a Linux-running PC in a while, so I haven't been able to hack on this. I think you've just inspired me to buy an external disk so I can in fact boot into Linux on the machine I've got. Thanks for the kick in the pants! ;)
According to Christian Kellner, libtar isn't suitable for our needs since, for instance, it can't do seeking.
gnome-vfs has been deprecated and superseded by gio/gvfs since GNOME 2.22, hence mass-closing many of the gnome-vfs requests/bug reports. This means that gnome-vfs is NOT actively maintained anymore; however, patches are still welcome. If your reported issue is still valid for gio/gvfs, please feel free to file a bug report against glib/gio or gvfs.

@Bugzilla mail recipients: query for gnome-vfs-mass-close to get rid of these notification emails altogether.

General further information: http://en.wikipedia.org/wiki/GVFS
Reasons behind this decision are listed at http://www.mail-archive.com/gnome-vfs-list@gnome.org/msg00899.html