GNOME Bugzilla – Bug 739008
Add a Google Drive backend
Last modified: 2015-09-09 15:12:39 UTC
A simple back-end for Google Drive based on libgdata and gnome-online-accounts would be useful. Drive is quite popular, and being able to access it from nautilus and the gtk+ file chooser would enhance our online accounts story. I don't expect to have all the bells and whistles of Drive because it is more like an object store with IDs as opposed to a hierarchical file system with paths, but we will see what we can do. We already have a GOA volume monitor, but that only looks uses raw passwords for authentication. It should now look for OAuth2 tokens too, because that is what the Drive API uses. Way back in 2009, there was an attempt to do the same as a GSoC project [1, 2]. I will see if any of that code can be recovered and put to use. I will attach patches once things are more functional. [1] http://thiblahute.blogspot.cz/2009/08/gsoc-add-google-documents-support-to.html [2] https://github.com/thiblahute/gvfs/tree/googledocuments
*** Bug 594423 has been marked as a duplicate of this bug. ***
*** Bug 674769 has been marked as a duplicate of this bug. ***
Sorry, I forgot that we already a few other bugs about this.
GOA needs to export the GVfs URI representing the user's Drive, just as it does for the URI representing a ownCloud store. See bug 739009 for that.
Created attachment 291029 [details] [review] build: Fix the comments to reflect reality
Created attachment 291030 [details] [review] goa: Pick up OAuth2-based accounts with a GoaFiles interface
Created attachment 291031 [details] [review] Add GVfsBackendGoogle This is a work in progress: 1) I need to clean up the code in a bunch of places 2) You can not upload 3) For SetDisplayName to work with nautilus, you need the patch in bug 740383 4) You also need the GOA patch from bug 739009 But you can take a look at the code just so I know that I am not doing something totally crazy. :)
Review of attachment 291029 [details] [review]: Looks good, thanks!
Review of attachment 291030 [details] [review]: Looks good, the code of get_access_token_cb is basically just a copy of get_password_cb, however untested. This authentication could be tested only with the new google backend I suppose?
If you want to test the backend, you have to: 1/ build goa from git and run daemon <prefix>/libexec/goa-daemon --replace 2/ setup your google account using g-c-c and enable Files on it (be sure g-c-c is linking goa from previous step) 3/ build gvfs branch wip/goa and run monitor <prefix>/libexec/gvfs-goa-volume-monitor 4/ build nautilus (due to Bug 740383) and run, there should be visible your google drive in sidbar
Review of attachment 291031 [details] [review]: It looks good overall, there are some notes from the first look: This should fail in cleaner way: $ gvfs-mount google-drive:/ (gvfs-mount:4053): GVFS-CRITICAL **: g_mount_spec_set_with_len_internal: assertion 'value != NULL' failed (gvfs-mount:4053): GVFS-CRITICAL **: g_mount_spec_set_with_len_internal: assertion 'value != NULL' failed (gvfs-mount:4053): GVFS-CRITICAL **: g_mount_spec_set_with_len_internal: assertion 'value != NULL' failed Error mounting location: Invalid mount spec ::: daemon/gvfsbackendgoogle.c @@ +89,3 @@ + +static gchar * +get_content_type_from_entry (GDataEntry *entry) Isn't there way how to get original format/content-type? Because it isn't good, when we have e.g. svg vector file on drive and download png bitmap instead... @@ +352,3 @@ + GList *l; + + query = gdata_documents_query_new (NULL); Wouldn't be better to specify query to not download whole list of files with every operation? @@ +728,3 @@ + authorizer = gdata_goa_authorizer_new (object); + + if (self->service != NULL) When this could happens? @@ +785,3 @@ + entry = g_hash_table_lookup (self->entries, document_id); + + is_folder_or_root (filename, entry, NULL, &is_root); You could pass also &is_folder and save following call... @@ +788,3 @@ + if (is_root) + { + g_vfs_job_failed (G_VFS_JOB (job), G_IO_ERROR, G_IO_ERROR_IS_DIRECTORY, _("%s is a directory"), filename); All other backends use error messages like "File is a directory", "File not found" (without %s), etc. Would be good to do it same. @@ +864,3 @@ + const gchar *type; + + if (self->service == NULL) When this could happens? @@ +1164,3 @@ + + error = NULL; + self->client = get_goa_client_sync (&error); I don't think it is good idea to call blocking function in init...
Also it seems some parts of libgdata have to be rewritten since April 20, 2015: Warning: The deprecation period for Version 3 of the Google Documents List API is nearly at an end. On April 20, 2015, we will discontinue service for this API. This means that service calls to the API are no longer supported, and features implemented using this API will not function after April 20, 2015. You must migrate to the Drive API as soon as possible to avoid disruptions to your application. Source: https://developers.google.com/google-apps/documents-list/
Created attachment 292124 [details] [review] Add GVfsBackendGoogle
(In reply to comment #9) > Review of attachment 291030 [details] [review]: > > Looks good, the code of get_access_token_cb is basically just a copy of > get_password_cb, however untested. Yes, it is basically the same except the part that calls goa_oauth2_based_call_get_access_token_finish instead of goa_password_based_call_get_password_finish. > This authentication could be tested only with the new google backend I suppose? Yes.
(In reply to comment #11) > Review of attachment 291031 [details] [review]: Thanks for the review, Ondrej. I added a paging mechanism to rebuild_entries and enumerate. There are some server-side bugs which prevent the standard libgdata APIs from working (see the comment), so I worked around it. Can you please try it against your account? There is some duplicated code in the two functions. If it works, I can try to refactor it a bit. > It looks good overall, there are some notes from the first look: > > This should fail in cleaner way: > $ gvfs-mount google-drive:/ > (gvfs-mount:4053): GVFS-CRITICAL **: g_mount_spec_set_with_len_internal: > assertion 'value != NULL' failed > (gvfs-mount:4053): GVFS-CRITICAL **: g_mount_spec_set_with_len_internal: > assertion 'value != NULL' failed > (gvfs-mount:4053): GVFS-CRITICAL **: g_mount_spec_set_with_len_internal: > assertion 'value != NULL' failed > Error mounting location: Invalid mount spec Fixed. I added some checks in google_from_uri to not set NULL values. > ::: daemon/gvfsbackendgoogle.c > @@ +89,3 @@ > + > +static gchar * > +get_content_type_from_entry (GDataEntry *entry) > > Isn't there way how to get original format/content-type? Because it isn't good, > when we have e.g. svg vector file on drive and download png bitmap instead... Yes, this part is still rough and unfinished. > @@ +352,3 @@ > + GList *l; > + > + query = gdata_documents_query_new (NULL); > > Wouldn't be better to specify query to not download whole list of files with > every operation? We keep a hash table mapping document-ids to GDataEntries. We only use rebuild_entries if an operation uses a document-id that we have never seen before. Once we have called rebuild_entries in an operation, we don't need to call it again unless the contents of the Drive has changed. When you are using a UI like nautilus, rebuild_entries should not be needed at all. The user can only use document-ids that have been returned by the enumerate operation. I tried using gdata_documents_query_set_folder_id, but it was not working. Could be a server-side bug. I did not dig deeper because this appeared simpler and less prone to failures due to server-side issues. > @@ +728,3 @@ > + authorizer = gdata_goa_authorizer_new (object); > + > + if (self->service != NULL) > > When this could happens? This should never happen, because the code assumes that one instance of the backend handles only one mount. I could have used an assert, but then I thought that it is easy to 'recover' from this, if it ever happens. I don't know. Maybe this code is pointless? What do you suggest? > @@ +785,3 @@ > + entry = g_hash_table_lookup (self->entries, document_id); > + > + is_folder_or_root (filename, entry, NULL, &is_root); > > You could pass also &is_folder and save following call... The problem is that 'entry' can be NULL, if this document-id was never seen before. The check for is_root can be done without 'entry', so we do it first, but the check for is_folder is based on 'entry'. So we rebuild_entries before doing that. > @@ +788,3 @@ > + if (is_root) > + { > + g_vfs_job_failed (G_VFS_JOB (job), G_IO_ERROR, G_IO_ERROR_IS_DIRECTORY, > _("%s is a directory"), filename); > > All other backends use error messages like "File is a directory", "File not > found" (without %s), etc. Would be good to do it same. Fixed. > @@ +864,3 @@ > + const gchar *type; > + > + if (self->service == NULL) > > When this could happens? This is useless. Removed. > @@ +1164,3 @@ > + > + error = NULL; > + self->client = get_goa_client_sync (&error); > > I don't think it is good idea to call blocking function in init... It is a copy of the code from g_vfs_goa_volume_monitor_init. If you think it is a big problem, I can change it.
(In reply to comment #12) > Also it seems some parts of libgdata have to be rewritten since April 20, 2015: Thanks for pointing it out, Ondrej. I spoke with Philip about it yesterday. The plan is to port it to the new Drive API as soon as possible. It might cause some changes to the public libgdata API, but the basics would remain the same.
libgdata has now been mostly ported to the new Drive v2 API (see bug 684920 and the libgdata:wip/rishi/drive branch) and I have adjusted the GVfs backend on top of it. Major changes so far: (i) Use batched queries when fetching the file list from Drive. Paging / batched queries was previously broken in libgdata. It has now been fixed. (ii) Use GDataEntry:id instead of GDataDocuments:document-id. The former used to be an URL in the previous API and the latter was just the unique identifier part, which made it easy to construct a GVfs path using them. In the current Drive API, GDataEntry:id is always the unique identifier, and GDataDocuments:document-id has been deprecated. Note: the ID should be ideally treated as a blob without making any assumptions about its format. (iii) gdata_documents_entry_get_path doesn't work anymore. We were using it to construct a path made of GDataDocuments:document-ids. It worked by making a lot of assumptions about the format of GDataEntry:id. There is no way to cheaply (ie. without blocking calls) implement this in libgdata. Hence, we do it ourselves by working our way up the directory hierarchy and looking at our ID to GDataEntry hash table. (iv) Offer thumbnails via G_FILE_ATTRIBUTE_PREVIEW_ICON. However there is an issue with native Drive files. See (v). (v) Native Drive files open in the web browser instead of being exported into some other format. This is implemented by faking a link-type *.desktop file that points to the actual URL. I don't know if this is the best way to do this. Also, thumbnailing is currently broken for these files because I couldn't find a way to make the thumbnailer use the icons provided by the backend. (vi) Honour GFileInfoQueryFlags (vii) Implement copy (viii) Use gdata_documents_service_add_entry_to_folder instead of gdata_service_insert_entry in make_directory because there is no folder-specific URL in the new API anymore. (ix) Support G_FILE_ATTRIBUTE_STANDARD_IS_VOLATILE (see bug 741602 and bug 751481)
Created attachment 306086 [details] [review] Add GVfsBackendGoogle
Review of attachment 306086 [details] [review]: I’m not really familiar with GVFS internals, so this is more of a general review. ::: client/googleuri.c @@ +2,3 @@ +/* gvfs - extensions for gio + * + * Copyright (C) 2014 Red Hat, Inc. 2015 as well? @@ +91,3 @@ + if (has_host && has_user) + { + identity = g_strconcat (uri->userinfo, "@", uri->host, NULL); Is there any need to escape the userinfo or host here? @@ +117,3 @@ +{ + static const gchar *types[] = { + "google-drive", There’s a lot of duplication of this magic ‘google-drive’ string. Could it be factored out as a global static const variable or something? ::: configure.ac @@ +418,3 @@ +msg_google=no +GOOGLE_LIBS= +GOOGLE_CFLAGS= No need to pre-define these; undefined variables are treated as empty by AC_SUBST(). @@ +420,3 @@ +GOOGLE_CFLAGS= + +if test "x$enable_google" != "xno" ; then Use AS_IF(). @@ +421,3 @@ + +if test "x$enable_google" != "xno" ; then + PKG_CHECK_EXISTS(goa-1.0 >= 3.17.1 libgdata >= 0.17.2, msg_google=yes) Please quote this lot: PKG_CHECK_EXISTS([goa-1.0 >= 3.17.1 libgdata >= 0.17.2],[msg_google=yes]) Same below. http://www.hep.by/gnu/autoconf/Autoconf-Language.html ::: daemon/Makefile.am @@ +49,3 @@ mount_DATA = sftp.mount ftp.mount ftps.mount trash.mount computer.mount burn.mount localtest.mount network.mount +mount_in_files +=google.mount.in Space after ‘+=’. @@ +447,3 @@ + -DDEFAULT_BACKEND_TYPE=google-drive \ + -DBACKEND_TYPES='"google-drive", G_VFS_TYPE_BACKEND_GOOGLE,' \ + $(GOOGLE_CFLAGS) GOOGLE_CFLAGS should be in gvfsd_google_CFLAGS. ::: daemon/gvfsbackendgoogle.c @@ +66,3 @@ +}; + +G_DEFINE_TYPE(GVfsBackendGoogle, g_vfs_backend_google, G_VFS_TYPE_BACKEND) Space before opening bracket. @@ +71,3 @@ + +#define CATEGORY_SCHEMA_KIND "http://schemas.google.com/g/2005#kind" +#define CATEGORY_SCHEMA_KIND_FILE "http://schemas.google.com/docs/2007#file" These should be provided by libgdata. Please add them to libgdata if needed. @@ +79,3 @@ +#define REBUILD_ENTRIES_TIMEOUT 60 /* s */ + +#define URI_PREFIX "https://www.googleapis.com/drive/v2/files/" Same for this — please add it to libgdata if needed. @@ +91,3 @@ + + monitored_path = g_object_get_data (G_OBJECT (monitor), "g-vfs-backend-google-path"); + if (g_str_has_prefix (filename, monitored_path)) What if filename is /foo/bar/abcdef/blah and monitored_path is /foo/bar/abc? I think this check would pass incorrectly. @@ +205,3 @@ + g_file_info_set_attribute_boolean (info, G_FILE_ATTRIBUTE_STANDARD_IS_VOLATILE, is_symlink); + + g_file_info_set_attribute_boolean (info, G_FILE_ATTRIBUTE_ACCESS_CAN_TRASH, FALSE); A comment justifying why these attributes are set this way would be good. I can imagine this might change in future, so it would be good to have a reference of how the initial decision was made. @@ +230,3 @@ + type = g_mount_spec_get (spec, "type"); + mount_identity = g_mount_spec_get (spec, "goa-identity"); + target_uri = g_strdup_printf ("%s://%s%s", type, mount_identity, filename); Does this not need escaping? Also, can you guarantee that type and mount_identity are non-NULL? Also, this seems to set the symlink target for the GFileInfo for filename to the filename, resulting in a reflexive symlink. Am I misreading this? @@ +244,3 @@ + content_type_override = TRUE; + virtual_content_type = "application/x-desktop"; + } A comment explaining this would be grand. @@ +392,3 @@ + goto out; + + path = g_string_prepend (path, id); libgdata can’t guarantee that the IDs will not contain ‘/’, so you need to perform escaping here (and further up in the function). @@ +506,3 @@ + g_hash_table_remove_all (self->entries); + succeeded_once = TRUE; + } Not quite sure I understand this succeeded_once stuff. What’s the idea here? @@ +519,3 @@ + id = gdata_entry_get_id (entry); + if (g_hash_table_lookup (self->entries, id) == NULL) + g_hash_table_insert (self->entries, g_strdup (id), g_object_ref (entry)); What situations would you have where the ID already exists in self->entries on a query for the second or subsequent pages, having earlier called g_hash_table_remove_all(self->entries)? @@ +538,3 @@ + gpointer monitors) +{ + g_object_weak_unref (G_OBJECT (monitor), (GWeakNotify) g_hash_table_remove, monitors); TODO: Check this. @@ +552,3 @@ + GVfsJobCloseRead *job = G_VFS_JOB_CLOSE_READ (user_data); + + error = NULL; Could move this up to the declaration. @@ +565,3 @@ + out: + g_object_unref (job); + g_debug ("- close_read\n"); Don’t need \n with g_debug(). (And in various other places in the file.) @@ +626,3 @@ + G_IO_ERROR, + G_IO_ERROR_WOULD_RECURSE, + _("Can't recursively copy directory")); This seems like a bit of a feature omission. @@ +821,3 @@ + g_vfs_job_failed (G_VFS_JOB (job), G_IO_ERROR, G_IO_ERROR_NOT_FOUND, _("No such file or directory")); + goto out; + } This code to look up an entry by ID, rebuild the entries if that fails, then error out if _that_ fails is quite common. You could factor it out of a number of these operation functions. @@ +968,3 @@ + G_IO_ERROR, + G_IO_ERROR_NOT_SUPPORTED, + _("Can not create root directory")); s/Can not/Cannot/, and elsewhere in the file. @@ +1069,3 @@ + { + g_vfs_job_failed (G_VFS_JOB (job), G_IO_ERROR, G_IO_ERROR_FAILED, _("Failed to connect to GOA")); + goto out; Couldn’t you retry this here, in case GOA has become functional since init()? @@ +1222,3 @@ + + data = g_key_file_to_data (file, &length, NULL); + stream = g_memory_input_stream_new_from_data (data, (gssize) length, g_free); This block of code definitely needs some explanation in a comment. @@ +1264,3 @@ + + auth_domain = gdata_documents_service_get_primary_authorization_domain (); + stream = gdata_download_stream_new (GDATA_SERVICE (self->service), auth_domain, icon_id, cancellable); Is the icon_id really guaranteed to be a valid URI? @@ +1335,3 @@ + G_IO_ERROR, + G_IO_ERROR_WOULD_RECURSE, + _("Can't recursively copy directory")); This also looks like a pretty major missing bit of functionality. ::: daemon/gvfsbackendgoogle.h @@ +2,3 @@ +/* gvfs - extensions for gio + * + * Copyright (C) 2014 Red Hat, Inc. And 2015? @@ +52,3 @@ + +typedef struct _GVfsBackendGoogle GVfsBackendGoogle; +typedef struct _GVfsBackendGoogleClass GVfsBackendGoogleClass; You could use G_DECLARE_FINAL_TYPE instead of all this boilerplate, since GVFS depends on GLib ≥ 2.45.0, and that was introduced in 2.44.
Review of attachment 306086 [details] [review]: Haven't really looked at this, just a quick question. ::: daemon/gvfsbackendgoogle.c @@ +1605,3 @@ + + error = NULL; + g_seekable_seek (G_SEEKABLE (stream), offset, type, cancellable, &error); Could this be doing I/O? If so, it must not be done in try_seek. @@ +1613,3 @@ + } + + cur_offset = g_seekable_tell (G_SEEKABLE (stream)); Could this be doing I/O? If so, it must not be done in try_seek.
(In reply to Philip Withnall from comment #19) > Review of attachment 306086 [details] [review] [review]: > > I’m not really familiar with GVFS internals, so this is more of a general > review. > > ::: client/googleuri.c > @@ +2,3 @@ > +/* gvfs - extensions for gio > + * > + * Copyright (C) 2014 Red Hat, Inc. > > 2015 as well? This file wasn't touched in 2015. Only daemon/gvfsbackendgoogle.c was touched in 2015. > @@ +91,3 @@ > + if (has_host && has_user) > + { > + identity = g_strconcat (uri->userinfo, "@", uri->host, NULL); > > Is there any need to escape the userinfo or host here? I don't know. > @@ +117,3 @@ > +{ > + static const gchar *types[] = { > + "google-drive", > > There’s a lot of duplication of this magic ‘google-drive’ string. Could it > be factored out as a global static const variable or something? > > ::: configure.ac > @@ +418,3 @@ > +msg_google=no > +GOOGLE_LIBS= > +GOOGLE_CFLAGS= > > No need to pre-define these; undefined variables are treated as empty by > AC_SUBST(). > > @@ +420,3 @@ > +GOOGLE_CFLAGS= > + > +if test "x$enable_google" != "xno" ; then > > Use AS_IF(). > > @@ +421,3 @@ > + > +if test "x$enable_google" != "xno" ; then > + PKG_CHECK_EXISTS(goa-1.0 >= 3.17.1 libgdata >= 0.17.2, msg_google=yes) > > Please quote this lot: > > PKG_CHECK_EXISTS([goa-1.0 >= 3.17.1 libgdata >= 0.17.2],[msg_google=yes]) These follow the pattern in the rest of the file. I will fix these and add a separate patch to address the existing instances. > Same below. http://www.hep.by/gnu/autoconf/Autoconf-Language.html > > ::: daemon/Makefile.am > @@ +49,3 @@ > mount_DATA = sftp.mount ftp.mount ftps.mount trash.mount computer.mount > burn.mount localtest.mount network.mount > > +mount_in_files +=google.mount.in > > Space after ‘+=’. > > @@ +447,3 @@ > + -DDEFAULT_BACKEND_TYPE=google-drive \ > + -DBACKEND_TYPES='"google-drive", G_VFS_TYPE_BACKEND_GOOGLE,' \ > + $(GOOGLE_CFLAGS) > > GOOGLE_CFLAGS should be in gvfsd_google_CFLAGS. Same as above. > ::: daemon/gvfsbackendgoogle.c > @@ +66,3 @@ > +}; > + > +G_DEFINE_TYPE(GVfsBackendGoogle, g_vfs_backend_google, G_VFS_TYPE_BACKEND) > > Space before opening bracket. Same as above. (See the AFC backend and monitor, the GOA monitor, etc..) > @@ +71,3 @@ > + > +#define CATEGORY_SCHEMA_KIND "http://schemas.google.com/g/2005#kind" > +#define CATEGORY_SCHEMA_KIND_FILE "http://schemas.google.com/docs/2007#file" > > These should be provided by libgdata. Please add them to libgdata if needed. Ok. > @@ +79,3 @@ > +#define REBUILD_ENTRIES_TIMEOUT 60 /* s */ > + > +#define URI_PREFIX "https://www.googleapis.com/drive/v2/files/" > > Same for this — please add it to libgdata if needed. This is used for the same hack as the one here: https://git.gnome.org/browse/libgdata/tree/gdata/services/documents/gdata-documents-utils.h#n28 Do you still want libgdata to promote this in its public API? Maybe the more proper solution would be to make GDataLink store the ID too? > @@ +91,3 @@ > + > + monitored_path = g_object_get_data (G_OBJECT (monitor), > "g-vfs-backend-google-path"); > + if (g_str_has_prefix (filename, monitored_path)) > > What if filename is /foo/bar/abcdef/blah and monitored_path is /foo/bar/abc? > I think this check would pass incorrectly. This is basically the same logic as emit_event_internal in gvfsbackendmtp.c If monitored_path is the directory at /foo/bar/abc then we want to emit the event for /foo/bar/abcdef/blah, and if /foo/bar/abc is a regular file then the situation won't arise in the first place. Note that we are only creating directory monitors at the moment, because, in my understanding, those are more important for nautilus and gtk+ file chooser. > @@ +205,3 @@ > + g_file_info_set_attribute_boolean (info, > G_FILE_ATTRIBUTE_STANDARD_IS_VOLATILE, is_symlink); > + > + g_file_info_set_attribute_boolean (info, > G_FILE_ATTRIBUTE_ACCESS_CAN_TRASH, FALSE); > > A comment justifying why these attributes are set this way would be good. I > can imagine this might change in future, so it would be good to have a > reference of how the initial decision was made. Ok. The IS_VOLATILE part is pretty fundamental thing for this backend. Unlike the existing GVfs backends, Drive doesn't have POSIXish semantics. Instead, it is a database-backed storage where the ID is like a cookie or blob and doesn't map to a path. Future examples of such storages could be a file portal backend. See 741602 for the whole story. Regarding CAN_TRASH=FALSE, I was wondering if we should add some libgdata API for trashing GDataDocumentEntries instead of permanently deleting them. > @@ +230,3 @@ > + type = g_mount_spec_get (spec, "type"); > + mount_identity = g_mount_spec_get (spec, "goa-identity"); > + target_uri = g_strdup_printf ("%s://%s%s", type, mount_identity, > filename); > > Does this not need escaping? Also, can you guarantee that type and > mount_identity are non-NULL? Note that this is the same GMountSpec that was set during mount using g_vfs_backend_set_mount_spec. I do expect goa-identity to always be non-NULL. It is supposed to be something like foo@gmail.com, and you can't, for example, mount the storage without it. Similarly, the type has to be 'google-drive' and is glued together by GVfsUriMapperGoogle and the following entry in daemon/Makefile.am: -DBACKEND_TYPES='"google-drive", G_VFS_TYPE_BACKEND_GOOGLE,' This is based on my understanding of GVfs internals, which is not bullet-proof. :) > Also, this seems to set the symlink target for the GFileInfo for filename to > the filename, resulting in a reflexive symlink. Am I misreading this? See above for IS_VOLATILE and bug 741602 When you want to create a new file, the backend sees a request for /id1/New File. However, "New File" is only going to be the GDataEntry:title or display name of the new entry that will be created. The actual filename will be /id1/id20. To convey this to the application, we keep a lookaside cache that maps /id1/New File to /id1/id20. We expect applications (see bug 751481 for nautilus) to use "some" mechanism (currently it is the IS_VOLATILE attribute) to map the fake path to the real one "as soon as possible". The Drive backend makes some effort to handle subsequent operations that use fake paths, but in the end they are not going to have the same guarantees (persistence, uniqueness) as real paths made from server-defined IDs. > @@ +244,3 @@ > + content_type_override = TRUE; > + virtual_content_type = "application/x-desktop"; > + } > > A comment explaining this would be grand. Yeah, I was expecting you would say that. :) This is meant to be (a gross hack?) way to open native Drive files in the browser instead of exporting them to a desktop-native format and downloading them. I couldn't find a way to do that, other than creating fake Link-type desktop files. Is there a better way? > @@ +392,3 @@ > + goto out; > + > + path = g_string_prepend (path, id); > > libgdata can’t guarantee that the IDs will not contain ‘/’, so you need to > perform escaping here (and further up in the function). Yes, we need escaping for a few other things too. It is going to be in the next iteration of the patch. > @@ +506,3 @@ > + g_hash_table_remove_all (self->entries); > + succeeded_once = TRUE; > + } > > Not quite sure I understand this succeeded_once stuff. What’s the idea here? We fail the operation only if we didn't get a single page from the server.
(In reply to Philip Withnall from comment #19) Continued ... > @@ +519,3 @@ > + id = gdata_entry_get_id (entry); > + if (g_hash_table_lookup (self->entries, id) == NULL) > + g_hash_table_insert (self->entries, g_strdup (id), g_object_ref > (entry)); > > What situations would you have where the ID already exists in self->entries > on a query for the second or subsequent pages, having earlier called > g_hash_table_remove_all(self->entries)? This should be removed. It is a remnant from the past when paging was broken in libgdata. :D We had some weird code that kept the start-index at 1 and kept increasing the max-results. See attachment 292124 [details] [review] for the horror. > @@ +538,3 @@ > + gpointer monitors) > +{ > + g_object_weak_unref (G_OBJECT (monitor), (GWeakNotify) > g_hash_table_remove, monitors); > > TODO: Check this. This is copied from the MTP backend, but, yeah, always good to double check. > @@ +552,3 @@ > + GVfsJobCloseRead *job = G_VFS_JOB_CLOSE_READ (user_data); > + > + error = NULL; > > Could move this up to the declaration. Ok. > @@ +565,3 @@ > + out: > + g_object_unref (job); > + g_debug ("- close_read\n"); > > Don’t need \n with g_debug(). (And in various other places in the file.) We do, because GVfs has its own GLogFunc that uses g_print. See the other files. > @@ +626,3 @@ > + G_IO_ERROR, > + G_IO_ERROR_WOULD_RECURSE, > + _("Can't recursively copy directory")); > > This seems like a bit of a feature omission. This should not be an issue. The default GFile copy implementation (used by GLocalFile) also throws WOULD_RECURSE if the source is a directory. See open_source_for_copy in gio/gfile.c. Similarly, a bunch of GVfs backends (AFP, burn, FTP, MTP and SMB) also throw it. There is code in nautilus (see libnautilus-private/nautilus-file-operations.c) that handles this by falling back to iterating over the source directory, and creating directories and copying files as necessary. > @@ +821,3 @@ > + g_vfs_job_failed (G_VFS_JOB (job), G_IO_ERROR, > G_IO_ERROR_NOT_FOUND, _("No such file or directory")); > + goto out; > + } > > This code to look up an entry by ID, rebuild the entries if that fails, then > error out if _that_ fails is quite common. You could factor it out of a > number of these operation functions. Yes, I should do that. I am waiting for the ID escaping code to settle down a bit before doing that. > @@ +968,3 @@ > + G_IO_ERROR, > + G_IO_ERROR_NOT_SUPPORTED, > + _("Can not create root directory")); > > s/Can not/Cannot/, and elsewhere in the file. Ok. > @@ +1069,3 @@ > + { > + g_vfs_job_failed (G_VFS_JOB (job), G_IO_ERROR, G_IO_ERROR_FAILED, > _("Failed to connect to GOA")); > + goto out; > > Couldn’t you retry this here, in case GOA has become functional since init()? Yes, I might as well move the whole thing to mount from init. > @@ +1222,3 @@ > + > + data = g_key_file_to_data (file, &length, NULL); > + stream = g_memory_input_stream_new_from_data (data, (gssize) length, > g_free); > > This block of code definitely needs some explanation in a comment. This is the "show native Drive files in the browser" code. I don't know if this is the best way to do this. > @@ +1264,3 @@ > + > + auth_domain = gdata_documents_service_get_primary_authorization_domain (); > + stream = gdata_download_stream_new (GDATA_SERVICE (self->service), > auth_domain, icon_id, cancellable); > > Is the icon_id really guaranteed to be a valid URI? Yes. See thumbnailLink: https://developers.google.com/drive/v2/reference/files > @@ +1335,3 @@ > + G_IO_ERROR, > + G_IO_ERROR_WOULD_RECURSE, > + _("Can't recursively copy directory")); > > This also looks like a pretty major missing bit of functionality. Shouldn't be. See above. > ::: daemon/gvfsbackendgoogle.h > @@ +2,3 @@ > +/* gvfs - extensions for gio > + * > + * Copyright (C) 2014 Red Hat, Inc. > > And 2015? This file wasn't touched in 2015. Only daemon/gvfsbackendgoogle.c was touched in 2014. > @@ +52,3 @@ > + > +typedef struct _GVfsBackendGoogle GVfsBackendGoogle; > +typedef struct _GVfsBackendGoogleClass GVfsBackendGoogleClass; > > You could use G_DECLARE_FINAL_TYPE instead of all this boilerplate, since > GVFS depends on GLib ≥ 2.45.0, and that was introduced in 2.44. I could, but the code was written six months ago. :)
(In reply to Ross Lagerwall from comment #20) > ::: daemon/gvfsbackendgoogle.c > @@ +1605,3 @@ > + > + error = NULL; > + g_seekable_seek (G_SEEKABLE (stream), offset, type, cancellable, &error); > > Could this be doing I/O? If so, it must not be done in try_seek. Good point. GDataDownloadStream can, indeed, block during seek under certain cases. I have moved this to the non-try vfunc.
Here is another iteration of the backend. Major changes: (i) Initial implementation of escaping and unescaping of GDataEntry:ids. There are various cases where we need to distinguish between a path made up of server-defined, persistent IDs and one made up of "fake" names. Consider copying a directory. The copy operation will be invoked as /id1/id2 -> /id10/id2. The basename of the destination is obviously "fake". After copy throws a WOULD_RECURSE, we will fallback to make_directory with /id10/id2 (followed by further invocations of copy). Normally, there is no way for make_directory to know that the basename of /id10/id2 is a server-defined ID, and not a user-readable string (ie. id10/new-folder). The current escaping scheme prepends an underscore to IDs before handing it to the application. The next step would be to also pass the IDs through g_uri_escape_string because we should not be making assumptions about their internals. There is an open question about how to handle user-readable display names that start with an underscore... (ii) Sanitize copy-name by changing slashes to dashes. Nautilus generally disallows slashes, but makes a similar exception for desktop files. I think replacing the slashes with another character is better than completely stripping them off. We won't need to handle cases where there are no other characters apart from slashes. (iii) Changed seek_on_read to the non-try variant (comment 20). Missing pieces: (i) Missing append, create and replace implementations. (ii) More robust escaping.
Created attachment 306309 [details] [review] Add GVfsBackendGoogle
I get following error in Nautilus when I try to access my google drive currently: This location could not be displayed. Authentication required: { "error": { "errors": [ { "domain": "usageLimits", "reason": "dailyLimitExceededUnreg", "message": "Daily Limit for Unauthenticated Use Exceeded. Continued use requires signup.", "extendedHelp": "https://code.google.com/apis/console" } ], "code": 403, "message": "Daily Limit for Unauthenticated Use Exceeded. Continued use requires signup." } } I didn't investigate yet what does it mean, but json shouldn't be provided as an error message...
(In reply to Ondrej Holy from comment #26) > I didn't investigate yet what does it mean, but json shouldn't be provided > as an error message... Should be fixed by the patch in bug 751782 I will still see if there is room for a better error message.
Review of attachment 306309 [details] [review]: I'm still quite skeptic about the file name handling (i.e. lookaside cache, volatile etc.). I see the reasons why you use it, but it brings a lot of other problems. I think applications should use the gio api in same way for all types of shares and should not care about volatile flags etc. There is a lot of problematic cases for this solution... Therefor I would prefer similar workflow as is in mtp backend, but ok... Many applications will have to be fixed to use display name etc. I suppose also fuse backend will be having problems with this backend... It would be nice to reorder the source code e.g. to have methods open_for_read, open_icon_for_read, read, seek_on_read, close_read in same place in the code etc. There are some other observations from a first look: ::: daemon/gvfsbackendgoogle.c @@ +927,3 @@ + cancellable, + &error); + if (error != NULL) You should also take care about G_FILE_COPY_OVERWRITE flag and return correct error messages for specific cases (i.e. G_IO_ERROR_EXISTS, G_IO_ERROR_IS_DIRECTORY, G_IO_ERROR_WOULD_MERGE). @@ +967,3 @@ + GVfsJobCreateMonitor *job, + const gchar *filename, + GFileMonitorFlags flags) I'm not experienced with monitors, but it seems to me that this implementation is pretty misleading if events are emitted only for changes done using gvfs. We should compare entries in rebuild_entries an emit events at least if there isn't support from libgdata. Otherwise it should be removed, because e.g. applications can rely on and do not use polling instead. @@ +1069,3 @@ + + if (is_volatile) + { Following code should be probably in separate function, because it is used on several places. @@ +1096,3 @@ + + if (entry == NULL) + { Following code should be probably in separate function, because it is used on several places. @@ +1279,3 @@ + if (unescaped_filename == NULL) + { + g_vfs_job_failed (G_VFS_JOB (job), G_IO_ERROR, G_IO_ERROR_NOT_FOUND, _("No such file or directory")); G_IO_ERROR_INVALID_ARGUMENT? @@ +1421,3 @@ + g_debug ("+ mount\n"); + + if (self->client == NULL) self->client should be always NULL @@ +1459,3 @@ + authorizer = gdata_goa_authorizer_new (object); + + if (self->service != NULL) This should never happen... @@ +1610,3 @@ + auth_domain = gdata_documents_service_get_primary_authorization_domain (); + uri = gdata_entry_get_content_uri (entry); + stream = gdata_download_stream_new (GDATA_SERVICE (self->service), auth_domain, uri, cancellable); You should check the result as you do in open_icon_for_read. @@ +2110,3 @@ + new_id = gdata_entry_get_id (new_entry); + g_hash_table_insert (self->entries, g_strdup (new_id), g_object_ref (new_entry)); + You should probably emit monitor event here... @@ +2169,3 @@ + cancellable, + read_cb, + g_object_ref (job)); g_object_ref seems to me to be redundant @@ +2248,3 @@ + +GVfsBackend * +g_vfs_backend_google_new (void) This isn't needed. ::: daemon/gvfsbackendgoogle.h @@ +56,3 @@ +GType g_vfs_backend_google_get_type (void) G_GNUC_CONST; + +GVfsBackend *g_vfs_backend_google_new (void); This isn't needed.
(In reply to Ondrej Holy from comment #28) <snip> > Many applications will have to be fixed to use display name etc. I suppose > also fuse backend will be having problems with this backend... They would already have been broken reading recent:/// for example. I don't think that's our problem to solve, applications that don't use display name are already broken today.
(In reply to Ondrej Holy from comment #28) > Review of attachment 306309 [details] [review] [review]: Thanks for taking a look, Ondrej. > I'm still quite skeptic about the file name handling (i.e. lookaside cache, > volatile etc.). I see the reasons why you use it, but it brings a lot of > other problems. I think applications should use the gio api in same way for > all types of shares and should not care about volatile flags etc. In a sandboxed world, applications would be using cookies/blobs/IDs (like, portal://ID) instead of traditional paths to represent files. See https://wiki.gnome.org/Projects/SandboxedApps/DocumentPortal In such a world, the gtk+ file chooser and nautilus would be considered "special" in the sense that they would run outside the sandbox, and all other apps would only see the IDs. Therefore, the big percentage of applications won't have to deal with this sort of stuff - they will dealing with clearly separated display names and IDs. The actual code to create a file will be limited to the file chooser and the file manager. > There is a > lot of problematic cases for this solution... Therefor I would prefer > similar workflow as is in mtp backend, but ok... The realities of MTP are a bit different from Google Drive. A cloud storage will be more racy than a device that is attached to the computer and accessed (only?) through gvfs. One might be changing the cloud storage through a web browser on the same computer, or a different computer. Trying to invent our own persistent IDs and then mapping them back to the server-defined ones is just going to expose ourselves to more complicated code and data corruption bugs, in my opinion. Plus, the display names (or GDataEntry:title) don't work very well as pathname components. For instance, titles can have forward slashes, and the same folder can have multiple identical titles.
(In reply to Debarshi Ray from comment #21) > (In reply to Philip Withnall from comment #19) > > Review of attachment 306086 [details] [review] [review] [review]: > > ::: configure.ac > > @@ +418,3 @@ > > +msg_google=no > > +GOOGLE_LIBS= > > +GOOGLE_CFLAGS= > > > > No need to pre-define these; undefined variables are treated as empty by > > AC_SUBST(). > > > > @@ +420,3 @@ > > +GOOGLE_CFLAGS= > > + > > +if test "x$enable_google" != "xno" ; then > > > > Use AS_IF(). > > > > @@ +421,3 @@ > > + > > +if test "x$enable_google" != "xno" ; then > > + PKG_CHECK_EXISTS(goa-1.0 >= 3.17.1 libgdata >= 0.17.2, msg_google=yes) > > > > Please quote this lot: > > > > PKG_CHECK_EXISTS([goa-1.0 >= 3.17.1 libgdata >= 0.17.2],[msg_google=yes]) > > These follow the pattern in the rest of the file. I will fix these and add a > separate patch to address the existing instances. I filed bug 753838 for the build system fixes.
I had a look at current master of wip/rishi/goa. First of all, you can completely drop the "Support g_file_resolve_created_path()" commit. That should not be needed anymore. Secondly, I think the current handling of escaping etc is a bit loose. It needs to be made a bit more robust: Make the gdrive backend set some magic value of mount_info->prefered_filename_encoding, and in g_daemon_file_get_child_for_display_name() for that encoding, make sure to replace an inital "_" with "__". i.e. if the user types "_foo" in the save as dialog, it'll be created as "__foo" (and __foo as ___foo). This will also make nautilus correctly handle copying a file called "_foo" from local disk to gdrive. In all operations that can create a file/dir we should return G_IO_ERROR_INVALID_FILENAME if the user tries to create a file starting with a single underscore, unless the file already exists (which should then behave just like that operation normally handles existing files, be it EEXIST or replace the file). Right now this seems quite broken with unescape_basename_and_map_dirname(), it can create a volatile file starting with underscore that may conflict with later added real ids. Also, we need to properly handle new filenames that are not utf8. We can't just pass these on as a display name. We need to either return INVALID_FILENAME, or try to convert to utf8 in some way. I think INVALID_FILENAME is the best approach here. The mutex seems to be a bit heavy in places. Could we perhaps drop it during the upload in push()? It seems like we're calling rebuild_entries() a lot, which downloads the entire db. Is it possible to somehow do some kind of etags check to avoid this if nothing changed since last time? The monitoring seems a bit wrong. If you're watching "/a" you will get create events for "/aa", as well as "/a/b/c", neighter of which are right. Furthermore, it only gives events for local changes (although i guess that is better than nothing). For the shortcut work, i think this code is nice, but we may have to add code in gio to create a desktop link file when copying such files to local files. Also, to make that work i unfortunately think we need to keep the ".desktop" prefix in the copy name hack that you used to have. Otherwise the local links will not work after copy. Actually its unfortunate that we're using the .desktop prefix, as that makes it hard to tell link files from reall application desktop files. Maybe we should make up a separate prefix for gio, ".desktoplink" and add it to shared-mime-info. That would make it easier to get some support for this in gio without it conflicting with existing .desktop files.
talking about monitoring - it would be cool to add testcases for this sort of thing (/a vs /aa or /a/b/c) to testfilemonitor.c which I just started in gio/tests/
It looks to me like you can get all changes since a particular change id with this api: https://developers.google.com/drive/v2/reference/changes/list This should let us avoid downloading the entire db unnecessary. The simple approach is to just let use it to avoid downloading when nothing has changed, and the more complex approach is to actually download and apply the delta. The later could also give us change notifications. That said, we don't necessarily need this in the initial version.
Also, seems like we can support trashing: https://developers.google.com/drive/v2/reference/files/trash Although the trashed files would not appear in the gnome trashcan without some further work.
(In reply to Debarshi Ray from comment #21) > (In reply to Philip Withnall from comment #19) > > Review of attachment 306086 [details] [review] [review] [review]: > > @@ +91,3 @@ > > + > > + monitored_path = g_object_get_data (G_OBJECT (monitor), > > "g-vfs-backend-google-path"); > > + if (g_str_has_prefix (filename, monitored_path)) > > > > What if filename is /foo/bar/abcdef/blah and monitored_path is /foo/bar/abc? > > I think this check would pass incorrectly. > > This is basically the same logic as emit_event_internal in gvfsbackendmtp.c > If monitored_path is the directory at /foo/bar/abc then we want to emit the > event for /foo/bar/abcdef/blah, and if /foo/bar/abc is a regular file then > the situation won't arise in the first place. > > Note that we are only creating directory monitors at the moment, because, in > my understanding, those are more important for nautilus and gtk+ file > chooser. (In reply to Alexander Larsson from comment #32) > The monitoring seems a bit wrong. If you're watching "/a" you will get > create events for "/aa", as well as "/a/b/c", neighter of which are right. Ok, I was wrong. I completely overlooked the watching "/a" but getting events for "/aa" case, and I didn't know that directory monitors were not supposed to be recursive. I have now reworked emit_event_internal to address these issues. It is almost identical to the code in gvfsbackendmtp.c plus the escaping logic we have. > Furthermore, it only gives events for local changes (although i guess that > is better than nothing). The directory monitoring code is a remnant from my experiments around file and directory creation. I was trying to see if we can do something like the MTP backend to make things work. In the end we decided that it won't work. See bug 741602 I think that Drive Channels [1] can be used to implement proper monitoring. Historically, libgdata didn't have any monitoring API, so we need to implement it first. Do you want me to remove the current directory monitoring code? Or do you want to keep it and improve it later? [1] https://developers.google.com/drive/v2/reference/channels
Keep it for now, at least cross-app changes on the same machine will work right.
(In reply to Alexander Larsson from comment #32) > I had a look at current master of wip/rishi/goa. Thanks for the review, Alex! > First of all, you can completely drop the "Support > g_file_resolve_created_path()" commit. That should not be needed anymore. Dropped. > Secondly, I think the current handling of escaping etc is a bit loose. It > needs to be made a bit more robust: > > Make the gdrive backend set some magic value of > mount_info->prefered_filename_encoding, and in > g_daemon_file_get_child_for_display_name() for that encoding, make sure to > replace an inital "_" with "__". i.e. if the user types "_foo" in the save > as dialog, it'll be created as "__foo" (and __foo as ___foo). This will also > make nautilus correctly handle copying a file called "_foo" from local disk > to gdrive. > > In all operations that can create a file/dir we should return > G_IO_ERROR_INVALID_FILENAME if the user tries to create a file starting with > a single underscore, unless the file already exists (which should then > behave just like that operation normally handles existing files, be it > EEXIST or replace the file). Right now this seems quite broken with > unescape_basename_and_map_dirname(), it can create a volatile file starting > with underscore that may conflict with later added real ids. > > Also, we need to properly handle new filenames that are not utf8. We can't > just pass these on as a display name. We need to either return > INVALID_FILENAME, or try to convert to utf8 in some way. I think > INVALID_FILENAME is the best approach here. > > The mutex seems to be a bit heavy in places. Could we perhaps drop it during > the upload in push()? Ok, I will address these. > It seems like we're calling rebuild_entries() a lot, which downloads the > entire db. Is it possible to somehow do some kind of etags check to avoid > this if nothing changed since last time? Currently we cache the entries for 1 minute. We only rebuild the cache if a minute has passed or we can't find an entry in it. Something like the Changes API [1], as you pointed out, is obviously superior, but I don't think it was ever implemented in libgdata. I will check again. [1] https://developers.google.com/drive/v2/reference/changes/list > For the shortcut work, i think this code is nice, but we may have to add > code in gio to create a desktop link file when copying such files to local > files. Also, to make that work i unfortunately think we need to keep the > ".desktop" prefix in the copy name hack that you used to have. Otherwise the > local links will not work after copy. As discussed on IRC, Drive-native files are now reported as shortcuts. However, for the copy to local storage fallback code to work, their copy names have ".desktop" suffixes and read() returns the contents of a link-type desktop file. > Actually its unfortunate that we're > using the .desktop prefix, as that makes it hard to tell link files from > reall application desktop files. Maybe we should make up a separate prefix > for gio, ".desktoplink" and add it to shared-mime-info. That would make it > easier to get some support for this in gio without it conflicting with > existing .desktop files. Do you want to use a suffix other than .desktop right away? Or do you want to do it next cycle? I am a bit scared to introduce a new suffix this late in the cycle.
(In reply to Debarshi Ray from comment #38) > > It seems like we're calling rebuild_entries() a lot, which downloads the > > entire db. Is it possible to somehow do some kind of etags check to avoid > > this if nothing changed since last time? > > Currently we cache the entries for 1 minute. We only rebuild the cache if a > minute has passed or we can't find an entry in it. > > Something like the Changes API [1], as you pointed out, is obviously > superior, but I don't think it was ever implemented in libgdata. I will > check again. https://bugzilla.gnome.org/show_bug.cgi?id=654652
(In reply to Alexander Larsson from comment #35) > Also, seems like we can support trashing: > > https://developers.google.com/drive/v2/reference/files/trash > > Although the trashed files would not appear in the gnome trashcan without > some further work. I filed bug 753929 for the necessary libgdata work.
Created attachment 309832 [details] [review] Add GVfsBackendGoogle
(In reply to Alexander Larsson from comment #32) > Make the gdrive backend set some magic value of > mount_info->prefered_filename_encoding, and in > g_daemon_file_get_child_for_display_name() for that encoding, make sure to > replace an inital "_" with "__". i.e. if the user types "_foo" in the save > as dialog, it'll be created as "__foo" (and __foo as ___foo). This will also > make nautilus correctly handle copying a file called "_foo" from local disk > to gdrive. I have now added this. I didn't implement it earlier because I wanted to be sure that my escaping implementation vaguely resembled what we had earlier discussed on IRC. I noticed something weird though. All the backends set "" as their preferred encoding, and that causes g_file_get_child_for_display_name to (always?) fail because g_convert doesn't like "" as the to_codeset. I see that nautilus has some fallbacks to deal with these failures, where the last option is to simply use "g_file_get_child (file, display_name)". I guess there is some good reason for that. > In all operations that can create a file/dir we should return > G_IO_ERROR_INVALID_FILENAME if the user tries to create a file starting with > a single underscore, unless the file already exists (which should then > behave just like that operation normally handles existing files, be it > EEXIST or replace the file). Right now this seems quite broken with > unescape_basename_and_map_dirname(), it can create a volatile file starting > with underscore that may conflict with later added real ids. I didn't understand this part. unescape_basename_and_map_dirname and unescape_filename were not converting /__foo to /_foo till now because I had not implemented the corresponding client-side part based on preferred filename encoding. I have now updated them to turn /__foo to /_foo. What do you mean by "user tries to create a file starting with a single underscore"? If the user wants a file with "_foo" as the display name, then our escaping will turn it into "__foo" when the string crosses the UI / gio boundary, which will tell the backend that "__foo" stands for the display name "_foo" typed by the user. The backend should be able to do the right thing, no?
Created attachment 309835 [details] [review] Add GVfsBackendGoogle Fixed a crash when creating a top-level directory.
(In reply to Debarshi Ray from comment #42) > > I noticed something weird though. All the backends set "" as their preferred > encoding, and that causes g_file_get_child_for_display_name to (always?) > fail because g_convert doesn't like "" as the to_codeset. I see that > nautilus has some fallbacks to deal with these failures, where the last > option is to simply use "g_file_get_child (file, display_name)". I guess > there is some good reason for that. Once upon a time we wanted to do a better handling of filenames of remote sites. For instance, people used to have ftp sites with filenames in russian encodings and whatnot. The plan was to add per-mount preferences for the ftp backend so you could configure that. This never really happened because over time more and more places just were using utf8 which meant "doing nothing" worked fine, making this change less and less urgent. > > In all operations that can create a file/dir we should return > > G_IO_ERROR_INVALID_FILENAME if the user tries to create a file starting with > > a single underscore, unless the file already exists (which should then > > behave just like that operation normally handles existing files, be it > > EEXIST or replace the file). Right now this seems quite broken with > > unescape_basename_and_map_dirname(), it can create a volatile file starting > > with underscore that may conflict with later added real ids. > > I didn't understand this part. > > unescape_basename_and_map_dirname and unescape_filename were not converting > /__foo to /_foo till now because I had not implemented the corresponding > client-side part based on preferred filename encoding. I have now updated > them to turn /__foo to /_foo. > > What do you mean by "user tries to create a file starting with a single > underscore"? If the user wants a file with "_foo" as the display name, then > our escaping will turn it into "__foo" when the string crosses the UI / gio > boundary, which will tell the backend that "__foo" stands for the display > name "_foo" typed by the user. The backend should be able to do the right > thing, no? Yeah, if something propely uses the API a "_foo" in the file selector would turn into "__foo" in the gio call, and then back to "_foo" in the gdata call. However, if something is using this API improperly, and passes a basename of "_foo" into e.g. push() you can have two cases: a) There is already a document with the id "foo", thus you should return G_IO_ERROR_EXISTS. b) There is no such document id, and you create a new document which gets some other id, and then you add "_foo" to the lookaside cache pointing to this new id. Now you suddenly have a "_foo" in the lookaside cache which doesn't represent a read id "foo". What if one suddenly was to be created from another operation? In this case I think we should just fail the operation with invalid filename, to avoid potentially confusing the backend.
(In reply to Debarshi Ray from comment #38) > > Do you want to use a suffix other than .desktop right away? Or do you want > to do it next cycle? I am a bit scared to introduce a new suffix this late > in the cycle. Lets keep it as is for now, until we chose to make the more significant changes on the gio side to propely handle shortcuts on a lower level than nautilus.
Review of attachment 309835 [details] [review]: ::: client/gdaemonfile.c @@ +2341,3 @@ + { + if (g_str_has_prefix (display_name, "_")) + basename = g_strconcat ("_", display_name, NULL); Eh, don't you need an else case here that just uses the display name as-is?
(In reply to Debarshi Ray from comment #38) > > Currently we cache the entries for 1 minute. We only rebuild the cache if a > minute has passed or we can't find an entry in it. True, but the "can't find entry" case happens each time a new file is created, which will be a lot if you're doing e.g. a recursive copy.
Review of attachment 309835 [details] [review]: ::: daemon/gvfsbackendgoogle.c @@ +440,3 @@ + escaped_filename = escape_filename (filename); + target_uri = g_strdup_printf ("%s://%s%s", type, self->account_identity, escaped_filename); + g_file_info_set_symlink_target (info, target_uri); This doesn't look right. Symlinks are supposed to be pathnames, not uris. They can't ever point outside of the mount itself.
Review of attachment 309835 [details] [review]: I would like there to be a bit more comments about what kind of form all "path" arguments have. I.e. are they guaranteed to be ids, can have the last element volatile, etc. Also, i think the lookaside cache is way to brittle at the moment, as it is based on a pure string match. For example, given a file "/_id1/_id2/_id3" you can use multiple paths to reach is, such as "/title1/title2/title3", "/title1/_id2/_id3", "/_id1/title2/_id3", etc. I think the cache would be more robust if it was a mapping from parent id + basename to id. This makes the lookup a bit more complex, something like: char *path; resolve_path (const char *path) { if (path == "/") return "/"; parent_resolved_path = resolve_path (dirname (path)); parent_id = basename (parent_resolved_path); id = lookup (parent_id, basename(path)); if (id == NULL) return NULL; else return join (parent_resolved_path, id); } In fact, I think the code would probably be clearer if instead of using a combination of escaped and unescaped paths inside the functions you'd start each function with mapping the arguments to parent_id + basename (which can be a volatile or an escaped id) at the boundary and then avoid using any path-based operations at all in the "meat" of the implementations. ::: daemon/gvfsbackendgoogle.c @@ +806,3 @@ +{ + g_debug ("+ append_to: %s\n", filename); + g_vfs_job_failed (G_VFS_JOB (job), G_IO_ERROR, G_IO_ERROR_FAILED, _("append_to doesn't work")); G_IO_ERROR_NOT_SUPPORTED Seems like a better value in these cases. @@ +937,3 @@ + * lookaside cache at the end of the copy so that subsequent jobs + * can recognize it as a volatile path. + */ I still think this sounds risky. In the normal case the copy operation would use the copy name, and would avoid this case. If you did a copy from say "/_id1/_id2" into the "/_id3" folder the wrong way, so dest="/_id3/_id2" they I think we should give an invalid-filename error rather than making up a volatile path that looks like a non-volatile to avoid any later confusion. @@ +1057,3 @@ + } + + self->entries_stale = TRUE; Is there a particular reason to make all entries stale after creating a new one? We do have the new entry in our table, and i can't see anything else that would make it more likely some other entry appeared just because we created one? @@ +2291,3 @@ + g_hash_table_insert (self->entries, g_strdup (new_id), g_object_ref (new_entry)); + + g_vfs_job_set_display_name_set_new_path (job, filename); This is one of the few places we get to pass back a new filename. We should make sure this is always a non-volatile path.
(In reply to Alexander Larsson from comment #44) > (In reply to Debarshi Ray from comment #42) > > > > I noticed something weird though. All the backends set "" as their preferred > > encoding, and that causes g_file_get_child_for_display_name to (always?) > > fail because g_convert doesn't like "" as the to_codeset. I see that > > nautilus has some fallbacks to deal with these failures, where the last > > option is to simply use "g_file_get_child (file, display_name)". I guess > > there is some good reason for that. > > Once upon a time we wanted to do a better handling of filenames of remote > sites. For instance, people used to have ftp sites with filenames in russian > encodings and whatnot. The plan was to add per-mount preferences for the ftp > backend so you could configure that. This never really happened because over > time more and more places just were using utf8 which meant "doing nothing" > worked fine, making this change less and less urgent. Right. I remember this from past conversations. I was wondering more about why we don't leave NULL as the preferred encoding because that will automatically fall back to using the display-name for the child. I assume there is good reason for that because I see that the current fallbacks for g_file_get_child_for_display_name returning NULL in nautilus are not a trivial call to g_file_get_child (file, display_name). > > > In all operations that can create a file/dir we should return > > > G_IO_ERROR_INVALID_FILENAME if the user tries to create a file starting with > > > a single underscore, unless the file already exists (which should then > > > behave just like that operation normally handles existing files, be it > > > EEXIST or replace the file). Right now this seems quite broken with > > > unescape_basename_and_map_dirname(), it can create a volatile file starting > > > with underscore that may conflict with later added real ids. > > > > I didn't understand this part. > > > > unescape_basename_and_map_dirname and unescape_filename were not converting > > /__foo to /_foo till now because I had not implemented the corresponding > > client-side part based on preferred filename encoding. I have now updated > > them to turn /__foo to /_foo. > > > > What do you mean by "user tries to create a file starting with a single > > underscore"? If the user wants a file with "_foo" as the display name, then > > our escaping will turn it into "__foo" when the string crosses the UI / gio > > boundary, which will tell the backend that "__foo" stands for the display > > name "_foo" typed by the user. The backend should be able to do the right > > thing, no? > > Yeah, if something propely uses the API a "_foo" in the file selector would > turn into "__foo" in the gio call, and then back to "_foo" in the gdata > call. However, if something is using this API improperly, and passes a > basename of "_foo" into e.g. push() you can have two cases: > > a) There is already a document with the id "foo", thus you should return > G_IO_ERROR_EXISTS. > b) There is no such document id, and you create a new document which gets > some other id, and then you add "_foo" to the lookaside cache pointing to > this new id. > > Now you suddenly have a "_foo" in the lookaside cache which doesn't > represent a read id "foo". What if one suddenly was to be created from > another operation? > > In this case I think we should just fail the operation with invalid > filename, to avoid potentially confusing the backend. Ok. This makes sense. I will address this. (In reply to Alexander Larsson from comment #48) > Review of attachment 309835 [details] [review] [review]: > > ::: daemon/gvfsbackendgoogle.c > @@ +440,3 @@ > + escaped_filename = escape_filename (filename); > + target_uri = g_strdup_printf ("%s://%s%s", type, > self->account_identity, escaped_filename); > + g_file_info_set_symlink_target (info, target_uri); > > This doesn't look right. Symlinks are supposed to be pathnames, not uris. > They can't ever point outside of the mount itself. Ok. Fixed (In reply to Alexander Larsson from comment #46) > Review of attachment 309835 [details] [review] [review]: > > ::: client/gdaemonfile.c > @@ +2341,3 @@ > + { > + if (g_str_has_prefix (display_name, "_")) > + basename = g_strconcat ("_", display_name, NULL); > > Eh, don't you need an else case here that just uses the display name as-is? Oops! Yes, we do. I think I never tested the non-underscore case after writing this code.
(In reply to Debarshi Ray from comment #50) > > I was wondering more about why we don't leave NULL as the preferred encoding > because that will automatically fall back to using the display-name for the > child. I assume there is good reason for that because I see that the current > fallbacks for g_file_get_child_for_display_name returning NULL in nautilus > are not a trivial call to g_file_get_child (file, display_name). I wonder if the reason is accidental as we pass the string via dbus which can't pass NULL strings. I guess we should fix this.
(In reply to Ondrej Holy from comment #28) > Review of attachment 306309 [details] [review] [review]: > It would be nice to reorder the source code e.g. to have methods > open_for_read, open_icon_for_read, read, seek_on_read, close_read in same > place in the code etc. I have rearranged some of the vfunc implementations. Let me know if you want to shuffle them around a bit more. > There are some other observations from a first look: > > ::: daemon/gvfsbackendgoogle.c > @@ +927,3 @@ > + cancellable, > + &error); > + if (error != NULL) > > You should also take care about G_FILE_COPY_OVERWRITE flag and return > correct error messages for specific cases (i.e. G_IO_ERROR_EXISTS, > G_IO_ERROR_IS_DIRECTORY, G_IO_ERROR_WOULD_MERGE). Ok, I will take a look. > @@ +1421,3 @@ > + g_debug ("+ mount\n"); > + > + if (self->client == NULL) > > self->client should be always NULL > > @@ +1459,3 @@ > + authorizer = gdata_goa_authorizer_new (object); > + > + if (self->service != NULL) > > This should never happen... Fixed both. > @@ +1610,3 @@ > + auth_domain = > gdata_documents_service_get_primary_authorization_domain (); > + uri = gdata_entry_get_content_uri (entry); > + stream = gdata_download_stream_new (GDATA_SERVICE (self->service), > auth_domain, uri, cancellable); > > You should check the result as you do in open_icon_for_read. Done. > @@ +2110,3 @@ > + new_id = gdata_entry_get_id (new_entry); > + g_hash_table_insert (self->entries, g_strdup (new_id), g_object_ref > (new_entry)); > + > > You should probably emit monitor event here... Ok. Since we are only changing the display name, I guess G_FILE_MONITOR_EVENT_ATTRIBUTE_CHANGED would be right thing to do here? > @@ +2169,3 @@ > + cancellable, > + read_cb, > + g_object_ref (job)); > > g_object_ref seems to me to be redundant > > @@ +2248,3 @@ > + > +GVfsBackend * > +g_vfs_backend_google_new (void) > > This isn't needed. > > ::: daemon/gvfsbackendgoogle.h > @@ +56,3 @@ > +GType g_vfs_backend_google_get_type (void) G_GNUC_CONST; > + > +GVfsBackend *g_vfs_backend_google_new (void); > > This isn't needed. Removed all of them.
(In reply to Alexander Larsson from comment #49) > Review of attachment 309835 [details] [review] [review]: > ::: daemon/gvfsbackendgoogle.c > @@ +806,3 @@ > +{ > + g_debug ("+ append_to: %s\n", filename); > + g_vfs_job_failed (G_VFS_JOB (job), G_IO_ERROR, G_IO_ERROR_FAILED, > _("append_to doesn't work")); > > G_IO_ERROR_NOT_SUPPORTED Seems like a better value in these cases. Those are a remnant from the initial skeleton. G_IO_ERROR_FAILED helped me figure out which operation is being called in what order in response to various applications. I have removed the stub append_to because the Drive API does not have native support for it. The job will now emit G_IO_ERROR_NOT_SUPPORTED due to the missing implementation. Create() is now implemented, but gedit is exclusively using g_file_replace (via gtksourceview), so I need to find some real world example. Luckily it shares some code with replace(). > @@ +1057,3 @@ > + } > + > + self->entries_stale = TRUE; > > Is there a particular reason to make all entries stale after creating a new > one? We do have the new entry in our table, and i can't see anything else > that would make it more likely some other entry appeared just because we > created one? No, I will remove those. It is a remnant from the initial versions when I was rebuilding the cache much more aggressively. > @@ +2291,3 @@ > + g_hash_table_insert (self->entries, g_strdup (new_id), g_object_ref > (new_entry)); > + > + g_vfs_job_set_display_name_set_new_path (job, filename); > > This is one of the few places we get to pass back a new filename. We should > make sure this is always a non-volatile path. Done.
(In reply to Alexander Larsson from comment #49) > Review of attachment 309835 [details] [review] [review]: > > [...] > > Also, i think the lookaside cache is way to brittle at the moment, as it is > based on a pure string match. For example, given a file "/_id1/_id2/_id3" > you can use multiple paths to reach is, such as "/title1/title2/title3", > "/title1/_id2/_id3", "/_id1/title2/_id3", etc. I think the cache would be > more robust if it was a mapping from parent id + basename to id. This makes > the lookup a bit more complex, something like: > > char *path; > resolve_path (const char *path) > { > if (path == "/") > return "/"; > > parent_resolved_path = resolve_path (dirname (path)); > parent_id = basename (parent_resolved_path); > > id = lookup (parent_id, basename(path)); > if (id == NULL) > return NULL; > else > return join (parent_resolved_path, id); > } > > In fact, I think the code would probably be clearer if instead of using a > combination of escaped and unescaped paths inside the functions you'd start > each function with mapping the arguments to parent_id + basename (which can > be a volatile or an escaped id) at the boundary and then avoid using any > path-based operations at all in the "meat" of the implementations. See below. > @@ +937,3 @@ > + * lookaside cache at the end of the copy so that subsequent jobs > + * can recognize it as a volatile path. > + */ > > I still think this sounds risky. In the normal case the copy operation would > use the copy name, and would avoid this case. > If you did a copy from say "/_id1/_id2" into the "/_id3" folder the wrong > way, so dest="/_id3/_id2" they I think we should give an invalid-filename > error rather than making up a volatile path that looks like a non-volatile > to avoid any later confusion. This is based on what nautilus is doing. When copying within the Drive mount it always does /_id1/_id2 -> /_id10/_id2. ie. it doesn't use the copy-name, and this matches with the documentation. It says: "This is useful if you want to copy the file to *another filesystem* that might have a different encoding." I *think* this issue has an impact on the structure of the lookaside cache.
(In reply to Debarshi Ray from comment #50) > (In reply to Alexander Larsson from comment #44) > > Yeah, if something propely uses the API a "_foo" in the file selector would > > turn into "__foo" in the gio call, and then back to "_foo" in the gdata > > call. However, if something is using this API improperly, and passes a > > basename of "_foo" into e.g. push() you can have two cases: > > > > a) There is already a document with the id "foo", thus you should return > > G_IO_ERROR_EXISTS. > > b) There is no such document id, and you create a new document which gets > > some other id, and then you add "_foo" to the lookaside cache pointing to > > this new id. > > > > Now you suddenly have a "_foo" in the lookaside cache which doesn't > > represent a read id "foo". What if one suddenly was to be created from > > another operation? > > > > In this case I think we should just fail the operation with invalid > > filename, to avoid potentially confusing the backend. > > Ok. This makes sense. I will address this. I have now fixed push() to return INVALID_FILENAME for bogus names passed to gio, and implemented overwrites. The code is in the wip/rishi/goa branch. (I don't like the blocks of escaping / resolving code that gets repeated in all the vfuncs, but, I guess, that will be better once we nail down the structure of the lookaside cache, etc. (ie. comment 54).)
(In reply to Debarshi Ray from comment #52) > (In reply to Ondrej Holy from comment #28) > > Review of attachment 306309 [details] [review] [review] [review]: > > > It would be nice to reorder the source code e.g. to have methods > > open_for_read, open_icon_for_read, read, seek_on_read, close_read in same > > place in the code etc. > > I have rearranged some of the vfunc implementations. Let me know if you want > to shuffle them around a bit more. It looks much better now, thanks. Exact order isn't demanded, but it is handy to have some functions together when debugging... > > @@ +2110,3 @@ > > + new_id = gdata_entry_get_id (new_entry); > > + g_hash_table_insert (self->entries, g_strdup (new_id), g_object_ref > > (new_entry)); > > + > > > > You should probably emit monitor event here... > > Ok. Since we are only changing the display name, I guess > G_FILE_MONITOR_EVENT_ATTRIBUTE_CHANGED would be right thing to do here? It is pretty specific here, because uri is still same, thus _ATTRIBUTE_CHANGED seems to me to be good choice here...
I took a look at the latest branch. Overall this looks much cleaner. Some comments: remove_entry() - This doesn't correctly add back previous conflicts which were ignored to dir_entries resolve_dir() returns ERROR_NOT_DIRECTORY if a parent doesn't exists. This is not quite right for most users. Take make_directory for instance, its supposed to return G_IO_ERROR_NOT_FOUND in this case. NOT_DIRECTORY is only for cases where the specified file must be a directory. You can remove self->lookaside. The order of the error handling is a bit wrong. For instance, in copy() if the source is not found we should report NOT_FOUND independent of the destination, but in the needs_rebuild=TRUE case we could report errors wrt the destination first. See the API docs for g_file_copy() for the exact semantics required. Maybe we should only fail for G_FILE_COPY_OVERWRITE in the actual case where there is a destination, rather than just if the flag is set. How exactly does copy work. It sets the "id" for the newly created entry to the source id. But this creates a new entry right? I.e. the new entry will have a different id. If so, i think the title handling is a bit wrong. The final result will have some unknown new basename, so whatever basename the user passed in must be the new volatile basename, and thus it has to be the title. In the case where we use the copy name this works out fine, but in the case where the target is a pure copy of a source id the code now uses the title of the old id as the destination title, which causes this to get the wrong volatile path. Or am i missing something? There are similar issus in make_directory. It must use the passed in basename as the title, because the final id is not known, and what the user passed in as basename must map back to the newly created folder. This works fine in the normal case, but the current code uses an existing title instead of the basename if the basename exists as a previous id.
(In reply to Alexander Larsson from comment #57) Thanks for the review. Repeating a few things from IRC for those who are lurking. > remove_entry() - This doesn't correctly add back previous conflicts which > were ignored to dir_entries I added a flat GList to track the skipped GDataEntries. We could use something fancier like a "(title_or_id, parent_id) -> GList", but this might be good enough because we use it only when deleting. > resolve_dir() returns ERROR_NOT_DIRECTORY if a parent doesn't exists. This > is not quite right for most users. > Take make_directory for instance, its supposed to return > G_IO_ERROR_NOT_FOUND in this case. NOT_DIRECTORY is > only for cases where the specified file must be a directory. We decided to leave it as it is after some discussion on IRC. > You can remove self->lookaside. Removed. > The order of the error handling is a bit wrong. For instance, in copy() if > the source is not found we > should report NOT_FOUND independent of the destination, but in the > needs_rebuild=TRUE case we could > report errors wrt the destination first. See the API docs for g_file_copy() > for the exact semantics required. Done. > Maybe we should only fail for G_FILE_COPY_OVERWRITE in the actual case where > there is a destination, rather > than just if the flag is set. Done. > How exactly does copy work. It sets the "id" for the newly created > entry to the source id. But this creates a new entry right? I.e. the > new entry will have a different id. If so, i think the title handling > is a bit wrong. The final result will have some unknown new basename, > so whatever basename the user passed in must be the new volatile > basename, and thus it has to be the title. In the case where we use > the copy name this works out fine, but in the case where the target is > a pure copy of a source id the code now uses the title of the old id > as the destination title, which causes this to get the wrong volatile path. > Or am i missing something? As discussed on IRC, I did this to make the UIs look nicer for the /id1/id2 -> /id10/id2 case, so that the destination's display-name is title(id2). I emit a CREATE event with the real path to work around the broken volatile paths. I have now changed this to do the right (but ugly thing). ie. the destination's display-name will be id2, and volatile paths would work. We can probably use Drive properties [1] to tag the source ID to the GDataEntry on the server-side, and then dir_entries can use the original ID alongwith the actual ID and title. [1] https://developers.google.com/drive/v2/reference/properties > There are similar issus in make_directory. It must use the passed in > basename as the title, because the final id is not known, and what the > user passed in as basename must map back to the newly created > folder. This works fine in the normal case, but the current code uses > an existing title instead of the basename if the basename exists as a > previous id. Oh, yes, trying to prettify this wouldn't have worked anyway because recursive directory copies really need volatile paths to work. I was stupid.
Created attachment 310488 [details] [review] Add GVfsBackendGoogle This compiles, but I haven't tested the code after making the changes.
(In reply to Alexander Larsson from comment #32) > The mutex seems to be a bit heavy in places. Could we perhaps drop it during > the upload in push()? I am worried about the insert_entry after the upload. Suppose another operation changes the hashtables by calling rebuild_entries. It appeared safer to disallow that.
Created attachment 310597 [details] [review] Add GVfsBackendGoogle * Added some more g_debug logging in places * Always replace our the old GDataEntry with the new one returned by the server, so that we don't miss ETag updates and other changes made behind our back
Created attachment 310673 [details] [review] Add GVfsBackendGoogle * Added logic to de-duplicate copy-names because Drive allows duplicate titles in the same folder. * Don't use g_object_set_data during writes. Setting the stream on the GDataEntry sounds risky because the GDataEntry will be re-used across all operations for the lifetime of the mount. * Make query_info_on_{read, write} check whether the read/write was started with a volatile path and accordingly set it on the GFileInfo.
At this point I think the backend is pretty clean. It has some unfortunate behaviour issues due to the db backing (such as it not inheriting titles right if you do a in-fs copy operations), but these will require larger changes to the platform later to properly solve. There are also some inefficiencies in how it accesses the google apis (keeps downloading everyting), but these can be incrementally fixed as we go on, and should not block people being able to use this. Enabling this is optional, so is using it, and it doesn't affect anything else. So, I'd say we should push this to master now and announce it as some kind of preview support in gnome 3.18, to set expectations correctly.
I guess we might also need some nautilus fixes?
*** Bug 754573 has been marked as a duplicate of this bug. ***
Created attachment 310843 [details] [review] Add GVfsBackendGoogle Ondrej pointed out that some folders in his Drive don't have empty "parents" arrays in the JSON. So, let's handle them. In future we could mark these things as HIDDEN because they appear to have originated from GMail attachments.
Created attachment 310850 [details] [review] Add GVfsBackendGoogle As suggested by Ondrej, let's not set the size for Drive-native files since those don't report a meaningful size.
Created attachment 310880 [details] [review] Add GVfsBackendGoogle I add some error sanitizing code that maps GData errors to their corresponding GIO counterparts. The mapping is not exhaustive, but we can keep adding to it as we go along.
Created attachment 310892 [details] [review] Add GVfsBackendGoogle * Punctuation fix * Address warnings not triggered on F21 that were caught by Ondrej
Review of attachment 310892 [details] [review]: Lets get this in now.
Review of attachment 291030 [details] [review]: Looks good to me...
Created attachment 310907 [details] [review] Add GVfsBackendGoogle Replaced a few new strings with already existing ones to avoid breaking string freeze.
Comment on attachment 291030 [details] [review] goa: Pick up OAuth2-based accounts with a GoaFiles interface Pushed to master.
Created attachment 310910 [details] [review] Add GVfsBackendGoogle Pushed after updating the libgdata and glib requirements to 0.17.3 and 2.45.7 respectively.
Thanks for all the reviews, help and feedback!