GNOME Bugzilla – Bug 780613
Improve thumbnailing of Google and OneDrive entries
Last modified: 2021-07-05 11:31:07 UTC
Our current approach to thumbnailing GoogleDocuments is to download the server generated thumbnails in the createThumbnail virtual method. SkydriveDocuments are not thumbnailed because historically we were unable to fetch the server-side thumbnails due to limitations of OneDrive's REST API. Newer versions of the REST API do support it [1], but it needs to be implemented in libzapojit. Regardless of whether we can download thumbnails or not, I think we can optimize our use of the network: - Create the thumbnail from the EvDocument when it is loaded for previewing - Use a cached copy of the document as source for the thumbnail As a nice side-effect SkydriveDocuments will also get thumbnailed if the user previews/previewed them. As a proof of concept, and to get OneDrive thumbnails working in RHEL, I hacked up a couple of patches - one against gnome-online-miners (it's just a bug-fix specific to the Zpj miner) and another against gnome-documents (which is where the real thing is). It is currently restricted to SkydriveDocuments, but I think we should extend it to other remote items too. Comments welcome! :) https://dev.onedrive.com/items/thumbnails.htm
Created attachment 348827 [details] [review] [gnome-online-miners] zpj: Use an HTTP URL as nie:url
Created attachment 348828 [details] [review] [gnome-documents] documents: Thumbnail SkydriveDocuments once they are loaded
Review of attachment 348828 [details] [review]: The code looks mostly OK and I like the idea of using the cached copy if we can, but I don't really like the approach. Why are you manually running effectively what the evince thumbnailer would do? Ideally we would instead just use the common thumbnail code. ::: src/documents.js @@ +1078,3 @@ + let thumbnailFile = Gio.File.new_for_path(thumbnailPath); + + let thumbnailDir = GLib.path_get_dirname(thumbnailPath); Nitpick, but you could just use thumbnailFile.get_parent() @@ +1107,3 @@ + let rc = EvDocument.RenderContext.new(page, 0, zoom); + let pixbuf = evDoc.get_thumbnail(rc); + Single quotes for the string
Review of attachment 348827 [details] [review]: Is the URL generated this way actually accessible? If so, this patch looks good. Otherwise, if the thumbnail machinery does not care about it being accessible, why does it care about it not being "windows-live:skydrive:<id>"?
(In reply to Cosimo Cecchi from comment #4) > Review of attachment 348827 [details] [review] [review]: > > Is the URL generated this way actually accessible? If so, this patch looks > good. No, it isn't. It is a fake URL. Although, it seems that OneDrive offers a real URL that can be used to open the entry in a web browser, but we will have to parse it in libzapojit. > Otherwise, if the thumbnail machinery does not care about it being > accessible, why does it care about it not being "windows-live:skydrive:<id>"? We run g_file_query_info (file, G_FILE_ATTRIBUTE_THUMBNAIL_PATH, ...) on the URI. GFile won't know what to do with it if the URI doesn't have a scheme that's supported by GIO/GVfs.
(In reply to Cosimo Cecchi from comment #3) > Review of attachment 348828 [details] [review] [review]: > Why are you manually running effectively what the evince thumbnailer would > do? Ideally we would instead just use the common thumbnail code. I went back and forth on this. The evince-thumbnailer and the GnomeDesktop thumbnailing interface require a URI, whereas to implement the first optimization (ie. creating a thumbnail directly from the EvDocument after loading it) we need to use a EvDocument. We can ask evince-thumbnailer to re-load it from the cache, but we will lose niceties like thumbnailing encrypted documents. The actual thumbnailing code was less than 10 lines so I went for it. Either way, I think we can let gnome_desktop_thumbnail_factory_save_thumbnail save the GdkPixbuf for us. That will reduce quite a bit of code on our end. For the second optimization, we can definitely let GnomeDesktop handle it. That will require changes to gd_queue_thumbnail_job_for_file_async. This was just a quick downstream fix that I wanted to be as self-contained as possible. > ::: src/documents.js > @@ +1078,3 @@ > + let thumbnailFile = Gio.File.new_for_path(thumbnailPath); > + > + let thumbnailDir = GLib.path_get_dirname(thumbnailPath); > > Nitpick, but you could just use thumbnailFile.get_parent() Even better, this code can be removed! :) (see above)
(In reply to Debarshi Ray from comment #5) > (In reply to Cosimo Cecchi from comment #4) > > Review of attachment 348827 [details] [review] [review] [review]: > > > > Is the URL generated this way actually accessible? If so, this patch looks > > good. > > No, it isn't. It is a fake URL. Although, it seems that OneDrive offers a > real URL that can be used to open the entry in a web browser, but we will > have to parse it in libzapojit. > > > Otherwise, if the thumbnail machinery does not care about it being > > accessible, why does it care about it not being "windows-live:skydrive:<id>"? > > We run g_file_query_info (file, G_FILE_ATTRIBUTE_THUMBNAIL_PATH, ...) on the > URI. GFile won't know what to do with it if the URI doesn't have a scheme > that's supported by GIO/GVfs. Ah, I see... If we can add support for libzapojit to know the actual presentation URL, that would make it a bit less of a hack IMO.
(In reply to Debarshi Ray from comment #6) > I went back and forth on this. > > The evince-thumbnailer and the GnomeDesktop thumbnailing interface require a > URI, whereas to implement the first optimization (ie. creating a thumbnail > directly from the EvDocument after loading it) we need to use a EvDocument. > We can ask evince-thumbnailer to re-load it from the cache, but we will lose > niceties like thumbnailing encrypted documents. The actual thumbnailing code > was less than 10 lines so I went for it. Either way, I think we can let > gnome_desktop_thumbnail_factory_save_thumbnail save the GdkPixbuf for us. > That will reduce quite a bit of code on our end. OK... It sounds like it may be worth adding an API to gnome-desktop to run the thumbnailing job for a specific URI that is different than the actual file path then, if we want to avoid thumbnailing the file directly here?
Created attachment 349260 [details] [review] [gnome-documents] documents: Thumbnail SkydriveDocuments once they are loaded Rebased.
(In reply to Cosimo Cecchi from comment #8) > (In reply to Debarshi Ray from comment #6) >> The evince-thumbnailer and the GnomeDesktop thumbnailing interface require a >> URI, whereas to implement the first optimization (ie. creating a thumbnail >> directly from the EvDocument after loading it) we need to use a EvDocument. >> We can ask evince-thumbnailer to re-load it from the cache, but we will lose >> niceties like thumbnailing encrypted documents. The actual thumbnailing code >> was less than 10 lines so I went for it. Either way, I think we can let >> gnome_desktop_thumbnail_factory_save_thumbnail save the GdkPixbuf for us. >> That will reduce quite a bit of code on our end. > > OK... It sounds like it may be worth adding an API to gnome-desktop to run > the thumbnailing job for a specific URI that is different than the actual > file path then, if we want to avoid thumbnailing the file directly here? Umm... sorry, I wasn't clear. :) These patches add two new thumbnailing code paths. The first one is about using the EvDocument that's created as a side-effect of loading a PDF/PS/XPS/... for previewing. This addresses two scenarios: (a) we can thumbnail encrypted documents which is hard or impossible otherwise, (b) if we had previously failed to thumbnail the document then it gives us a chance to address that. However, this code path takes a EvDocument as input, not an URI. So we can't really use GnomeDesktopThumbnailFactory to *create* the PNG. The thumbnailers are out-of-process, and I doubt we can marshall a EvDocument across process boundaries. However, we can use use GnomeDesktopThumbnailFactory to *save* the thumbnail because gnome_desktop_thumbnail_factory_save_thumbnail is a separate API from gnome_desktop_thumbnail_factory_generate_thumbnail. Sadly, GnomeDesktopThumbnailFactory doesn't have async APIs. Bastien said that he is overhauling the thumbnailing APIs doesn't want to add further code to GnomeDesktopThumbnailFactory. So we might have to live with a C wrapper in gnome-documents. The second code path that these patches add is to use the cache to avoid downloading thumbnails over the network. This might be able to use GnomeDesktopThumbnailFactory to even *create* the PNGs.
(In reply to Debarshi Ray from comment #10) > Umm... sorry, I wasn't clear. :) > > These patches add two new thumbnailing code paths. > > The first one is about using the EvDocument that's created as a side-effect > of loading a PDF/PS/XPS/... for previewing. This addresses two scenarios: > (a) we can thumbnail encrypted documents which is hard or impossible > otherwise, (b) if we had previously failed to thumbnail the document then it > gives us a chance to address that. However, this code path takes a > EvDocument as input, not an URI. So we can't really use > GnomeDesktopThumbnailFactory to *create* the PNG. The thumbnailers are > out-of-process, and I doubt we can marshall a EvDocument across process > boundaries. However, we can use use GnomeDesktopThumbnailFactory to *save* > the thumbnail because gnome_desktop_thumbnail_factory_save_thumbnail is a > separate API from gnome_desktop_thumbnail_factory_generate_thumbnail. I see, thanks for taking the time to explain it! I probably left that comment in a hurry and did not make myself very clear either :-) About your point a), I actually don't think that we should ever try to show a thumbnail for encrypted documents. If a document is encrypted, it's IMO most prudent to never display any of its contents at any resolution. What if, for example, the encrypted document is a presentation with one of its pages being a picture that should remain protected? A thumbnail could potentially have enough resolution to give it away, so I'm leaning towards not taking that risk. Point b) is a good use case, but would that not be addressed by trying to run the thumbnailer again on the document? Or is there something inherent to our loading process that makes it work in a case where the thumbnailer would always fail? > The second code path that these patches add is to use the cache to avoid > downloading thumbnails over the network. This might be able to use > GnomeDesktopThumbnailFactory to even *create* the PNGs. Yeah, that's the case I was referring to in my comment. The EvDocument object that we have in this case is still backed by a local path on disk, so it sounds feasible to just run the regular thumbnailer on it if we have a way to fetch the thumbnail for the local path. (Either by specifically looking for a thumbnail for the cache path, or by having a way to save a thumbnail for a different path/URI than the path it was loaded from when making it.)
(In reply to Cosimo Cecchi from comment #11) > About your point a), I actually don't think that we should ever try to show > a thumbnail for encrypted documents. If a document is encrypted, it's IMO > most prudent to never display any of its contents at any resolution. What > if, for example, the encrypted document is a presentation with one of its > pages being a picture that should remain protected? A thumbnail could > potentially have enough resolution to give it away, so I'm leaning towards > not taking that risk. That is a very good point that I hadn't thought of. At this point, this patch set is blocking on libzapojit work to parse an actual URL for OneDrive documents (see comment 7).
Created attachment 372578 [details] [review] [gnome-documents] documents: Thumbnail PDF SkydriveDocuments once loaded I had to rebase my downstream copy of this patch. So I took the opportunity to make the following changes: * I took Cosimo's advice and removed the bits that caused password protected PDFs to be thumbnailed. * It still doesn't use gnome_desktop_thumbnail_factory_save_thumbnail because we need our own C asynchronous wrapper. This is a TODO. * It still has _createThumbnailFromEvDocument because the generic gd_queue_thumbnail_job_for_file_async cannot use a separate cached version as the source. The 'file' argument is used as the source and it's URI is used to generate the location of the thumbnail. This means we are only thumbnailing PDFs at the moment. This is a TODO.
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/gnome-documents/-/issues/ Thank you for your understanding and your help.