Bug 597017 - provide the ability to remove metadata based on recency
Status: RESOLVED OBSOLETE
Product: gvfs
Classification: Core
Component: metadata
Version: 1.4.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: gvfs-maint
QA Contact: gvfs-maint
Duplicates: 325976 529935 590541
Depends on:
Blocks:
Reported: 2009-10-01 16:22 UTC by Josh Triplett
Modified: 2018-09-21 16:54 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Josh Triplett 2009-10-01 16:22:42 UTC
[Reporting this against component "general" since no component seems to exist for metadata.]

gvfs stores metadata about files in files under ~/.local/share/gvfs-metadata/.  This represents a privacy issue, and unlike the recent documents list, gvfs does not provide an obvious way to observe or clear this metadata.
Comment 1 Alexander Larsson 2009-10-02 09:16:27 UTC
It's easy to observe by using gvfs-info or gvfs-ls. You can set it with gvfs-set-attribute.

What kind of security issue is this? You mean being able to read the custom icon or the emblems you set on a file is a privacy issue?
Comment 2 Alexander Larsson 2009-10-02 09:25:50 UTC
I guess I can see one possible leak: if you have a sensitive filename and then delete the file without using gio, there could be a leftover reference to the filename. Is this what you worry about?
Comment 3 Christian Kellner 2009-10-02 09:29:24 UTC
Moving to the newly created metadata component.
Comment 4 Josh Triplett 2009-10-02 09:47:45 UTC
Yes, the list of filenames itself seems like the primary concern, for the same reasons as the recent document list.  The metadata itself could also prove potentially sensitive depending on what programs store things in it, though for files that still exist on the system that doesn't seem nearly as problematic.

"gvfs-info" and "gvfs-ls" don't come anywhere close to "easy to observe".  Consider the average user here, who knows just enough to consider functions like GNOME's "Clear Recent Documents" and Firefox's "Clear Recent History".  The existence of yet another similar type of data does not prove obvious.

I only noticed the files in ~/.local/share/gvfs-metadata/ because I keep my home directory in Git and didn't have that in .gitignore, so it showed up in "git status".  I doubt I would have noticed it otherwise unless I went spelunking in my home directory's dotfiles.

As a long-term solution, I'd love to see this information integrated together with "Clear Recent Documents" into something like Firefox's "Clear Recent History".
Comment 5 Alexander Larsson 2009-10-02 10:44:38 UTC
What exactly do you want such a clean operation to do though? Delete all the user's manually added metadata? It's not uncommon for metadata to be stored for files that are not currently mounted, so we don't always know whether a file has been deleted or not.
Comment 6 Christian Kellner 2009-10-02 11:13:28 UTC
What we could do for such a "Clear Metadata" operation is check whether all files we have stored metadata for *locally* still exist. For removable media we could present a dialog and ask whether one wants to purge the metadata for that medium entirely. Not sure if that makes sense though.
Comment 7 Alexander Larsson 2009-10-02 13:30:51 UTC
That is not 100% possible: we don't store metadata separately per mount point unless a) udev & co. is in use and b) we find a label or UUID for the mount. So, for instance, files on a remote NFS mount may still end up in the global "root" tree. However, for the "home" tree we could certainly do this.
Comment 8 Josh Triplett 2009-10-02 16:55:09 UTC
(In reply to comment #5)
> What exactly do you want such a clean operation to do though? Delete all the
> user's manually added metadata?

If you have timestamps, you could delete all recently added metadata, for some sensible values of "recent"; Firefox's dialog supports "last hour", "last two hours", "last four hours", "today", and "everything", with big warnings displayed if you choose "everything".  I think that seems fairly sensible.

> It's not uncommon for metadata to be stored for
> not currently mounted files, so we don't always know if a file has been deleted
> or not.

Metadata for files on a no-longer-mounted filesystem may well qualify as something the user wants to delete as well.

But OK, trying to delete stuff for deleted files may not work then.  Perhaps stick with just the "recent" idea; I think that'll cover the standard use cases for this feature.
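
A minimal sketch of the "recent" selection described above, assuming each metadata entry carries a timestamp for when it was set (the thread only says "if you have timestamps", so this is an assumption about the store, not a description of gvfs). The names `cutoff_for` and `split_by_recency` and the in-memory `entries` mapping are hypothetical.

```python
from datetime import datetime, timedelta

# Window names taken from the Firefox dialog mentioned above.
WINDOWS = {
    "last hour": timedelta(hours=1),
    "last two hours": timedelta(hours=2),
    "last four hours": timedelta(hours=4),
}


def cutoff_for(window, now):
    """Return the earliest timestamp that counts as "recent", or None for "everything"."""
    if window == "everything":
        return None
    if window == "today":
        # Everything since local midnight of the current day.
        return now.replace(hour=0, minute=0, second=0, microsecond=0)
    return now - WINDOWS[window]


def split_by_recency(entries, window, now):
    """Split entries into (keep, remove).

    entries is a hypothetical mapping of (path, key) -> datetime the key was set.
    Entries at or after the cutoff are selected for removal.
    """
    cutoff = cutoff_for(window, now)
    keep, remove = {}, {}
    for ident, stamp in entries.items():
        target = remove if cutoff is None or stamp >= cutoff else keep
        target[ident] = stamp
    return keep, remove
```

A caller would show the warning for "everything" before acting on the `remove` set; the sketch only selects entries and deletes nothing.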
Comment 9 Alexander Larsson 2009-11-04 09:54:45 UTC
Why would you want to remove old metadata?

Say you tag a file as "important" or add a comment about it, or add an emblem. Why would you want to remove that after a few hours?
Comment 10 Carlos Garcia Campos 2009-11-04 09:55:16 UTC
*** Bug 325976 has been marked as a duplicate of this bug. ***
Comment 11 Alexander Larsson 2009-11-04 10:00:34 UTC
Oh, or do you mean to remove "new" metadata?
Comment 12 Josh Triplett 2009-11-04 19:46:46 UTC
(In reply to comment #11)
> Oh, or do you mean to remove "new" metadata?

Precisely.  Provide the ability to remove metadata based on recency, on the assumption that the metadata you want to get rid of got created recently.  As I said in comment 8: "last hour", "last two hours", "last four hours", "today", and "everything", with big warnings displayed if you choose "everything".

Seems like that needs integrating with other features like clearing the "recent documents" list into a common dialog.  After all, it would prove useful to clear recent documents based on recency as well, and it seems likely that you might want to do both of those at the same time.
Comment 13 Ignacio Casal Quinteiro (nacho) 2009-12-30 16:16:15 UTC
*** Bug 590541 has been marked as a duplicate of this bug. ***
Comment 14 Cosimo Cecchi 2011-09-13 04:22:22 UTC
*** Bug 529935 has been marked as a duplicate of this bug. ***
Comment 15 Kousu 2015-11-04 01:47:16 UTC
I just rediscovered this in the context of Bug 757452, and I would like to bump it. Since this bug was opened, the gnome-control-center Privacy panel has been invented, which tries to address the UI side of this. Unfortunately, it leaks data.

My example in #757452 is Evince: Evince remembers page positions and even window sizing, even after you Clear Recent History (i.e. clear the recent documents list), and the way it does this is via gvfs-metadata. I spent months idly trying to track this down, unnerved (though not panicked) at the privacy implications, always thinking I would find the saved state in a folder named "evince": ~/.local/share/evince or ~/.config/evince or ~/.cache/evince. ~/.local/share/gvfs-metadata was an uncomfortable surprise; what else is in these files? It seems that *almost anything* could be in these files, because they are used by different apps in different ways to store whatever they think is relevant. My sense is that this is the real source of the problem, and that until every app is changed it won't fully go away.

gvfs-info can demonstrate what gets attached:
$ gvfs-info a2.pdf 
display name: a2.pdf
edit name: a2.pdf
name: a2.pdf
type: regular
size:  236130
uri: file:///home/kousu/School/Assignments/A2/a2.pdf
attributes:
  standard::type: 1
  standard::name: a2.pdf
  standard::display-name: a2.pdf
  standard::edit-name: a2.pdf
  standard::copy-name: a2.pdf
  standard::icon: application-pdf, x-office-document
  standard::content-type: application/pdf
[...]
  owner::user: kousu
  owner::group: kousu
  metadata::evince::author: 
  metadata::evince::continuous: 1
  metadata::evince::dual-page: 0
  metadata::evince::dual-page-odd-left: 0
  metadata::evince::fullscreen: 0
  metadata::evince::inverted-colors: 0
  metadata::evince::page: 6
  metadata::evince::sidebar_page: thumbnails
  metadata::evince::sidebar_size: 192
  metadata::evince::sidebar_visibility: 1
  metadata::evince::sizing_mode: free
  metadata::evince::title: 
  metadata::evince::window_height: 1053
  metadata::evince::window_maximized: 1
  metadata::evince::window_width: 1920
  metadata::evince::window_x: 0
  metadata::evince::window_y: 27
  metadata::evince::zoom: 1.2194481573547493

You can edit this data with gvfs-set-attribute or erase it by setting each attribute to "" (something gvfs-set-attribute(1) doesn't document :/), but it's not clear to me that even erasing all attributes will remove the record of the filename from ~/.local/share/gvfs-metadata.
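
To script an inspection like this, one option is to parse the text that gvfs-info prints. The sketch below assumes the output layout shown in the transcript above (an "attributes:" line followed by indented "key: value" lines); the function name `parse_metadata_attributes` is hypothetical, and the text format is not a stable interface.

```python
import re

# An indented "key: value" line; the key itself may contain "::" but no spaces.
_ATTR_LINE = re.compile(r"^\s+(\S+): ?(.*)$")


def parse_metadata_attributes(info_output):
    """Return only the metadata::* attributes from gvfs-info style text output."""
    attrs = {}
    in_attributes = False
    for line in info_output.splitlines():
        if line.strip() == "attributes:":
            in_attributes = True
            continue
        if not in_attributes:
            continue
        match = _ATTR_LINE.match(line)
        if match and match.group(1).startswith("metadata::"):
            attrs[match.group(1)] = match.group(2)
    return attrs
```

Feeding it the captured output of `gvfs-info FILE` gives a dictionary of what applications have attached to that file, including keys whose value is empty.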

It seems like this is a finicky issue to solve, since Gnome apps have evolved to use gvfs-metadata as a generic side database, apart from the filesystem.

I would like to see a standard imposed on Gnome apps that distinguishes data you should expect to stick around (like Nautilus shortcuts or Evince annotations) from data that is handy but just leaks history if kept for too long, like Gedit per-file cursor positions (as in Bug 590541) or Evince active page numbers. App-specific metadata should go under an app-specific directory and be clearable on a per-app or per-date basis.

The XDG spec <http://standards.freedesktop.org/basedir-spec/basedir-spec-0.6.html> already implies some of this, but it only defines config, data, and cache directories. The old Bug 590541 mentions that Gedit used to use ~/.cache/gedit to store this leaky history data; I like using ~/.cache to store this useful but transient state information, because it means that covering your tracks is as simple as `rm -r ~/.cache/`.  But a cache is not clearly the same as a history log; caches are intuitively for big files that you don't want to recompute or redownload, not so much for transient state information, so maybe another directory is needed; I don't know.  Whatever happens, the current method of storing arbitrary data in custom database files under ~/.local/share/gvfs-metadata/*, some permanentish and some temporaryish, really bothers me because it is impossible to clean up.  Once you learn it's there (and really, does anyone know that?), you can either bluntly `rm ~/.local/share/gvfs-metadata/*`, or you can try to scan the filesystem and carefully script `gvfs-info` + `gvfs-set-attribute` to erase specific entries, somehow. What if, instead, metadata followed a standard of one metadata file per file under ~/.cache? Then you could clean up logs--in fact, any sort of meta-activity--younger than a certain point just by looking at filesystem timestamps.
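
As a rough sketch of that last idea, assuming a hypothetical layout with one metadata file per tracked file in a single directory (this is not how gvfs stores metadata today), cleanup by filesystem timestamp could look like the following. The function name `purge_newer_than` and the `dry_run` parameter are made up for illustration.

```python
import os


def purge_newer_than(directory, cutoff_epoch, dry_run=True):
    """Select regular files in directory modified at or after cutoff_epoch.

    With dry_run=True nothing is deleted; the selected paths are only returned.
    """
    selected = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime >= cutoff_epoch:
                selected.append(entry.path)
    if not dry_run:
        for path in selected:
            os.remove(path)
    return sorted(selected)
```

For "the last hour", the cutoff would be `time.time() - 3600`. Defaulting to a dry run keeps the sketch from deleting anything unless the caller explicitly asks for it.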

Should I report this to Evince? The gnome-control-center Privacy panel connotes that Gnome has an effective system-wide policy for privacy, and gvfs-metadata appears to be the substrate upon which that is built. Who has responsibility for this?
(And then there's zeitgeist (which I know isn't Gnome but is tightly interwoven with its libs and used by a few Gnome apps, like gnome-music, so is worth considering) and trackerd, both of which leave all sorts of logs around and are definitely not covered by the Privacy panel:
zeitgeist uses ~/.local/share/zeitgeist
trackerd uses both ~/.cache/tracker and ~/.local/share/tracker/)
Comment 16 Ondrej Holy 2015-11-05 09:07:17 UTC
(In reply to kousu+gnome from comment #15)
> I just rediscovered this in the context of Bug 757452, and I would like to
> bump it. Since this bug was opened the gnome-control-center Privacy panel
> has been invented which tries to address the UI side of this.
> Unfortunately, it leaks data.
> 
> My example in #757452 is Evince: Evince remembers page positions and even
> window sizing, even after you Clear Recent History (i.e. clear the recent
> documents list), and the way it does this is via gvfs-metadata. I spent
> months idly trying to track this down, unnerved (though not panicked) at the
> privacy implications, always thinking I would find the saved state in a
> folder named "evince": ~/.local/share/evince or ~/.config/evince or
> ~/.cache/evince. ~/.local/share/gvfs-metadata was an uncomfortable surprise;
> what else is in these files? It seems that *almost anything* could be in
> these files, because they are used by different apps in different ways to
> store whatever they think is relevant. My sense is that this is the real
> source of the problem, and that until every app is changed it won't fully go
> away.
> 
> gvfs-info can demonstrate what gets attached:
> $ gvfs-info a2.pdf 
> display name: a2.pdf
> edit name: a2.pdf
> name: a2.pdf
> type: regular
> size:  236130
> uri: file:///home/kousu/School/Assignments/A2/a2.pdf
> attributes:
>   standard::type: 1
>   standard::name: a2.pdf
>   standard::display-name: a2.pdf
>   standard::edit-name: a2.pdf
>   standard::copy-name: a2.pdf
>   standard::icon: application-pdf, x-office-document
>   standard::content-type: application/pdf
> [...]
>   owner::user: kousu
>   owner::group: kousu
>   metadata::evince::author: 
>   metadata::evince::continuous: 1
>   metadata::evince::dual-page: 0
>   metadata::evince::dual-page-odd-left: 0
>   metadata::evince::fullscreen: 0
>   metadata::evince::inverted-colors: 0
>   metadata::evince::page: 6
>   metadata::evince::sidebar_page: thumbnails
>   metadata::evince::sidebar_size: 192
>   metadata::evince::sidebar_visibility: 1
>   metadata::evince::sizing_mode: free
>   metadata::evince::title: 
>   metadata::evince::window_height: 1053
>   metadata::evince::window_maximized: 1
>   metadata::evince::window_width: 1920
>   metadata::evince::window_x: 0
>   metadata::evince::window_y: 27
>   metadata::evince::zoom: 1.2194481573547493

I truly do not understand why it is so problematic that, e.g., the last position in some file is stored somewhere. Guys, there aren't any passwords or other sensitive information... and the metadata database is in your home directory with your permissions...

The only real problem I see with the metadata database is that there might be metadata for files that no longer exist...

> You can edit this data with gvfs-set-attribute or erase it by setting each
> to "" ((something gvfs-set-attribute(1) doesn't document :/)), but it's not
> clear to me that even erasing all attributes will remove the record of the
> filename from ~/.local/share/gvfs-metadata.

You can't remove metadata this way. The key is still stored with an empty value. That's why it isn't documented.

$ gvfs-info -a "metadata::test" /
uri: file:///
attributes:
$ gvfs-set-attribute / metadata::test test
$ gvfs-info -a "metadata::test" /
uri: file:///
attributes:
  metadata::test: test
$ gvfs-set-attribute / metadata::test ""
$ gvfs-info -a "metadata::test" /
uri: file:///
attributes:
  metadata::test: 

AFAIK there isn't any public API to remove the metadata. There is just an internal API to purge metadata, which is used e.g. when you remove your file...

However, it shouldn't be much of a problem to implement this...

> It seems like this is a finicky issue to solve, since Gnome apps have
> evolved to use the gvfs-metadata as a generic, side, database, apart from
> the filesystem.
> 
> I would like to see a standard imposed on Gnome apps that distinguishes data
> you should expect to stick around (like Nautilus shortcuts or Evince
> annotations) from data that is handy but just leaks history if kept for too
> long, like Gedit per-file cursor positions (as in Bug 590541) or Evince
> active page numbers. App-specific metadata should go under an app-specific
> directory and be clearable on a per-app or per-date basis.

I'm afraid we can't explicitly say what should be permanent and what shouldn't. I personally expect that e.g. the position in a file should be persistent, because I use Evince as an ebook reader and refuse to look up the correct page again and again, but you can say that this is just a handy feature...

> The XDG spec
> <http://standards.freedesktop.org/basedir-spec/basedir-spec-0.6.html>
> already implies some of this, but it only defines config, data, and cache
> directories. The old Bug 590541 mentions that Gedit used to use
> ~/.cache/gedit to store this leaky history data; I like using ~/.cache to
> store this useful but transient state information, because it means that
> covering your tracks is as simple as `rm -r ~/.cache/`.  But a cache is not
> clearly the same as a history log; caches are intuitively for big files that
> you don't want to recompute or redownload, not so much for transient state
> information, so maybe another directory is needed; I don't know.  Whatever
> happens, the current method of storing arbitrary data under custom-database
> files ~/.local/share/gvfs-metadata/*, some permanentish and some
> temporaryish, really bothers me because it is impossible to clean up.  Once
> you learn it's there (and really, does anyone know that?), you can either
> bluntly `rm ~/.local/share/gvfs-metadata/*`, or you can try to scan the
> filesystem and carefully script `gvfs-info` + `gvfs-set-attribute` to erase
> specific issues, somehow. What if, instead, metadata followed a standard of
> one-metadata-file per-file under ~/.cache; then you could clean up logs--in
> fact, any sort of meta-activity--younger than a certain point just by
> looking at filesystem timestamps.

I'm afraid that using one metadata file per file would be slow and fragile. And it still doesn't answer the question of what should be permanent and what shouldn't...
 
> Should I report this to Evince? The gnome-control-center Privacy panel
> connotes that Gnome has an effective system-wide policy for privacy, and
> gvfs-metadata appears to be the substrate upon which that is built. Who has
> responsibility for this?

You can, but how should Evince fix this?

> (and then there's zeitgeist (which I know isn't Gnome but is tightly
> interwoven to its libs and used by a few gnome apps, like gnome-music, so is
> worth considering), and trackerd, both of which leave all sorts of logs
> around, and are definitely not covered by the privacy panel
> zeitgeist uses ~/.local/share/zeitgeist
> trackerd uses both ~/.cache/tracker and ~/.local/share/tracker/)
Comment 17 Hussam Al-Tayeb 2015-11-05 12:01:58 UTC
It is a matter of cleanliness for me. I don't want orphan metadata. Since gvfs won't detect files deleted outside gio, it doesn't always know a file has been deleted, and therefore its metadata is still there.


So what I would suggest is some use-at-your-own-risk command line tool that reads the metadata database and deletes orphan entries (entries pointing to nonexistent files, either deleted or remote-but-not-mounted-or-available-yet). I can just use it in a cron job. That way no functionality is sacrificed, but people who want to do extra cleaning can still do so.
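
The orphan check at the core of such a tool might be sketched as follows. It assumes the list of paths has already been extracted from the metadata database (the sketch does not read the gvfs-metadata files), and it only reports orphans rather than deleting anything. Following the concern raised earlier in the thread about unmounted media, it restricts itself to paths under roots the caller says are currently verifiable; `find_orphans` and `checked_roots` are hypothetical names.

```python
import os


def find_orphans(paths, checked_roots):
    """Return the paths under checked_roots that no longer exist on disk.

    Paths outside checked_roots (e.g. on unmounted media) are skipped,
    since their absence does not prove the file was deleted.
    """
    roots = [os.path.abspath(root) for root in checked_roots]
    orphans = []
    for path in paths:
        abs_path = os.path.abspath(path)
        verifiable = any(
            os.path.commonpath([abs_path, root]) == root for root in roots
        )
        if verifiable and not os.path.lexists(abs_path):
            orphans.append(path)
    return orphans
```

Passing only the home directory as a checked root gives the conservative behaviour; passing mount points as well would also flag entries for media that are merely unplugged.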
Comment 18 Hussam Al-Tayeb 2015-11-05 12:03:06 UTC
I think that is a good compromise?
Comment 19 Josh Triplett 2015-11-05 17:21:36 UTC
(In reply to Ondrej Holy from comment #16)
> Only real problem with metadata database what I see is that there might be
> metadata for already unexisting files...

Right, that seems like the largest issue: metadata for files that don't exist anymore.  Metadata for deleted/moved files needs timely pruning.

This has many of the same implications as the "recent" list, with less visibility.
Comment 20 Kousu 2015-11-06 03:08:12 UTC
I second this being like the "Recent" list. It's an annoying implicit "Recent" list with no reasonable way to summarize or control what it holds.

Personally, orphan metadata is a subset of what I am concerned about.  If someone can see your metadata they can probably see your data too, but the data alone doesn't tell them how you've been using it. It's the same issue as the NSA collecting phone metadata: metadata is at least as sensitive as data, maybe more. I'm arguing for Better Safe Than Sorry and the Principle of Least Privilege here.
Comment 21 Kousu 2015-11-06 03:09:30 UTC
I just ran a test and discovered that unencrypted gvfs-metadata persists even after a LUKS encrypted drive is detached. I have two files:
.local/share/gvfs-metadata/uuid-2f6864cf-f2ce-42f1-90ad-65fe3198aab9{,.log}
and they contain filenames and page lengths and whatever else arbitrary apps arbitrarily decided was relevant. Solving orphan metadata would solve this, but I wish this data wasn't hitting my main disk in the first place; perhaps, like DRIVE/.Trash-$UID, there can be DRIVE/.$UID/local/share/gvfs-metadata?
Comment 22 GNOME Infrastructure Team 2018-09-21 16:54:26 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gvfs/issues/117.