GNOME Bugzilla – Bug 700785
gvfs metadata corrupted by concurrent access
Last modified: 2013-06-14 09:38:01 UTC
Created attachment 244960 [details]
gdb.txt

Unsure whether the flaw here is in glib or gvfs; let's start here. See attached stack:
Blowing away ~/.local/share/gvfs-metadata makes the symptom go away, and the test passes.

Ok, to get a clean environment: log out, and from a VT as root, shut down gdm, ensure gvfsd-metadata is dead, then rm -rf ~walters/.local/share/gvfs-metadata, then start gdm again.

Running glib/file.test twice in a row triggers the corruption.
Ok, it's worth noting that glib/file.test creates a *lot* of threads.

From https://bugzilla.gnome.org/show_bug.cgi?id=637095#c9:

"One thing I find disturbing is no protection from concurrent access to the same journal file from multiple clients. Clients have the journal file partially mmaped - the data area but not the header, containing the number of records etc., which is updated separately."

Really =(

This trivially reproduces for me, by the way, in jhbuild on Fedora 19, *assuming* that I build gvfs in the jhbuild root.
Created attachment 245002 [details] [review]
0001-metadata-Downgrade-journal-short-read-to-debug.patch
Oh, is the g_warning() too intrusive for you? It was really meant as a warning, since we're breaking journal iteration and effectively losing data written after the corrupted entry (we don't know how much to seek; heuristics won't help here either, since there's no magic header to look for).

(In reply to comment #2)
> Ok, it's worth noting that glib/file.test creates a *lot* of threads. From
> https://bugzilla.gnome.org/show_bug.cgi?id=637095#c9
>
> "One thing I find disturbing is no protection from concurrent access to the
> same journal file from multiple clients. Clients have the journal file
> partially mmaped - the data area but not the header, containing the number of
> records etc., which is updated separately."

I should note that this was meant in the context of multiple systems sharing one physical storage; a "client" here means a machine, an isolated operating system. In a single environment, write access is controlled by a single daemon coordinating writes and database rotation. Unless you have multiple instances of gvfsd-metadata running (e.g. a chroot or a separate D-Bus session sharing the same homedir), you shouldn't run into the problem.

(In reply to comment #1)
> Running glib/file.test twice in a row triggers the corruption.

What was your setup exactly?

> This trivially reproduces for me, by the way, in jhbuild on Fedora 19,
> *assuming* that I build gvfs in the jhbuild root.

Not sure how jhbuild works; mixing various gvfs versions shouldn't matter in this case.
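For reference, a rough sketch of the shape of the check the patch touches - this is not the actual gvfs source; MIN_ENTRY_SIZE and the variable names are illustrative assumptions. The short-entry check still aborts iteration (later entries are lost), it just logs via g_debug() instead of g_warning():

```c
#include <glib.h>

#define MIN_ENTRY_SIZE 8  /* illustrative only, not the real minimum */

/* Backward iteration reads each entry's recorded size; if the size is
 * implausible we can only stop, losing whatever was written after that
 * point. The patch merely lowers the severity of the message. */
static gboolean
journal_entry_size_ok (guint32 entry_size, gsize remaining)
{
  if (entry_size < MIN_ENTRY_SIZE || (gsize) entry_size > remaining)
    {
      g_debug ("meta_journal_iterate: found short sized entry, "
               "possible journal corruption");
      return FALSE;  /* caller stops iterating */
    }
  return TRUE;
}
```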
(In reply to comment #4)
> In a single environment, write access is controlled by a single daemon
> coordinating writes and database rotation. Unless you have multiple instances
> of gvfsd-metadata running (e.g. a chroot or a separate D-Bus session sharing
> the same homedir), you shouldn't run into the problem.

But multiple processes are attempting to read from the journal while it is being concurrently written, and it's possible to hit that g_warning().

> (In reply to comment #1)
> > Running glib/file.test twice in a row triggers the corruption.
>
> What was your setup exactly?

This was inside the GnomeOSTree continuous integration/testing system:
https://live.gnome.org/OSTree/GnomeOSTree
which runs the glib installed tests (such as glib/file from glib/gio/tests/file.c). However, it reproduced inside jhbuild for me as well.

> Not sure how jhbuild works; mixing various gvfs versions shouldn't matter in
> this case.

Right. So... for now, I guess, let's accept the possibility of "short journal iteration". But we should really figure out a better scheme for concurrent read access.

http://sqlite.org/atomiccommit.html is pretty in-depth on how SQLite does multi-process read/write. The tradeoffs gvfs metadata is going for were somewhat different (mainly the emphasis on performance), but it's still probably worth using as a reference.
Comment on attachment 245002 [details] [review]
0001-metadata-Downgrade-journal-short-read-to-debug.patch

Leaving this bug open for further discussion.
(In reply to comment #5)
> But multiple processes are attempting to read from the journal while it is
> being concurrently written, and it's possible to hit that g_warning().

I see your point now. Looking at the code, it should be safe to have concurrent read access from multiple clients while the daemon is writing new data. The whole journal file is mmap'ed, including the header and data. When reading, clients iterate from the end of the journal back to the beginning, having retrieved the journal length just before that. When writing to the journal, the data are written (appended) first and the journal header is changed as the last thing. Since clients retrieve the journal length at the beginning, they don't see the newly written entry. When rotating, a new journal file is created and the old (current) one is unlinked; no more writes, so no corruption is possible from that side.

Perhaps it's caused by unflushed blocks at various locations in the file? We mmap with PROT_READ | PROT_WRITE and MAP_SHARED on an fd that is opened using just the standard O_RDONLY/O_RDWR flags. Perhaps O_DIRECT could help to stay uncached? Or an explicit flush on the fd? Performance will suffer either way.

> So... for now, I guess, let's accept the possibility of "short journal
> iteration". But we should really figure out a better scheme for concurrent
> read access.
>
> http://sqlite.org/atomiccommit.html is pretty in-depth on how SQLite does
> multi-process read/write. The tradeoffs gvfs metadata is going for were
> somewhat different (mainly the emphasis on performance), but it's still
> probably worth using as a reference.

Thanks, this is useful. I'll leave it to somebody else taking gvfs over :-)
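A minimal sketch of the read side described above, under an assumed simplified layout - this is not the real gvfs journal format; JournalHeader, journal_len, and the trailing per-entry size field are illustrative. The reader maps the whole file, samples the journal length from the header once, and then walks entries backwards:

```c
#include <glib.h>
#include <string.h>

typedef struct {
  guint32 magic;
  guint32 journal_len;   /* bytes of valid entry data after the header */
} JournalHeader;

static void
iterate_journal (const guchar *map, gsize map_size)
{
  const JournalHeader *header = (const JournalHeader *) map;
  const guchar *data = map + sizeof (JournalHeader);
  gsize pos;

  if (map_size < sizeof (JournalHeader))
    return;

  /* Sample the length once; entries appended after this point are simply
   * not seen by this iteration. */
  pos = header->journal_len;
  if (sizeof (JournalHeader) + pos > map_size)
    return;   /* header disagrees with the file size */

  while (pos >= sizeof (guint32))
    {
      guint32 entry_size;

      /* Each entry is assumed to end with its own size, which is what
       * allows walking from the end back to the beginning. */
      memcpy (&entry_size, data + pos - sizeof (guint32), sizeof (guint32));
      if (entry_size < sizeof (guint32) || entry_size > pos)
        break;   /* short/implausible entry: stop iterating */

      /* ... process the entry occupying data[pos - entry_size .. pos) ... */
      pos -= entry_size;
    }
}
```

As long as the entry bytes really are visible before the header length is, this reader never sees a partially written entry; that ordering assumption is exactly what the next comment questions.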
I don't see how the readers being "concurrent" should in any way affect this. Even a single reader will be concurrent with the single writer, but the ordering should be such that the reader doesn't see the update until everything is fully written. It seems like it's getting the new journal length before all the data is written, which looks like a memory barrier issue or something like that...
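To illustrate the ordering concern, here is a sketch of the write side under the same assumed names (journal_len, data) - not the actual gvfs writer code. With MAP_SHARED mappings, both processes see the same pages, so the writer needs a barrier between appending the entry bytes and publishing the new length; otherwise the compiler or CPU may make the header update visible first and a reader can observe the new length while the entry is still incomplete:

```c
#include <glib.h>
#include <string.h>

/* journal_len points into the mmap'ed header, data into the mmap'ed data
 * area; both names are illustrative, not the real gvfs layout. */
static void
append_entry (guint32      *journal_len,
              guchar       *data,
              const guchar *entry,
              guint32       entry_size)
{
  guint32 old_len = *journal_len;

  /* 1. Append the entry bytes to the data area. */
  memcpy (data + old_len, entry, entry_size);

  /* 2. Full barrier: the entry bytes must be visible to other processes
   *    mapping the same pages before the new length is. */
  __sync_synchronize ();

  /* 3. Only now publish the new length in the header. */
  *journal_len = old_len + entry_size;
}
```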
In fact, I can reproduce this just doing:

$ gvfs-set-attribute file.txt metadata::foo test5
$ gvfs-set-attribute file.txt metadata::foo test6
$ gvfs-set-attribute file.txt metadata::foo test7

(gvfs-set-attribute:13983): GVFS-WARNING **: meta_journal_iterate: found short sized entry, possible journal corruption

Seems like we have some serious issue with metadata writing.
Fixed this, but accidentally attached the patch to:
https://bugzilla.gnome.org/show_bug.cgi?id=637095#c27