Bug 598561 – gvfsd-metadata crashes on start

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 598561 - gvfsd-metadata crashes on start


Summary:	gvfsd-metadata crashes on start


Status:	RESOLVED OBSOLETE

Product:	gvfs
Classification:	Core
Component:	metadata
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gvfs-maint
QA Contact:	gvfs-maint

URL:
Whiteboard:

Duplicates:	590060 601504 603645 609594 643268 741216 (view as bug list)
Depends on:
Blocks:

Reported:	2009-10-15 14:13 UTC by Bastien Nocera
Modified:	2018-09-21 16:57 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
home.tar.gz (1.18 KB, application/x-compressed-tar) 2009-10-15 14:55 UTC, Bastien Nocera		Details
Check if MetaTree exists before flushing it (966 bytes, patch) 2009-11-04 14:15 UTC, Christian Kellner	none	Details \| Review
Nuke metadata file if magic blob is wrong (875 bytes, patch) 2009-11-04 15:43 UTC, Christian Kellner	none	Details \| Review
metadata: Handle failure if files are not readable (10.23 KB, patch) 2014-12-08 08:26 UTC, Ross Lagerwall	reviewed	Details \| Review
metadata: don't crash if meta_tree_init fails (draft) (14.50 KB, patch) 2014-12-16 14:03 UTC, Ondrej Holy	reviewed	Details \| Review
metadata: don't crash if meta_tree_init fails (14.72 KB, patch) 2014-12-17 14:02 UTC, Ross Lagerwall	none	Details \| Review
metadata: don't crash if meta_tree_init fails (14.88 KB, patch) 2014-12-18 12:33 UTC, Ondrej Holy	reviewed	Details \| Review
client: Strip remote dbus error when setting metadata fails (1.10 KB, patch) 2014-12-20 05:19 UTC, Ross Lagerwall	committed	Details \| Review
metadata: don't crash if meta_tree_init fails (14.88 KB, patch) 2015-01-05 11:40 UTC, Ondrej Holy	none	Details \| Review
metadata: don't crash if meta_tree_init fails (14.90 KB, patch) 2015-01-05 15:46 UTC, Ondrej Holy	committed	Details \| Review
metadata: don't shedule flush if operation failed (3.75 KB, patch) 2015-02-10 08:49 UTC, Ondrej Holy	reviewed	Details \| Review
metadata: return if meta_tree_refresh failed (985 bytes, patch) 2015-02-24 13:59 UTC, Ondrej Holy	committed	Details \| Review
metadata: don't shedule flush if operation failed (2.57 KB, patch) 2015-02-24 14:03 UTC, Ondrej Holy	needs-work	Details \| Review
metadata: don't crash if write out failed (3.45 KB, patch) 2015-03-10 13:15 UTC, Ondrej Holy	none	Details \| Review
metadata: don't crash if write out failed (2.83 KB, patch) 2015-03-10 13:22 UTC, Ondrej Holy	committed	Details \| Review

Description Bastien Nocera 2009-10-15 14:13:21 UTC

Crashes about 30 seconds after startup.

gvfs-1.4.0-3.fc12.x86_64

Program terminated with signal 11, Segmentation fault.

+ Trace 218336

#0 verify_metadata_block
at metatree.c line 223
#0 verify_metadata_block
at metatree.c line 223
#1 copy_tree_to_builder
at metatree.c line 2183
#2 meta_tree_flush_locked
at metatree.c line 2344
#3 meta_tree_flush
at metatree.c line 2365
#4 writeout_timeout
at meta-daemon.c line 61
#5 g_timeout_dispatch
at gmain.c line 3396
#6 g_main_dispatch
at gmain.c line 1960
#7 IA__g_main_context_dispatch
at gmain.c line 2513
#8 g_main_context_iterate
at gmain.c line 2591
#9 IA__g_main_loop_run
at gmain.c line 2799
#10 main
at meta-daemon.c line 783

Comment 1 Bastien Nocera 2009-10-15 14:16:11 UTC

And running under valgrind:
==3488== Invalid read of size 4
==3488==    at 0x405736: copy_tree_to_builder (metatree.c:223)
==3488==    by 0x40616F: meta_tree_flush_locked (metatree.c:2344)
==3488==    by 0x406397: meta_tree_flush (metatree.c:2365)
==3488==    by 0x403AAC: writeout_timeout (meta-daemon.c:61)
==3488==    by 0x34CC43993A: g_timeout_dispatch (gmain.c:3396)
==3488==    by 0x34CC43922D: g_main_context_dispatch (gmain.c:1960)
==3488==    by 0x34CC43CC17: g_main_context_iterate (gmain.c:2591)
==3488==    by 0x34CC43D064: g_main_loop_run (gmain.c:2799)
==3488==    by 0x4049DC: main (meta-daemon.c:783)
==3488==  Address 0x8 is not stack'd, malloc'd or (recently) free'd
==3488== 
==3488== 
==3488== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==3488==  Access not within mapped region at address 0x8
==3488==    at 0x405736: copy_tree_to_builder (metatree.c:223)
==3488==    by 0x40616F: meta_tree_flush_locked (metatree.c:2344)
==3488==    by 0x406397: meta_tree_flush (metatree.c:2365)
==3488==    by 0x403AAC: writeout_timeout (meta-daemon.c:61)
==3488==    by 0x34CC43993A: g_timeout_dispatch (gmain.c:3396)
==3488==    by 0x34CC43922D: g_main_context_dispatch (gmain.c:1960)
==3488==    by 0x34CC43CC17: g_main_context_iterate (gmain.c:2591)
==3488==    by 0x34CC43D064: g_main_loop_run (gmain.c:2799)
==3488==    by 0x4049DC: main (meta-daemon.c:783)

Comment 2 Bastien Nocera 2009-10-15 14:55:35 UTC

Created attachment 145523 [details]
home.tar.gz

The files that caused the crash.

Seems to work fine when those files are removed.

Comment 3 Christian Kellner 2009-10-21 08:55:42 UTC

*** Bug 590060 has been marked as a duplicate of this bug. ***

Comment 4 Tomas Bzatek 2009-10-21 15:59:35 UTC

To reproduce the issue with attached file:
 1. create new user 'hadess' and copy the files to ~/.local/share/gvfs-metadata
 2. open Nautilus, choose some file in your home directory, set an emblem
 3. wait #define WRITEOUT_TIMEOUT_SECS 60 seconds, crash

The `home` file is missing #define MAGIC "\xda\x1ameta" at the beginning, file is malformed. That could have happened in two cases - filesystem error (e.g. xfs zeroing opened files) or invalid write from the daemon. The latter needs more investigation and we really need good repro steps leading to malformed file.

Either way, we need to fix the daemon to drop the malformed data (users won't be happy!) and start with a clean file.

Comment 5 Christian Kellner 2009-10-22 09:38:41 UTC

As long as Alex is away, I will try to come up with a fix for at least the first part of the problem, namely that we don't detect corrupted metadata files at all.

Comment 6 Alexander Larsson 2009-11-02 13:01:16 UTC

Wth happened to that file?
The "meta" magic and version info is not at the top of the file but almost at the end, and all strings in the file seem partial. There is no table listing all the keys in use in the file (which is generally at the start of the file after the header). The strings look weird too, parts of full pathnames where generally only relative pathnames are used in these files (unless as stored values by some app), nor are they stored in blocks related to each directory as they normally are.

No app maps the "home" file writable and its only written by gvfsd-metadata, and then quite carefully with fsyncs and whatnot. I have no idea what mangled this file. Did you have a system crash at some time before this?

Comment 7 Bastien Nocera 2009-11-02 13:17:22 UTC

I don't know what happened to that file, and yes, there was a crash before the problem happened. If gvfsd-metadata could just detect that the file is damaged and ignore it, it would be a good start...

Note that I saw something similar in another folder, without a system crash being necessary.

Comment 8 Christian Kellner 2009-11-04 14:15:57 UTC

Created attachment 146914 [details] [review]
Check if MetaTree exists before flushing it

Also change meta_tree_clear () so it sets fd to -1; this will make
meta_tree_exists () work as expected. Bug 598561.

Comment 9 Christian Kellner 2009-11-04 14:29:09 UTC

The attached patch will fix the crasher but will leave the render the specific metatree unusable (since we have neither file nor journal to write to).

I guess we need a meta_tree_reset () or maybe meta_tree_recover () functions that will just truncate the file in case the magic header is broken. Since the data most likely is already lost at that point it shouldn't matter.

Comment 10 Christian Kellner 2009-11-04 15:43:45 UTC

Created attachment 146922 [details] [review]
Nuke metadata file if magic blob is wrong

This is more a hack/proof of concept. It nukes a metadata file if the magic blob is wrong. Works ok here.

Comment 11 Alexander Larsson 2009-11-05 08:36:32 UTC

I'm not sure nuking it is the best approach, maybe we can rename it, adding ".error" to the name or something.

Comment 12 Alexander Larsson 2009-11-19 09:58:57 UTC

Interesting, the "home" file does have a fully correct, but empty of data, metadata file starting at offset 0x1121 until the end of the file. And the journal tag it refers to is the correct one of the attached journal file, which is also a correct empty journal file.

Comment 13 Alexander Larsson 2009-11-19 11:02:37 UTC

I believe commit f45b677201abb6b8471fa5bc935afda420c28c39 fixes the reason for the bugged file. However, we should get the robust handling of buggy files in anyway.

Comment 14 Alexander Larsson 2009-11-24 13:12:44 UTC

*** Bug 601504 has been marked as a duplicate of this bug. ***

Comment 15 Christian Kellner 2009-12-11 13:02:49 UTC

*** Bug 603645 has been marked as a duplicate of this bug. ***

Comment 16 Christian Kellner 2010-02-14 21:51:15 UTC

*** Bug 609594 has been marked as a duplicate of this bug. ***

Comment 17 Christian Kellner 2010-02-14 21:51:46 UTC

Stacktrace from #609594

(gdb) bt full

+ Trace 220570

#0 copy_tree_to_builder
at metatree.c line 2183
#1 meta_tree_flush_locked
at metatree.c line 2344
#2 meta_tree_flush
at metatree.c line 2365
#3 writeout_timeout
at meta-daemon.c line 61
#4 g_timeout_dispatch
at /build/buildd/glib2.0-2.23.1/glib/gmain.c line 3396
#5 g_main_dispatch
at /build/buildd/glib2.0-2.23.1/glib/gmain.c line 1960
#6 IA__g_main_context_dispatch
at /build/buildd/glib2.0-2.23.1/glib/gmain.c line 2513
#7 g_main_context_iterate
at /build/buildd/glib2.0-2.23.1/glib/gmain.c line 2591
#8 IA__g_main_loop_run
at /build/buildd/glib2.0-2.23.1/glib/gmain.c line 2799
#9 main
at meta-daemon.c line 783

Comment 18 Martin Pitt 2011-05-25 08:50:49 UTC

I stumbled over this as well, in the report that I've got (https://launchpad.net/bugs/744836) it happens when starting the session with AT and orca.

Bug 643268 seems to be the same issue (tree->root == NULL), I'll dupe this. It has a patch, but I'm afraid I know too little about gvfs-metadata to say whether it's correct.

Comment 19 Martin Pitt 2011-05-25 08:51:09 UTC

*** Bug 643268 has been marked as a duplicate of this bug. ***

Comment 20 Ondrej Holy 2014-12-08 08:23:35 UTC

*** Bug 741216 has been marked as a duplicate of this bug. ***

Comment 21 Ross Lagerwall 2014-12-08 08:26:10 UTC

Created attachment 292287 [details] [review]
metadata: Handle failure if files are not readable

If the gvfs-metadata directory is not readable, handle the
failure by returning NULL if meta_tree_open fails.
Note that failing to open a tree for reading is not an
error, but failing to open for writing is an error.

Fix up several callers to handle a NULL tree without
crashing.

Comment 22 Ross Lagerwall 2014-12-08 08:27:23 UTC

The attach patch fixes the issue for me, tested by making gvfs-metadata unreadable and trying various combinations of gvfs-ls gvfs-info and gvfs-set-attribute on local and remote files.

Comment 23 Ondrej Holy 2014-12-08 08:34:05 UTC

Thanks Ross, will look at the patch in next days (not feeling well today) and compare it with my draft how to fix it...

Comment 24 Ross Lagerwall 2014-12-08 08:48:53 UTC

Bug 741216 has easy steps to reproduce the crash.

Ok I made this patch before I saw you had a draft patch... anyway, get better soon :-)

Comment 25 Ondrej Holy 2014-12-16 14:01:13 UTC

Review of attachment 292287 [details] [review]:

It looks good. Most of lines we fix in same way.

::: client/gdaemonfile.c
@@ +2658,3 @@
+      g_set_error (error, G_IO_ERROR,
+                   G_IO_ERROR_FAILED,
+                   _("Error setting file metadata"));

Maybe would be good to specify the error (also in other places), something like:
_("Error setting file metadata: %s\n"),
_("can't open metadata file")

::: client/gdaemonvfs.c
@@ +1346,3 @@
+                  g_free (tree_path);
+                }
+            }

I hope there are changes only in indentation... unfortunately there isn't probably any nice way how to fix it without changing to many lines.

::: metadata/metatree.c
@@ +395,3 @@
 	}
       tree->fd = -1;
+      return !tree->for_write;

I'm not really sure if it is ok, I need to do more investigation... but makes sense.

@@ +479,3 @@
+  if (!meta_tree_init (tree))
+    {
+      g_free (tree);

meta_tree_unref (tree)?

@@ +480,3 @@
+    {
+      g_free (tree);
+      return NULL;

Why not use tree = NULL; to do it same as in other parts of the code...

@@ +618,3 @@
 	meta_tree_clear (tree);
+
+      /* XXX what happens if this fails? */

I think we can handle it, see my patch...

Comment 26 Ondrej Holy 2014-12-16 14:03:07 UTC

Created attachment 292831 [details] [review]
metadata: don't crash if meta_tree_init fails (draft)

Also I have some debug outputs in meta_tree_init...

Comment 27 Ross Lagerwall 2014-12-17 09:44:31 UTC

Review of attachment 292287 [details] [review]:

::: client/gdaemonfile.c
@@ +2658,3 @@
+      g_set_error (error, G_IO_ERROR,
+                   G_IO_ERROR_FAILED,
+                   _("Error setting file metadata"));

OK

::: client/gdaemonvfs.c
@@ +1346,3 @@
+                  g_free (tree_path);
+                }
+            }

Yeah, it is just a change in indentation.

::: metadata/metatree.c
@@ +479,3 @@
+  if (!meta_tree_init (tree))
+    {
+      g_free (tree);

OK

@@ +480,3 @@
+    {
+      g_free (tree);
+      return NULL;

OK

@@ +618,3 @@
 	meta_tree_clear (tree);
+
+      /* XXX what happens if this fails? */

OK...

Comment 28 Ross Lagerwall 2014-12-17 10:17:55 UTC

Review of attachment 292831 [details] [review]:

As you said, this is pretty similar to the patch I submitted.

However, it is missing the "return !tree->for_write" line which makes setting metadata fail if no metadata tree file exists.
E.g. try rm ~/.local/share/gvfs-metadata/*; gvfs-set-attribute ...

The following applied on top of your patch should fix it:
diff --git a/metadata/metatree.c b/metadata/metatree.c
index 5c320f6..af72c1e 100644
--- a/metadata/metatree.c
+++ b/metadata/metatree.c
@@ -376,9 +376,6 @@ meta_tree_init (MetaTree *tree)
   fd = safe_open (tree, tree->filename, O_RDONLY);
   if (fd == -1)
     {
-      errsv = errno;
-      g_warning ("can't init metadata tree %s: %s", tree->filename, g_strerror (errsv));
-
       if (tree->for_write && !retried)
 	{
 	  MetaBuilder *builder;
@@ -397,9 +394,14 @@ meta_tree_init (MetaTree *tree)
 	    }
 	  meta_builder_free (builder);
 	}
+      else if (tree->for_write)
+        {
+          errsv = errno;
+          g_warning ("can't init metadata tree %s: %s", tree->filename, g_strerror (errsv));
+        }
       tree->fd = -1;
 
-      return FALSE;
+      return !tree->for_write;
     }
 
   if (fstat (fd, &statbuf) != 0)

::: metadata/metatree.c
@@ +415,2 @@
     {
+      g_warning ("can't init metadata tree %s: %s", tree->filename, "wrong size");

This would be simpler like:
g_warning ("can't init metadata tree %s: wrong size", tree->filename);

And same for the others below.

Comment 29 Ondrej Holy 2014-12-17 13:21:30 UTC

(In reply to comment #28)
> Review of attachment 292831 [details] [review]:
> 
> As you said, this is pretty similar to the patch I submitted.
> 
> However, it is missing the "return !tree->for_write" line which makes setting
> metadata fail if no metadata tree file exists.
> E.g. try rm ~/.local/share/gvfs-metadata/*; gvfs-set-attribute ...

Correct... :-) could you merge those patches and add comment, why we are returning !tree->for_write in this case?

Maybe this patch could help us to find root cause, why it is crashing so frequently:
https://retrace.fedoraproject.org/faf2/problems/438841/

Also maybe good time to resurrect attachment 146922 [details] [review].

Comment 30 Ross Lagerwall 2014-12-17 14:02:32 UTC

Created attachment 292901 [details] [review]
metadata: don't crash if meta_tree_init fails

It fails e.g. when gvfs-metadata dir isn't readable.

Comment 31 Ondrej Holy 2014-12-18 12:33:25 UTC

Created attachment 292977 [details] [review]
metadata: don't crash if meta_tree_init fails

Thanks. I've been playing with metadata some time. It seems all meta_tree_init callers are handled properly. I've simplify those warnings also as you suggested and remove some whitespace. So I think it is ready for committing and we will see what happens, because the number of crashes wouldn't be due to wrong permissions...

Comment 32 Ross Lagerwall 2014-12-18 15:13:55 UTC

Review of attachment 292977 [details] [review]:

Looks good other than a minor comment

::: metadata/metatree.c
@@ +399,3 @@
+      else if (tree->for_write || errsv != ENOENT)
+        {
+          g_warning ("can't init metadata tree %s: %s", tree->filename, g_strerror (errsv));

Perhaps the message could give some indication of what failed (e.g. open) to differentiate from the other messages.

@@ +411,3 @@
+    {
+      errsv = errno;
+      g_warning ("can't init metadata tree %s: %s", tree->filename, g_strerror (errsv));

Perhaps the message could give some indication of what failed (e.g. fstat) to differentiate from the other messages.

@@ +429,3 @@
     {
+      errsv = errno;
+      g_warning ("can't init metadata tree %s: %s", tree->filename, g_strerror (errsv));

Perhaps the message could give some indication of what failed (e.g. mmap) to differentiate from the other messages.

Comment 33 Ross Lagerwall 2014-12-20 05:11:57 UTC

Review of attachment 292977 [details] [review]:

::: client/gdaemonfile.c
@@ +2657,3 @@
+    {
+      g_set_error (error, G_IO_ERROR, G_IO_ERROR_FAILED,
+                   _("Error setting file metadata: %s\n"),

Error messages like this shouldn't end with newlines...

::: client/gdaemonvfs.c
@@ +1284,3 @@
+            {
+              g_set_error (error, G_IO_ERROR, G_IO_ERROR_FAILED,
+                           _("Error setting file metadata: %s\n"),

Error messages like this shouldn't end with new lines

@@ +1286,3 @@
+                           _("Error setting file metadata: %s\n"),
+                           _("can't open metadata tree"));
+              res = FALSE;

For consistency with the rest of the function, add:
error = NULL; /* Don't set further errors */

Comment 34 Ross Lagerwall 2014-12-20 05:19:34 UTC

Created attachment 293120 [details] [review]
client: Strip remote dbus error when setting metadata fails

Without this, if the function fails, the error presented to the user looks
like:

GDBus.Error:org.gtk.GDBus.UnmappedGError.Quark._g_2dio_2derror_2dquark.Code1:
Can't find metadata file /home/ross/.local/share/gvfs-metadata/home

Comment 35 Tomas Bzatek 2014-12-22 15:04:39 UTC

Review of attachment 293120 [details] [review]:

Looks good.

Comment 36 Ondrej Holy 2015-01-05 11:40:40 UTC

Created attachment 293803 [details] [review]
metadata: don't crash if meta_tree_init fails

Thanks for the review! Sorry for the delay, I was sick and then on vacation. There is fixed patch. Let's go in...

Comment 37 Ross Lagerwall 2015-01-05 13:51:32 UTC

Did you see comment 33?

Comment 38 Ondrej Holy 2015-01-05 15:13:22 UTC

I saw it, what's wrong? I've double checked it now. Your notes should be addressed in the attachment 293803 [details] [review].

Comment 39 Ross Lagerwall 2015-01-05 15:21:18 UTC

Sorry, I meant comment 32, just to differentiate between a few of the error messages (open, fstat and mmap).

Comment 40 Ondrej Holy 2015-01-05 15:46:01 UTC

Created attachment 293831 [details] [review]
metadata: don't crash if meta_tree_init fails

Ah, sorry, I saw it before, but I've forgotten on it now. That's good idea! There is modified version, hopefully last one :-D

Comment 41 Ross Lagerwall 2015-01-05 18:24:45 UTC

Review of attachment 293831 [details] [review]:

Looks good!

Comment 42 Ondrej Holy 2015-01-06 09:41:54 UTC

Comment on attachment 293831 [details] [review]
metadata: don't crash if meta_tree_init fails

commit a03635ee68f043b68c1bb4cb95114af3ecb65ec4

Comment 43 Ondrej Holy 2015-01-06 09:42:11 UTC

Comment on attachment 293120 [details] [review]
client: Strip remote dbus error when setting metadata fails

commit 754b8750abbd01ae8a429b5f75d5e9f9622ab010

Comment 44 Ondrej Holy 2015-01-06 10:08:59 UTC

Also pushed for gnome-3-14, gnome-3-12, gnome-3-10. Hopefully we will reduce the enormous number of crashes:
https://retrace.fedoraproject.org/faf2/problems/537424/

Thanks Ross for cooperation!

Comment 45 Ondrej Holy 2015-02-10 08:47:28 UTC

Unfortunately this bug isn't still fixed, there are new downstream bugreports (same as before):
https://bugzilla.redhat.com/show_bug.cgi?id=1187492

I made some elaboration on the backtrace. Full bactrace is in the mentioned downstream report. There you can see:

+ Trace 234637

#1 copy_tree_to_builder
at metatree.c line 2144


The dirent is NULL, so tree->root is NULL. I suspect that this could happen only if meta_tree_init failed (but I would really like to know why).

So if e.g. meta-daemon.c:handle_set() is executed and meta_tree_set_string failed, we still execute tree_info_schedule_writeout. If meta_tree_set_string failed due to meta_tree_init (e.g. in meta_tree_refresh), the tree->root is NULL. So there is crash if writeout_timeout is executed. I don't know about different case, when tree->root could be NULL, do you?

Unfortunately I don't have idea, how to reproduce the mentioned case. Playing with permissions doesn't work as I intended, because meta_builder_write has to success and consequential meta_tree_init in meta_tree_refresh has to fail...

So I think we improved the behavioral by the previous patch, because handle_set() should return error in this case, however did not fix the crashing.

Comment 46 Ondrej Holy 2015-02-10 08:49:47 UTC

Created attachment 296445 [details] [review]
metadata: don't shedule flush if operation failed

There is proposed solution which should remove invalid trees from tree_infos cache and don't allow tree_info_schedule_writeout to be executed.

Comment 47 Ondrej Holy 2015-02-13 10:01:59 UTC

Also different metadata crasher appeared with lot of hits:
https://retrace.fedoraproject.org/faf2/reports/483385/

Unfortunately there isn't much info, but we don't break it (either fix) by the previous patch according package versions...

Comment 48 Ondrej Holy 2015-02-18 12:51:25 UTC

Review of attachment 296445 [details] [review]:

I've been playing with metadata daemon a bit. The crash with mentioned backtrace can be "reproduced", when I manually inject error in meta_tree_init (to cause what I mentioned before in comment #45). The provided patch fixes the crash, however meta_tree_exists doesn't work as intended, because meta_tree_clear sets fd = 0. So, currently if meta_tree_init fails once, metadata daemon doesn't try to read it again (because there is invalid tree in tree_infos cache) and just returns error. I intended to remove it from the cache and try it again next time initially. But hard to say what is better behavioral. It would be really useful to know reason, why it is crashing. If it is just fails when opening tree file due to e.g. lack of memory, or if there are some corrupted files. If there is corrupted file, meta_tree_init never succeeds, but it can succeed in other cases...

However in all cases we should to fix at least the consequences to not crash the daemon. 

We have recently added warnings in meta_tree_init, so if it is crashing for somebody, he can provide the output to determine reason of crashes. Isn't possible to get those warnings from some logs? Or is only way how to get them to run the daemon manually? I can't find them anywhere unfortunately...

Comment 49 Ondrej Holy 2015-02-20 08:46:30 UTC

(In reply to Ondrej Holy from comment #48)
> Review of attachment 296445 [details] [review] [review]:
> (snip)
> We have recently added warnings in meta_tree_init, so if it is crashing for
> somebody, he can provide the output to determine reason of crashes. Isn't
> possible to get those warnings from some logs? Or is only way how to get
> them to run the daemon manually? I can't find them anywhere unfortunately...

I answer myself, e.g.:
journalctl _COMM=gnome-session | grep gvfsd-metadata > gvfsd-metadata.log

Comment 50 Ondrej Holy 2015-02-24 13:59:33 UTC

Created attachment 297762 [details] [review]
metadata: return if meta_tree_refresh failed

There is fix for bug which was introduced by the attachment 293831 [details] [review].

G_UNLOCK is called twice if meta_tree_refresh fails and also concurrent access is possible on cached_trees. However this isn't problem currently, because meta_tree_lookup_by_name is always called with for_write=FALSE (except for meta-set testing utility), so this if statement is never issued

Comment 51 Ondrej Holy 2015-02-24 14:03:37 UTC

Created attachment 297763 [details] [review]
metadata: don't shedule flush if operation failed

There is simplified version of the fix, which requires attachment 146914 [details] [review] to be committed also...

Comment 52 Alexander Larsson 2015-03-02 15:08:13 UTC

Review of attachment 297762 [details] [review]:

Looks good to me.

Comment 53 Alexander Larsson 2015-03-02 16:04:34 UTC

Maybe we're really seeing a use-after-free situation rather than init failing? meta_tree_unref() clears the MetaTree on last unref.

Comment 54 Alexander Larsson 2015-03-02 17:06:25 UTC

Review of attachment 297763 [details] [review]:

Can't we just use meta_tree_exists() in writeout_timeout() instead?

Although we'd have to change meta_tree_clear() to set fd to -1, not 0. This seems better in general, as there seems to be a risk that we close fd 0 if meta_tree_clear() is called twice.

It would be nice to get some debug info back from users though, as this bug seems to indicate we're flushing a tree and creating an invalid file...

::: metadata/meta-daemon.c
@@ +125,3 @@
   if (info)
+    {
+      if (meta_tree_refresh (info->tree))

I'm not super fond of this. The daemon is the only thing that triggers the need for a refresh (when flushing), so this kind of "see if we need to refresh" call in the daemon is kind of weird.

We should be able to have full knowledge of when the tree goes bad.

@@ +214,3 @@
   else
     {
+      tree_info_schedule_writeout (info);

This is not enough, is it?
We may succeed in a write and schedule a timeout, but then a later set() call fails and makes the tree invalid.

::: metadata/metatree.c
@@ +644,3 @@
 }
 
+/* The tree is uninicialized if FALSE is returned! */

uninitialized

Comment 55 Matthias Clasen 2015-03-02 18:15:53 UTC

putting crashes on the blocker list

Comment 56 Ondrej Holy 2015-03-03 13:35:11 UTC

Thanks for looking at this...

(In reply to Alexander Larsson from comment #53)
> Maybe we're really seeing a use-after-free situation rather than init
> failing? meta_tree_unref() clears the MetaTree on last unref.

That's another option, but I don't see where it could happen. Metadata daemon has own cache of trees. meta_tree_unref is called only in tree_info_free and tree_info_free is used only as argument of g_hash_table. It seems hash table isn't never destroyed, so only case when tree_info_free can be called is when g_hash_table_insert is called with already used filename. g_hash_table_insert is used only in tree_info_lookup, but tree_info_lookup is called from handle_ callbacks, which should be executed on with one context, so it can't be called concurrently... am I right?

Comment 57 Ondrej Holy 2015-03-03 13:41:26 UTC

(In reply to Alexander Larsson from comment #54)
> Review of attachment 297763 [details] [review] [review]:
> 
> Can't we just use meta_tree_exists() in writeout_timeout() instead?
>
> Although we'd have to change meta_tree_clear() to set fd to -1, not 0. This
> seems better in general, as there seems to be a risk that we close fd 0 if
> meta_tree_clear() is called twice.

As I wrote before this patch requires attachment 146914 [details] [review] [review], which does those changes and I think there isn't difference whether we use meta_tree_exists in meta_tree_flush_locked, or in writeout_timeout...

But I don't think it is enough. We shouldn't return tree info with invalid tree from tree_info_lookup.

> It would be nice to get some debug info back from users though, as this bug
> seems to indicate we're flushing a tree and creating an invalid file...

It would be nice, but there is almost anything what we know about:
https://bugzilla.redhat.com/show_bug.cgi?id=699109

There is over 70 users in CC, however nothing useful, and people doesn't respond - there are unanswered questions from Tomas and me...

> ::: metadata/meta-daemon.c
> @@ +125,3 @@
>    if (info)
> +    {
> +      if (meta_tree_refresh (info->tree))
> 
> I'm not super fond of this. The daemon is the only thing that triggers the
> need for a refresh (when flushing), so this kind of "see if we need to
> refresh" call in the daemon is kind of weird.
> 
> We should be able to have full knowledge of when the tree goes bad.

So what to do when e.g. meta_tree_flush failed due to meta_tree_init and we have uninitialized tree in tree_infos cache?

> @@ +214,3 @@
>    else
>      {
> +      tree_info_schedule_writeout (info);
> 
> This is not enough, is it?
> We may succeed in a write and schedule a timeout, but then a later set()
> call fails and makes the tree invalid.

This isn't enough, but doesn't make sense to schedule timeout if there is error...

Comment 58 Ondrej Holy 2015-03-03 13:43:46 UTC

(In reply to Matthias Clasen from comment #55)
> putting crashes on the blocker list

Not sure it is real blocker since this bug is there for several releases (and years!)... We don't have reproducer and people don't responding on downstream bugzilla, probably because it occurs randomly and rarely.

Comment 59 Matthias Clasen 2015-03-03 14:34:44 UTC

(In reply to Ondrej Holy from comment #58)
> (In reply to Matthias Clasen from comment #55)
> > putting crashes on the blocker list
> 
> Not sure it is real blocker since this bug is there for several releases
> (and years!)... We don't have reproducer and people don't responding on
> downstream bugzilla, probably because it occurs randomly and rarely.

Fair enough. I just saw it swim by in my bug stream and picked it up. Let me drop it again.

Comment 60 Ondrej Holy 2015-03-04 13:24:37 UTC

Comment on attachment 297762 [details] [review]
metadata: return if meta_tree_refresh failed

commit 56326e463a56afa0e632baad4c94faa7aa154486

Comment 61 Alexander Larsson 2015-03-09 10:06:47 UTC

>> I'm not super fond of this. The daemon is the only thing that triggers the
>> need for a refresh (when flushing), so this kind of "see if we need to
>> refresh" call in the daemon is kind of weird.
>> 
>> We should be able to have full knowledge of when the tree goes bad.
>
> So what to do when e.g. meta_tree_flush failed due to meta_tree_init and we 
> have uninitialized tree in tree_infos cache?

This is the million dollar question is it not. When we failed to write out an updated tree last time we lost all the data in the previous tree. Now we crash when flushing out the new journal entries because of the NULL tree->root.

Can't we just skip the copy_tree_to_builder() call if root is NULL and accept the fact that we lost the data. Delaying the writeout schedule is not going to bring back the data, but delaying it forever will lose the *new* data we've gotten since then.

Comment 62 Ondrej Holy 2015-03-09 12:14:26 UTC

Patch to always clear fd to -1 was already pushed as:
commit 26c3dc3986760483625262a4a3564f1696f829b3

Comment 63 Ondrej Holy 2015-03-10 13:15:57 UTC

Created attachment 298991 [details] [review]
metadata: don't crash if write out failed

Comment 64 Ondrej Holy 2015-03-10 13:22:04 UTC

Created attachment 298993 [details] [review]
metadata: don't crash if write out failed

Comment 65 Alexander Larsson 2015-03-10 13:26:57 UTC

Review of attachment 298993 [details] [review]:

Looks good to me

Comment 66 Ross Lagerwall 2015-03-10 20:32:32 UTC

Review of attachment 298993 [details] [review]:

Looks good to me, but I don't know the code so well :-)

::: metadata/metatree.c
@@ +2365,3 @@
+	{
+	  /* It shouldn't happen. We failed to write out an updated tree
+	   * probably, therefor all the data are lost. Backup the file and

therefore

Comment 67 Ondrej Holy 2015-03-11 09:54:56 UTC

Comment on attachment 298993 [details] [review]
metadata: don't crash if write out failed

I've pushed it only into master and let's see what will happen...

commit f7f18bb6e7576672c2240c8ead8dfa4a2668d478

Comment 68 Ondrej Holy 2016-02-08 09:21:02 UTC

Comment on attachment 146914 [details] [review]
Check if MetaTree exists before flushing it

Marking as obsolete, because those changes are already in by another commits:
commit 26c3dc3986760483625262a4a3564f1696f829b3
commit a03635ee68f043b68c1bb4cb95114af3ecb65ec4

Comment 69 Ondrej Holy 2016-02-08 09:22:40 UTC

Comment on attachment 146922 [details] [review]
Nuke metadata file if magic blob is wrong

Marking as obsolete, see Bug 750113.

Comment 70 GNOME Infrastructure Team 2018-09-21 16:57:44 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gvfs/issues/122.