GNOME Bugzilla – Bug 640611
Crash in dconf_engine_setup_user
Last modified: 2011-02-04 14:20:05 UTC
I have a test suite for Deja Dup that seems to (sort of) reliably hit a dconf crash. The assertion it trips on is:

ERROR:../engine/dconf-engine.c:140:dconf_engine_setup_user: assertion failed: ((engine->gvdbs[0] == NULL) >= (engine->shm == NULL))

I got a traceback, but it doesn't make sense to me: ...
+ Trace 225704
Now, when I examine gvdbs and shm, I see that shm is NULL (as expected), but bizarrely:

(gdb) print *engine->gvdbs[0]
$5 = {ref_count = 1, data = 0x7f1e643be000 <Address 0x7f1e643be000 out of bounds>, size = 548, mapped = 0x1480b60, byteswapped = 0, trusted = 0, bloom_words = 0x7f1e643be020, n_bloom_words = 0, bloom_shift = 0, hash_buckets = 0x7f1e643be020, n_buckets = 12, hash_items = 0x7f1e643be050, n_hash_items = 12}

This should be impossible if I'm reading the code right. Are there threading issues here?

Now, my situation is funky. My test suite runs in its own private dbus-launch environment with special HOME & XDG_* variables so as not to interact with my own dconf instance. So maybe that's doing something?

I'm running Ubuntu natty, with dconf 0.7.1-0ubuntu2. I can provide more info about the crash or how to reproduce (which may be slightly involved). You could also grab me on IRC.
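For anyone puzzling over the assertion quoted above: in C each `== NULL` comparison yields 0 or 1, so the `>=` check fails only when gvdbs[0] is set while shm is NULL, which is the state the gdb output shows. A minimal sketch of that logic follows. The `toy_engine` struct and both function names are hypothetical stand-ins for illustration, not dconf's real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields named in the assertion;
 * this is NOT dconf's real struct layout. */
struct toy_engine {
    void *gvdb0; /* plays the role of engine->gvdbs[0] */
    void *shm;   /* plays the role of engine->shm */
};

/* Same shape as the failing check: (gvdb0 == NULL) >= (shm == NULL).
 * The only failing combination is 0 >= 1, i.e. gvdb0 is non-NULL
 * while shm is NULL. */
bool toy_invariant_holds(const struct toy_engine *e)
{
    return (e->gvdb0 == NULL) >= (e->shm == NULL);
}

/* Aborts in the same way the reported assertion does when the
 * invariant is violated (assumption: called at setup time). */
void toy_engine_setup_check(const struct toy_engine *e)
{
    assert(toy_invariant_holds(e));
}
```

Of the four NULL/non-NULL combinations, three pass; only "database mapped, shm missing" trips the check.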
I saw this once or twice too, but I looked at the code and couldn't figure out how it was possible at all. Your threading argument is a good possibility. The other thing I was considering is heap corruption. Can you distill a good/small test case that triggers this somewhat reliably? I'd love to nail this one.
So the reason that this hasn't shown up in testing is that three things have to occur at the same time:

- a write, so that the file needs to be reloaded
- a read, so that the file is reloaded in the thread of the reader (i.e. the main application thread)
- adding a watch, so that the file is reloaded in dconf's worker thread

I don't have any test cases that do all of those at once, so it's not caught. Probably this can be fixed by adding locks, but maybe I can come up with a clever way to avoid that.
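The "adding locks" option mentioned above could look roughly like the sketch below. It is not dconf's code and not necessarily the fix that was applied. It uses POSIX threads instead of GLib primitives to stay self-contained, and every name in it (`toy_state`, `toy_state_reload`, `toy_state_check`, `toy_run_demo`) is hypothetical. Two threads reload the same state concurrently; because the reload and the check both hold the mutex, neither thread can observe the half-updated pair.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state: two fields that must change together. */
struct toy_state {
    pthread_mutex_t lock;
    int *table;      /* stands in for the mapped database */
    int *shm;        /* stands in for the shared-memory pointer */
    int violations;  /* times the invariant was seen broken */
};

static int dummy_table, dummy_shm;

/* Reload both fields. The intermediate states (table set, shm NULL)
 * are hidden from other threads by the lock. */
void toy_state_reload(struct toy_state *s)
{
    pthread_mutex_lock(&s->lock);
    s->shm = NULL;
    s->table = NULL;
    s->table = &dummy_table;
    s->shm = &dummy_shm;
    pthread_mutex_unlock(&s->lock);
}

/* Same shape as the assertion from the report, counted, not fatal. */
void toy_state_check(struct toy_state *s)
{
    pthread_mutex_lock(&s->lock);
    if (!((s->table == NULL) >= (s->shm == NULL)))
        s->violations++;
    pthread_mutex_unlock(&s->lock);
}

/* Both the "reader" and the "worker" thread run this loop. */
static void *toy_hammer(void *arg)
{
    struct toy_state *s = arg;
    for (int i = 0; i < 100000; i++) {
        toy_state_reload(s);
        toy_state_check(s);
    }
    return NULL;
}

/* Runs two concurrent reloaders; returns the violation count. */
int toy_run_demo(void)
{
    struct toy_state s = { .table = NULL, .shm = NULL, .violations = 0 };
    pthread_t reader, worker;
    int rc;

    rc = pthread_mutex_init(&s.lock, NULL);
    assert(rc == 0);
    rc = pthread_create(&reader, NULL, toy_hammer, &s);
    assert(rc == 0);
    rc = pthread_create(&worker, NULL, toy_hammer, &s);
    assert(rc == 0);
    pthread_join(reader, NULL);
    pthread_join(worker, NULL);
    pthread_mutex_destroy(&s.lock);
    return s.violations;
}
```

Build with `-pthread`. With the lock in place `toy_run_demo()` should report no violations; removing the lock/unlock calls makes the window between the two assignments visible to the other thread, which is the kind of race described in the comment.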
Does this mean you found a new test case yourself? I would have difficulty parsing what I have down to something manageable. But a full run of the Deja Dup test suite always hits it at some point. If running that would be useful, see http://live.gnome.org/DejaDup/GettingInvolved/Coding to get set up and then enter the 'tests' subdirectory of a deja-dup checkout and type 'make test'.
I think I fixed this. Testing is appreciated.
Yeah, a couple of run-throughs of my suite didn't hit it, so I'm happy. (It would usually trigger about once per run.)