After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 737768 - fs: TrackerCrawler can crash due to invalid container pointers
fs: TrackerCrawler can crash due to invalid container pointers
Status: RESOLVED FIXED
Product: tracker
Classification: Core
Component: Miners
1.0.x
Other All
: Normal normal
: ---
Assigned To: tracker-general
: 726190 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2014-10-02 08:26 UTC by Debarshi Ray
Modified: 2014-10-14 14:21 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
libtracker-miner: Ensure that the async function uses owned data (821 bytes, patch)
2014-10-02 08:28 UTC, Debarshi Ray
none Details | Review
libtracker-miner: Simplify parent directory tracking (1.84 KB, patch)
2014-10-02 08:29 UTC, Debarshi Ray
none Details | Review
libtracker-miner: Simplify parent directory tracking (2.49 KB, patch)
2014-10-02 09:32 UTC, Debarshi Ray
none Details | Review
libtracker-miner: Ensure that the async function uses owned data (1.17 KB, patch)
2014-10-02 13:10 UTC, Debarshi Ray
none Details | Review
libtracker-miner: Simplify parent directory tracking (3.59 KB, patch)
2014-10-02 13:11 UTC, Debarshi Ray
none Details | Review
libtracker-miner: Ensure that the async function uses owned data (3.66 KB, patch)
2014-10-02 13:13 UTC, Debarshi Ray
committed Details | Review
[tracker-1.2] libtracker-miner: Ensure that the async function uses owned data (2.46 KB, patch)
2014-10-14 13:42 UTC, Debarshi Ray
committed Details | Review

Description Debarshi Ray 2014-10-02 08:26:16 UTC
This one is a bit mysterious.

Been triaging some downstream Fedora and RHEL crash reports and finally figured one of them out. Most of the reports, including this one, are against 1.0/0.16.x since that is what most users are using. The code in master/1.2.x has changed a bit so things don't match line to line, but I think that the bug might still be there. I'll try to explain it based on the old code, and then see if master/1.2.x needs a fix too.

https://bugzilla.redhat.com/show_bug.cgi?id=1096307 (RHEL bug so not everything might be public) had this:

Core was generated by `/usr/libexec/tracker-miner-fs'.
Program terminated with signal 11, Segmentation fault.
  • #0 enumerator_data_process
    at tracker-crawler.c line 623
  • #0 enumerator_data_process
    at tracker-crawler.c line 623
  • #1 file_enumerate_next_cb
    at tracker-crawler.c line 713
  • #2 next_async_callback_wrapper
    at gfileenumerator.c line 311
  • #3 g_task_return_now
    at gtask.c line 1105
  • #4 complete_in_idle_cb
    at gtask.c line 1114
  • #5 g_main_dispatch
    at gmain.c line 3058
  • #6 g_main_context_dispatch
    at gmain.c line 3634
  • #7 g_main_context_iterate
    at gmain.c line 3705
  • #8 g_main_loop_run
    at gmain.c line 3899
  • #9 main
    at tracker-main.c line 1090
  • #1 file_enumerate_next_cb
    at tracker-crawler.c line 713
  • #0 enumerator_data_process
    at tracker-crawler.c line 623
  • #0 magazine_chain_pop_head
    at gslice.c line 528
  • #1 thread_memory_magazine1_alloc
    at gslice.c line 835
  • #2 g_slice_alloc
    at gslice.c line 994
  • #3 g_slist_prepend
    at gslist.c line 267
  • #4 directory_processing_data_add_child
    at tracker-crawler.c line 370
  • #5 file_enumerate_next_cb
    at tracker-crawler.c line 747
  • #6 next_async_callback_wrapper
    at gfileenumerator.c line 311
  • #7 g_task_return_now
    at gtask.c line 1105
  • #8 complete_in_idle_cb
    at gtask.c line 1114
  • #9 g_main_dispatch
    at gmain.c line 3058
  • #10 g_main_context_dispatch
    at gmain.c line 3634
  • #11 g_main_context_iterate
    at gmain.c line 3705
  • #12 g_main_loop_run
    at gmain.c line 3899
  • #13 main
    at tracker-main.c line 1090
  • #5 file_enumerate_next_cb
    at tracker-crawler.c line 747
child_name = <optimized out>
is_dir = 0
crawler = 0x7f0ca0003660
ed = 0x7f0c7c0e8e10
enumerator = 0x7f0c8800ac50
parent = 0x7f0ca4019b20
child = <optimized out>
info = 0x287a2a0
files = 0x2bae8c0 = {0x7f0c7c0ca2c0, 0x7f0c7c0ca290, 0x298c380, 0x298ba10, 0x298bb60, 0x277bc10, 
  0x298aac0, 0x287a2a0, 0x28cca60, 0x2df40f0, 0x7f0c7c126c70, 0x2868f20, 0x2785700, 0x27856a0, 
  0x2a7f990, 0x2acc430, 0x299b660, 0x28bd2c0, 0x298ad00, 0x28c52d0, 0x7f0c7c05f2d0, 0x26e6930, 
  0x299b750, 0x28a9ca0}
l = 0x2bb6580 = {0x287a2a0, 0x28cca60, 0x2df40f0, 0x7f0c7c126c70, 0x2868f20, 0x2785700, 
  0x27856a0, 0x2a7f990, 0x2acc430, 0x299b660, 0x28bd2c0, 0x298ad00, 0x28c52d0, 0x7f0c7c05f2d0, 
  0x26e6930, 0x299b750, 0x28a9ca0}
error = 0x0
cancelled = <optimized out>
(gdb) print *(GLocalFile *)parent
$1 = {parent_instance = {g_type_instance = {g_class = 0x2bb34c0}, ref_count = 0, qdata = 0x80}, 
  filename = 0x2d077e0 "\260E\325\002"}
(gdb) print ed->dir_info->node->data
$2 = (gpointer) 0x25c91b0
(gdb) print *(GLocalFile *)(ed->dir_info->node->data)
$3 = {parent_instance = {g_type_instance = {g_class = 0x28ba450}, ref_count = 0, qdata = 0x80}, 
  filename = 0x0}
(gdb)

Both child_name and child are optimized out, but there is definitely something fishy with parent and ed->dir_info->node->data. Also, no clue why they are not pointing to the same thing.

Going through the session logs of of all the tracker-miner-fs crash reports, I found this recurring pattern:
     GLib-GIO-CRITICAL **: g_file_get_child: assertion `G_IS_FILE (file)' failed
     GLib-GObject-CRITICAL **: g_object_set_qdata_full: assertion `G_IS_OBJECT (object)' failed
     GLib-GObject-CRITICAL **: g_object_ref: assertion `G_IS_OBJECT (object)' failed
     GLib-GObject-CRITICAL **: g_object_unref: assertion `G_IS_OBJECT 

That looks like these failures:
    * G_IS_FILE fails inside g_file_get_child, so child=0x0
    * G_IS_OBJECT fails inside g_object_set_qdata_full because child=0x0
    * G_IS_OBJECT fails inside g_object_ref (child) called from directory_processing_data_add_child -> directory_child_data_new
    * G_IS_OBJECT fails inside g_object_unref (child)

You can find these session logs here (Fedora bugs, so nothing hidden there):
https://bugzilla.redhat.com/show_bug.cgi?id=971182
https://bugzilla.redhat.com/show_bug.cgi?id=986779
https://bugzilla.redhat.com/show_bug.cgi?id=990605

Continuing our previous GDB session, we have the following in frame #5:
(gdb) print *(GLocalFile*)ed->dir_file
$5 = {parent_instance = {g_type_instance = {g_class = 0x258d0a0}, ref_count = 8, 
    qdata = 0x298d0a0}, 
  filename = 0x287b0f0 "/home/john/Documents/Books/Documentation/books/"}
(gdb) print *(GLocalFile*)enumerator->priv->container
$6 = {parent_instance = {g_type_instance = {g_class = 0x258d0a0}, ref_count = 8, 
    qdata = 0x298d0a0}, 
  filename = 0x287b0f0 "/home/john/Documents/Books/Documentation/books/"}
(gdb)

That looks sane.

My theory is that ed->dir_info->node is getting destroyed while we are still using it. Possibly due to a bug in the way we track the lifetime of the DirectoryRootInfo instances. (Note how the above pattern in the session logs usually appear in multiples of 6, which is the number of default roots in the configuration.)

In any case ed->dir_file was meant to keep the directory that is currently being enumerated, so I think we should just use it instead of ed->dir_info->node->data. In fact, we don't really need ed->dir_file because we can just use g_file_enumerator_get_container which serves the same purpose.
Comment 1 Debarshi Ray 2014-10-02 08:26:52 UTC
*** Bug 726190 has been marked as a duplicate of this bug. ***
Comment 2 Debarshi Ray 2014-10-02 08:28:48 UTC
Created attachment 287563 [details] [review]
libtracker-miner: Ensure that the async function uses owned data
Comment 3 Debarshi Ray 2014-10-02 08:29:21 UTC
Created attachment 287564 [details] [review]
libtracker-miner: Simplify parent directory tracking
Comment 4 Debarshi Ray 2014-10-02 09:32:04 UTC
Created attachment 287568 [details] [review]
libtracker-miner: Simplify parent directory tracking
Comment 5 Debarshi Ray 2014-10-02 13:09:07 UTC
I stumbled across:
https://bugzilla.redhat.com/show_bug.cgi?id=1031416

It looks like the emission of check-directory-contents is similarly affected:
g_signal_emit (crawler, signals[CHECK_DIRECTORY_CONTENTS], 0, ed->dir_info->node->data, children, &use);
Comment 6 Debarshi Ray 2014-10-02 13:10:12 UTC
Created attachment 287581 [details] [review]
libtracker-miner: Ensure that the async function uses owned data
Comment 7 Debarshi Ray 2014-10-02 13:11:25 UTC
Created attachment 287582 [details] [review]
libtracker-miner: Simplify parent directory tracking
Comment 8 Debarshi Ray 2014-10-02 13:13:53 UTC
Created attachment 287584 [details] [review]
libtracker-miner: Ensure that the async function uses owned data

Merged the two patches together as suggested by Carlos.
Comment 9 Martyn Russell 2014-10-03 10:39:43 UTC
Debarshi, the latest patch looks ok, I've just skimmed over it though. I also wonder if you plan to fix the newer work or if the bug even exists there?
Comment 10 Debarshi Ray 2014-10-05 09:45:55 UTC
Comment on attachment 287584 [details] [review]
libtracker-miner: Ensure that the async function uses owned data

Pushed to tracker-1.0 and tracker-0.16. I will check the newer branches now.
Comment 11 Debarshi Ray 2014-10-14 13:42:19 UTC
Created attachment 288517 [details] [review]
[tracker-1.2] libtracker-miner: Ensure that the async function uses owned data
Comment 12 Martyn Russell 2014-10-14 13:50:01 UTC
Comment on attachment 288517 [details] [review]
[tracker-1.2] libtracker-miner: Ensure that the async function uses owned data

Thanks again for this patch, please go ahead. It looks right to me. Presumably the unit tests are passing too?
Comment 13 Debarshi Ray 2014-10-14 14:21:00 UTC
(In reply to comment #12)
> (From update of attachment 288517 [details] [review])
> Thanks again for this patch, please go ahead. It looks right to me. Presumably
> the unit tests are passing too?

Yes, the test suite is passing. Thanks for taking a look!