Bug 630293 - SEGV in magazine_cache_trim()
Status: RESOLVED OBSOLETE
Product: evolution
Classification: Applications
Component: general
Version: 3.0.x (obsolete)
Hardware: Other Linux
Importance: Normal critical
Target Milestone: ---
Assigned To: Evolution Shell Maintainers Team
QA Contact: Evolution QA team
Depends on:
Blocks:
Reported: 2010-09-21 20:44 UTC by David Woodhouse
Modified: 2014-03-07 10:04 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
valgrind annotations for gslice (2.76 KB, patch)
2010-10-12 20:57 UTC, David Woodhouse
needs-work

Description David Woodhouse 2010-09-21 20:44:44 UTC
Program received signal SIGSEGV, Segmentation fault.

Thread 140735164679952 (LWP 3485)

  • #0 magazine_cache_trim
    at gslice.c line 596
  • #1 magazine_cache_push_magazine
    at gslice.c line 657
  • #2 private_thread_memory_cleanup
    at gslice.c line 724
  • #3 __nptl_deallocate_tsd
    at pthread_create.c line 154
  • #4 start_thread
    at pthread_create.c line 308
  • #5 clone
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 115

Comment 1 David Woodhouse 2010-09-21 20:46:29 UTC
(gdb) t a a bt

Thread 2559 (Thread 0x7fffa5dd5710 (LWP 3879))

  • #0 __lll_lock_wait
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S line 136
  • #1 _L_lock_868
    from /lib64/libpthread.so.0
  • #2 __pthread_mutex_lock
    at pthread_mutex_lock.c line 61
  • #3 magazine_cache_push_magazine
    at gslice.c line 640
  • #4 private_thread_memory_cleanup
    at gslice.c line 724
  • #5 __nptl_deallocate_tsd
    at pthread_create.c line 154
  • #6 start_thread
    at pthread_create.c line 308
  • #7 clone
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 115

Thread 2500 (Thread 0x7fff757f8710 (LWP 3485))

  • #0 magazine_cache_trim
    at gslice.c line 596
  • #1 magazine_cache_push_magazine
    at gslice.c line 657
  • #2 private_thread_memory_cleanup
    at gslice.c line 724
  • #3 __nptl_deallocate_tsd
    at pthread_create.c line 154
  • #4 start_thread
    at pthread_create.c line 308
  • #5 clone
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 115

Thread 1 (Thread 0x7ffff4d96940 (LWP 28720))

  • #0 g_str_hash
    at gstring.c line 135
  • #1 g_hash_table_lookup_node
    at ghash.c line 308
  • #2 g_hash_table_lookup
    at ghash.c line 897
  • #3 g_quark_from_string_internal
    at gdataset.c line 1047
  • #4 g_quark_from_string
    at gdataset.c line 1077
  • #5 g_object_dispatch_properties_changed
    at gobject.c line 799
  • #6 g_object_notify_queue_thaw
    at gobjectnotifyqueue.c line 132
  • #7 g_object_newv
    at gobject.c line 1386
  • #8 g_object_new
    at gobject.c line 1178
  • #9 mail_msg_new
    at mail-mt.c line 95
  • #10 ping_store
    at mail-folder-cache.c line 830
  • #11 g_hash_table_foreach
    at ghash.c line 1324
  • #12 ping_cb
    at mail-folder-cache.c line 841
  • #13 g_timeout_dispatch
    at gmain.c line 3555
  • #14 g_main_dispatch
    at gmain.c line 2119
  • #15 g_main_context_dispatch
    at gmain.c line 2672
  • #16 g_main_context_iterate
    at gmain.c line 2750
  • #17 g_main_loop_run
    at gmain.c line 2958
  • #18 IA__gtk_main
    at gtkmain.c line 1219
  • #19 main
    at main.c line 671

Comment 2 David Woodhouse 2010-09-22 15:45:10 UTC
I suspect this one is probably related...

Program received signal SIGSEGV, Segmentation fault.

Thread 140734917236496 (LWP 22528)

  • #0 magazine_chain_pop_head
    at gslice.c line 486
  • #1 thread_memory_magazine1_alloc
    at gslice.c line 789
  • #2 g_slice_alloc
    at gslice.c line 827
  • #3 g_slist_prepend
    at gslist.c line 272
  • #4 pool_depth_list
    at gparam.c line 1218
  • #5 g_hash_table_foreach
    at ghash.c line 1324
  • #6 g_param_spec_pool_list
    at gparam.c line 1279
  • #7 g_object_class_list_properties
    at gobject.c line 659
  • #8 object_state_write
    at camel-object.c line 269
  • #9 camel_object_state_write
    at camel-object.c line 453
  • #10 vee_folder_sync
    at camel-vee-folder.c line 1087
  • #11 camel_folder_sync
    at camel-folder.c line 1124
  • #12 refresh_folders_exec
    at mail-send-recv.c line 910
  • #13 mail_msg_proxy
    at mail-mt.c line 487
  • #14 g_thread_pool_thread_proxy
    at gthreadpool.c line 314
  • #15 g_thread_create_proxy
    at gthread.c line 1897
  • #16 start_thread
    at pthread_create.c line 301
  • #17 clone
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 115
$1 = (ChunkLink **) 0x7fffcbe62810
(gdb) p *magazine_chunks
$2 = (ChunkLink *) 0x9350f0
(gdb) p chunk
$3 = (ChunkLink *) 0x7470697263736564
(gdb) p (char *)&chunk
$4 = 0x9350f8 "descript\a"
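
The value gdb prints for chunk is not a plausible heap address: read one byte at a time in x86_64 memory order, it spells ASCII text, which is why `p (char *)&chunk` shows "descript". That suggests string data was written over the free-list link. A minimal C sketch of the decoding follows; `pointer_bytes_as_text` is a hypothetical helper written for this note, not part of glib or gdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the bytes of `value`, least significant
 * first (the order they occupy in memory on x86_64), into `out` and
 * NUL-terminates it. Non-printable bytes are shown as '.'.
 * `out` must have room for 9 chars. */
static void pointer_bytes_as_text(uint64_t value, char out[9])
{
    assert(out != NULL);
    for (size_t i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)(value >> (8 * i));
        out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
    }
    out[8] = '\0';
}
```

Passing 0x7470697263736564 (the value of chunk above) fills the buffer with "descript".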
Comment 3 Milan Crha 2010-09-24 14:16:43 UTC
I recall a similar issue which, if I remember correctly, was caused by freeing GSlice-allocated memory with g_free(). The above looks like a random crash to me, so do you have any steps to reproduce it, please? Running valgrind with G_SLICE=always-malloc may help here.
Comment 4 David Woodhouse 2010-09-24 15:14:32 UTC
I've no idea how to reproduce, I'm afraid -- apart from the fact that in the Red Hat abrt bug, it seems it happened when I hit Ctrl-R to reply to a message.

It seems very random. I've been running in valgrind a lot and haven't seen anything that seems related. And wouldn't valgrind complain about using g_free on GSlice allocations? Or does using G_SLICE=always-malloc make that work out OK from valgrind's point of view?
Comment 5 Milan Crha 2010-09-27 08:55:29 UTC
To be honest, I do not know. Valgrind usually complains when you try to free something "in the middle" of an allocated memory block (and I suppose GSlice uses one large memory block), so you might be right that G_SLICE=always-malloc can be counterproductive for such issues, though it fits most others perfectly.
Comment 6 David Woodhouse 2010-10-08 09:08:24 UTC
Just seen it again with current master. Had just started evolution and was reading mail.

Program received signal SIGSEGV, Segmentation fault.

Thread 140736181815056 (LWP 14337)

  • #0 magazine_chain_pop_head
    at gslice.c line 492
  • #1 thread_memory_magazine1_alloc
    at gslice.c line 795
  • #2 g_slice_alloc
    at gslice.c line 833
  • #3 g_slist_prepend
    at gslist.c line 273
  • #4 pool_depth_list
    at gparam.c line 1224
  • #5 g_hash_table_foreach
    at ghash.c line 1328
  • #6 g_param_spec_pool_list
    at gparam.c line 1285
  • #7 g_object_class_list_properties
    at gobject.c line 774
  • #8 object_state_write
    at camel-object.c line 269
  • #9 camel_object_state_write
    at camel-object.c line 453
  • #10 vee_folder_synchronize_sync
    at camel-vee-folder.c line 1366
  • #11 camel_folder_synchronize_sync
    at camel-folder.c line 3324
  • #12 refresh_folders_exec
    at mail-send-recv.c line 914
  • #13 mail_msg_proxy
    at mail-mt.c line 473

Comment 7 David Woodhouse 2010-10-08 09:27:01 UTC
This looks like the same problem as bugs 624081, 618128, 623822, 623246, and maybe 621314.
Comment 8 David Woodhouse 2010-10-12 08:43:41 UTC
Came down this morning and found evolution sitting at the same crash again. I don't think there's any special trick to reproducing it.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffdddf5710 (LWP 31367)]
magazine_chain_pop_head (mem_size=16) at gslice.c:492
492	      (*magazine_chunks)->data = chunk->next;
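
The faulting line follows a link stored inside a free chunk. A toy intrusive free list, sketched below under the assumption that something wrote through a stale pointer after the chunk was freed, shows how such a write replaces the link with arbitrary bytes. `ToyPool`, `toy_free`, `toy_peek_next` and `stale_write` are hypothetical names; this is not glib's magazine code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy allocator: a free chunk stores the link to the next
 * free chunk in its own first bytes (an intrusive free list). */
typedef struct Chunk { struct Chunk *next; } Chunk;
typedef struct { Chunk *free_head; } ToyPool;

/* Push a chunk onto the free list, reusing its memory for the link. */
static void toy_free(ToyPool *pool, void *mem)
{
    Chunk *chunk = mem;
    assert(pool != NULL && mem != NULL);
    chunk->next = pool->free_head;
    pool->free_head = chunk;
}

/* The pointer a later pop would follow. It is returned, never
 * dereferenced, so the sketch itself cannot crash. */
static const void *toy_peek_next(const ToyPool *pool)
{
    return pool->free_head ? pool->free_head->next : NULL;
}

/* Simulates the suspected bug: the caller kept `mem` after freeing it
 * and now stores string data there, overwriting the link. */
static void stale_write(void *mem, const char *text)
{
    size_t n = strlen(text);
    if (n > sizeof(Chunk))
        n = sizeof(Chunk);
    memcpy(mem, text, n);
}
```

After freeing two chunks and calling stale_write() on the most recently freed one with "descript", toy_peek_next() returns a pointer whose bytes are that text rather than the address of the other chunk. A real allocator dereferences that pointer on the next allocation, which matches the crash site.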
Comment 9 David Woodhouse 2010-10-12 10:13:00 UTC
What we should really do is put the appropriate valgrind client calls into gslice.c -- see http://valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools

Then we wouldn't need to use G_SLICE=always-malloc to get sane results out of valgrind, and our use of valgrind wouldn't hide this problem.

But I'm far too lazy to do that today for both allocators, so I'm using a dirty hack to ensure that using g_free() on GSlice memory will upset valgrind even with G_SLICE=always-malloc:

--- gslice.c~	2010-09-13 16:57:51.000000000 +0100
+++ gslice.c	2010-10-12 11:08:21.000000000 +0100
@@ -839,7 +839,7 @@ g_slice_alloc (gsize mem_size)
       g_mutex_unlock (allocator->slab_mutex);
     }
   else                          /* delegate to system malloc */
-    mem = g_malloc (mem_size);
+    mem = g_malloc (mem_size + 8) + 8;
   if (G_UNLIKELY (allocator->config.debug_blocks))
     smc_notify_alloc (mem, mem_size);
 
@@ -904,7 +904,7 @@ g_slice_free1 (gsize    mem_size,
     {
       if (G_UNLIKELY (g_mem_gc_friendly))
         memset (mem_block, 0, mem_size);
-      g_free (mem_block);
+      g_free (mem_block - 8);
     }
   TRACE (GLIB_SLICE_FREE((void*)mem_block, mem_size));
 }
@@ -980,7 +980,7 @@ g_slice_free_chain_with_offset (gsize   
           abort();
         if (G_UNLIKELY (g_mem_gc_friendly))
           memset (current, 0, mem_size);
-        g_free (current);
+        g_free (current - 8);
       }
 }
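
The hack above can be tried outside glib. The sketch below, an illustration rather than the patch itself, wraps malloc() so that callers receive a pointer 8 bytes past the real block; handing that pointer straight to free() is then an invalid free that valgrind can report. `offset_alloc`, `offset_free` and `offset_base` are hypothetical names standing in for the always-malloc paths of g_slice_alloc() and g_slice_free1().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { SLICE_OFFSET = 8 };  /* same 8-byte shift as the patch above */

/* Hypothetical stand-in for the malloc path of g_slice_alloc():
 * allocate a little extra and return a pointer into the block. */
static void *offset_alloc(size_t size)
{
    assert(size <= SIZE_MAX - SLICE_OFFSET);  /* guard against overflow */
    unsigned char *base = malloc(size + SLICE_OFFSET);
    return base ? base + SLICE_OFFSET : NULL;
}

/* The address malloc() really returned for `mem`. */
static void *offset_base(void *mem)
{
    return (unsigned char *)mem - SLICE_OFFSET;
}

/* Hypothetical stand-in for g_slice_free1(): undo the shift first.
 * Calling free(mem) directly would be the bug valgrind should flag. */
static void offset_free(void *mem)
{
    if (mem)
        free(offset_base(mem));
}
```

One caveat: on platforms where malloc() returns 16-byte-aligned blocks, the shifted pointer is only 8-byte aligned, so this suits debugging builds rather than production use.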
Comment 10 David Woodhouse 2010-10-12 20:57:23 UTC
Created attachment 172216
valgrind annotations for gslice

This adds annotations to gslice, although it's not quite right because it still keeps track of the pages allocated with memalign(). So although it mostly seems to give the right complaints, it gives the wrong allocation trace:

==2251== 1 errors in context 1 of 2:
==2251== Invalid free() / delete / delete[]
==2251==    at 0x4A04D72: free (vg_replace_malloc.c:325)
==2251==    by 0x400588: main (in /home/dwmw2/slice)
==2251==  Address 0x4f38020 is 32 bytes inside a block of size 496 alloc'd
==2251==    at 0x4A04360: memalign (vg_replace_malloc.c:532)
==2251==    by 0x4A043B9: posix_memalign (vg_replace_malloc.c:660)
==2251==    by 0x4C683EB: slab_allocator_alloc_chunk (gslice.c:1164)
==2251==    by 0x4C69970: g_slice_alloc (gslice.c:682)
==2251==    by 0x400565: main (in /home/dwmw2/slice)

In this case, 0x4f38020 was *also* allocated by a call to g_slice_alloc()...
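
For reference, the memcheck mempool client requests mentioned in comment 9 look roughly like this in a toy bump allocator. This is a sketch under assumptions, not the attached patch: `TinyPool` and its functions are hypothetical, the no-op fallback macros exist only so the file builds without the valgrind headers, and because the page still comes from malloc(), memcheck keeps tracking it as one block, which is the limitation described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
/* Valgrind headers not found: compile the annotations away. */
#  define VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed) ((void)0)
#  define VALGRIND_DESTROY_MEMPOOL(pool) ((void)0)
#  define VALGRIND_MEMPOOL_ALLOC(pool, addr, size) ((void)0)
#  define VALGRIND_MEMPOOL_FREE(pool, addr) ((void)0)
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len) ((void)0)
#endif

/* Hypothetical toy pool: one page, chunks handed out front to back. */
typedef struct {
    unsigned char *page;
    size_t page_size;
    size_t used;
} TinyPool;

static int tiny_pool_init(TinyPool *pool, size_t page_size)
{
    assert(pool != NULL);
    pool->page = malloc(page_size);
    if (!pool->page)
        return -1;
    pool->page_size = page_size;
    pool->used = 0;
    VALGRIND_CREATE_MEMPOOL(pool, 0, 0);                /* register the pool */
    VALGRIND_MAKE_MEM_NOACCESS(pool->page, page_size);  /* nothing handed out yet */
    return 0;
}

static void *tiny_pool_alloc(TinyPool *pool, size_t size)
{
    size_t rounded = (size + 15u) & ~(size_t)15u;  /* keep chunks 16-byte aligned */
    if (rounded < size || rounded > pool->page_size - pool->used)
        return NULL;
    void *chunk = pool->page + pool->used;
    pool->used += rounded;
    VALGRIND_MEMPOOL_ALLOC(pool, chunk, size);  /* chunk becomes addressable */
    return chunk;
}

static void tiny_pool_free(TinyPool *pool, void *chunk)
{
    (void)pool;
    (void)chunk;
    VALGRIND_MEMPOOL_FREE(pool, chunk);  /* toy: the memory is not recycled */
}

static void tiny_pool_destroy(TinyPool *pool)
{
    VALGRIND_DESTROY_MEMPOOL(pool);
    free(pool->page);
    pool->page = NULL;
}
```

Outside valgrind the client requests do nothing, so the annotations can stay in a normal build.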
Comment 11 Milan Crha 2010-10-13 07:07:47 UTC
(In reply to comment #10)
> valgrind annotations for gslice

Well, you should rather open a bug against glib and offer them this change; they certainly will not look for glib patches in evolution bugs. On the other hand, if this patch fixes the initial issue, then this bug can safely be moved to glib instead of filing a new one.
Comment 12 David Woodhouse 2010-10-13 10:11:55 UTC
The patch doesn't fix anything; it just makes valgrind work a little better with gslice, so we don't have to use G_SLICE=always-malloc and a valgrind run should actually catch this bug.

But it isn't finished yet; I'll submit it when I've got a response to http://www.mail-archive.com/valgrind-users@lists.sourceforge.net/msg02045.html and fixed the remaining issues.
Comment 13 David Woodhouse 2010-10-19 21:59:26 UTC
See bug 335126 for gslice/valgrind discussion. Since implementing the offset allocation as described in comment 9 above, I still haven't managed to trigger this.
Comment 14 David Woodhouse 2010-10-26 11:04:26 UTC
I still haven't managed to reproduce this... until last night, when I tried to connect with 2.32 when I wasn't on the VPN. (Yes, it should *know* that it needs a VPN connection, and it should ask NetworkManager to make one before that particular account comes online, but that's about three separate RFEs for another day).

This wasn't the box with my debugging version of glib, but it does make me strongly suspect that this is a manifestation of bug 631290 and bug 632212.
Comment 15 Milan Crha 2010-10-27 06:47:59 UTC
(In reply to comment #14)
> This wasn't the box with my debugging version of glib, but it does make me
> strongly suspect that this is a manifestation of bug 631290 and bug 632212.

Do you propose closing this one in favour of one of those bugs? It may also be part of bug #631804. I saw a strange crash in mail_msg_free, but after a workaround for it there was no crash for a day (on another user's machine).
Comment 16 David Woodhouse 2010-10-27 13:06:31 UTC
Dunno. In comment 7 I said this looked very similar to a number of other bugs, and I'm not sure that *all* of those are going to be the same as the imapx connect failure one. I suspect there may be a few memory corruptors, and we'll really need the gslice/valgrind stuff working to properly get to the bottom of all of them.
Comment 17 André Klapper 2013-03-27 08:46:46 UTC
Comment on attachment 172216
valgrind annotations for gslice

Patch "isn't finished yet" according to comment 12 --> needs-work
Comment 18 Milan Crha 2014-03-07 10:04:44 UTC
I'm closing this as obsolete.