After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 334217 - Huge leak in at-spi-registryd
Huge leak in at-spi-registryd
Status: RESOLVED FIXED
Product: at-spi
Classification: Platform
Component: registry
1.7.x
Other Linux
: High major
: ---
Assigned To: bill.haneman
bill.haneman
Depends on:
Blocks:
 
 
Reported: 2006-03-11 14:12 UTC by Vincent Untz
Modified: 2006-03-14 17:42 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch removing some CORBA_string_dup() causing leaks (4.97 KB, patch)
2006-03-11 17:53 UTC, Vincent Untz
committed Details | Review

Description Vincent Untz 2006-03-11 14:12:47 UTC
A few days ago, I killed at-spi-registryd and got back 100M of RAM.

Today, I ran it in valgrind for one or two hours. I didn't have all the symbols, but I found something interesting anyway. Note that I might be wrong...

==5043==    at 0x401C422: malloc (vg_replace_malloc.c:149)
==5043==    by 0x475DF36: g_malloc (gmem.c:131)
==5043==    by 0x415C7F7: ORBit_alloc_string (in /usr/lib/libORBit-2.so.0.1.0)
==5043==    by 0x415C2FB: CORBA_string_dup (in /usr/lib/libORBit-2.so.0.1.0)
==5043==    by 0x406AE4A: spi_init_any_nil (util.c:129)
==5043==    by 0x804C730: (within /usr/lib/at-spi/at-spi-registryd)
==5043==    by 0x804C9E0: (within /usr/lib/at-spi/at-spi-registryd)
==5043==    by 0x804CA7D: (within /usr/lib/at-spi/at-spi-registryd)
==5043==    by 0x47583A7: ??? (gmain.c:3292)
==5043==    by 0x47567D6: g_main_context_dispatch (gmain.c:1916)

==5043== 623,008 (590,220 direct, 32,788 indirect) bytes in 16,395 blocks are definitely lost in loss record 1,216 of 1,217
==5043==    at 0x401D7AA: calloc (vg_replace_malloc.c:279)
==5043==    by 0x475DEBF: g_malloc0 (gmem.c:150)
==5043==    by 0x415CA7C: ORBit_alloc_by_tc (in /usr/lib/libORBit-2.so.0.1.0)
==5043==    by 0x415724C: ORBit_small_alloc (in /usr/lib/libORBit-2.so.0.1.0)
==5043==    by 0x406AE10: spi_init_any_nil (util.c:121)
==5043==    by 0x804C730: (within /usr/lib/at-spi/at-spi-registryd)
==5043==    by 0x804C9E0: (within /usr/lib/at-spi/at-spi-registryd)
==5043==    by 0x804CA7D: (within /usr/lib/at-spi/at-spi-registryd)
==5043==    by 0x47583A7: ??? (gmain.c:3292)
==5043==    by 0x47567D6: g_main_context_dispatch (gmain.c:1916)

So, with this knowledge, I looked at the code. Here's what I found:

 + in the code of the registry, spi_init_any_nil() is always called with 
   CORBA_string_dup ("") as the last argument. But in spi_init_any_nil(),
   we use the last argument (name) to do this:
   details->source_name = CORBA_string_dup (name);
   eeeek
   We leak the argument.

 + here's a typical example of the code:

  spi_init_any_nil (&e.any_data,
		    e.source,
		    Accessibility_ROLE_DESKTOP_FRAME,
		    CORBA_string_dup (""));
  Accessibility_Accessible_unref (a, &ev);
  Accessibility_Registry_notifyEvent (BONOBO_OBJREF (registry),
				      &e, &ev);
  Accessibility_Desktop_unref (e.source, &ev);
  CORBA_exception_free (&ev);

   Now look at the spi_init_any_nil() code:

void 
spi_init_any_nil (CORBA_any *any_details, 
		  Accessibility_Application app, 
		  Accessibility_Role role,
		  CORBA_string name)
{
  Accessibility_EventDetails *details = Accessibility_EventDetails__alloc();

  any_details->_type = TC_Accessibility_EventDetails;
  any_details->_value = details;
  any_details->_release = TRUE;

  details->host_application = app;
  details->source_role = role;
  details->source_name = CORBA_string_dup (name);

  details->any_data._type = TC_null;
  details->any_data._value = NULL;
  details->any_data._release = TRUE;
}

    Note that we alloc details, and it's never freed after we're done notifying
    the event. This leaks the details structure, but also details->source_name.
Comment 1 bill.haneman 2006-03-11 14:57:00 UTC
"    Note that we alloc details, and it's never freed after we're done notifying
    the event. This leaks the details structure, but also details->source_name."

I think you'll find that we do free the details after notifying the event; see bridge.c line 697, at the end of spi_atk_emit_eventv.

Any extra dups in args spi_init_any_nil in registryd.c do need to be fixed of course.  

Are you using CVS HEAD or at least 1.7.6+?  There was indeed a huge leak like this in 1.7.5.  See bug 332924.

Comment 2 Kjartan Maraas 2006-03-11 16:28:07 UTC
I still see this and other leaks in at-spi with 1.7.6+ here too.

This trace is from an evolution run on march 7th:

 ==2764== 669,394 (491,748 direct, 177,646 indirect) bytes in 13,643 blocks are definitely lost in loss record 313 of 319
==2764==    at 0x40044E5: calloc (vg_replace_malloc.c:260)
==2764==    by 0x4E8A0BD: g_malloc0 (gmem.c:150)
==2764==    by 0x47B8EB5: ORBit_alloc_by_tc (allocators.c:373)
==2764==    by 0x47B429C: ORBit_small_alloc (orbit-small.c:44)
==2764==    by 0x52A6117: spi_init_any_string (util.c:163)
==2764==    by 0x4123853: spi_atk_bridge_init_string (bridge.c:1208)
==2764==    by 0x412400C: spi_atk_bridge_window_event_listener (bridge.c:1121)
==2764==    by 0x4E39F3C: signal_emit_unlocked_R (gsignal.c:2404)
==2764==    by 0x4E3B686: g_signal_emit_valist (gsignal.c:2197)
==2764==    by 0x4E3B848: g_signal_emit (gsignal.c:2241)
==2764==    by 0x5253A70: window_focus (gailutil.c:537)
==2764==    by 0x49B0B6D: _gtk_marshal_BOOLEAN__BOXED (gtkmarshalers.c:83)
==2764==    by 0x4E29ECC: g_closure_invoke (gclosure.c:490)
==2764==    by 0x4E3A162: signal_emit_unlocked_R (gsignal.c:2438)
==2764==    by 0x4E3B44E: g_signal_emit_valist (gsignal.c:2207)
==2764==    by 0x4E3B848: g_signal_emit (gsignal.c:2241)
==2764==    by 0x4A9B1F7: gtk_widget_event_internal (gtkwidget.c:3732)
==2764==    by 0x49AB8C8: gtk_main_do_event (gtkmain.c:1390)
==2764==    by 0x4CBE059: gdk_event_dispatch (gdkevents-x11.c:2291)
==2764==    by 0x4E82B5C: g_main_context_dispatch (gmain.c:1916)
==2764==    by 0x4E85DDE: g_main_context_iterate (gmain.c:2547)
==2764==    by 0x4E86188: g_main_loop_run (gmain.c:2751)
==2764==    by 0x470FAD2: bonobo_main (bonobo-main.c:311)
==2764==    by 0x805C839: main (main.c:610)

Perhaps more disturbingly I see these invalid reads in there too:

==2764== Invalid read of size 1
==2764==    at 0x4006397: memcpy (mac_replace_strmem.c:403)
==2764==    by 0x47B8E6D: CORBA_string_dup (corba-string.c:23)
==2764==    by 0x52A623B: spi_init_any_nil (util.c:129)
==2764==    by 0x41238C6: spi_atk_bridge_init_nil (bridge.c:1188)
==2764==    by 0x41241BB: spi_atk_bridge_signal_listener (bridge.c:1076)
==2764==    by 0x4E39F3C: signal_emit_unlocked_R (gsignal.c:2404)
==2764==    by 0x4E3B686: g_signal_emit_valist (gsignal.c:2197)
==2764==    by 0x4E3E42D: g_signal_emit_by_name (gsignal.c:2265)
==2764==    by 0x52329FC: gail_combo_box_changed_gtk (gailcombobox.c:213)
==2764==    by 0x4E370F8: g_cclosure_marshal_VOID__VOID (gmarshal.c:77)
==2764==    by 0x4E29ECC: g_closure_invoke (gclosure.c:490)
==2764==    by 0x4E3A162: signal_emit_unlocked_R (gsignal.c:2438)
==2764==    by 0x4E3B686: g_signal_emit_valist (gsignal.c:2197)
==2764==    by 0x4E3B848: g_signal_emit (gsignal.c:2241)
==2764==    by 0x4916FBD: gtk_combo_box_set_active_internal (gtkcombobox.c:4463)
==2764==    by 0x65C7AE1: emae_refresh_authtype (em-account-editor.c:1529)
==2764==    by 0x65C8D58: emae_service_provider_changed (em-account-editor.c:1290)
==2764==    by 0x65C97E3: emae_setup_service (em-account-editor.c:1681)
==2764==    by 0x65C9B2E: emae_receive_page (em-account-editor.c:1809)
==2764==    by 0x41E6B30: ec_rebuild (e-config.c:568)
==2764==    by 0x41E76F0: ec_idle_handler_for_rebuild (e-config.c:932)
==2764==    by 0x4E811A0: g_idle_dispatch (gmain.c:3796)
==2764==    by 0x4E82B5C: g_main_context_dispatch (gmain.c:1916)
==2764==    by 0x4E85DDE: g_main_context_iterate (gmain.c:2547)
==2764==    by 0x4E86188: g_main_loop_run (gmain.c:2751)
==2764==    by 0x470FAD2: bonobo_main (bonobo-main.c:311)
==2764==    by 0x805C839: main (main.c:610)
==2764==  Address 0x9063484 is 4 bytes inside a block of size 8 free'd
==2764==    at 0x4004B6C: free (vg_replace_malloc.c:222)
==2764==    by 0x4E89FC0: g_free (gmem.c:187)
==2764==    by 0x5232DC7: gail_combo_box_get_name (gailcombobox.c:244)
==2764==    by 0x4D048C8: atk_object_get_name (atkobject.c:571)
==2764==    by 0x4123796: spi_atk_bridge_init_base (bridge.c:1175)
==2764==    by 0x41238A9: spi_atk_bridge_init_nil (bridge.c:1187)
==2764==    by 0x41241BB: spi_atk_bridge_signal_listener (bridge.c:1076)
==2764==    by 0x4E39F3C: signal_emit_unlocked_R (gsignal.c:2404)
==2764==    by 0x4E3B686: g_signal_emit_valist (gsignal.c:2197)
==2764==    by 0x4E3E42D: g_signal_emit_by_name (gsignal.c:2265)
==2764==    by 0x52329FC: gail_combo_box_changed_gtk (gailcombobox.c:213)
==2764==    by 0x4E370F8: g_cclosure_marshal_VOID__VOID (gmarshal.c:77)
==2764==    by 0x4E29ECC: g_closure_invoke (gclosure.c:490)
==2764==    by 0x4E3A162: signal_emit_unlocked_R (gsignal.c:2438)
==2764==    by 0x4E3B686: g_signal_emit_valist (gsignal.c:2197)
==2764==    by 0x4E3B848: g_signal_emit (gsignal.c:2241)
==2764==    by 0x4916FBD: gtk_combo_box_set_active_internal (gtkcombobox.c:4463)
==2764==    by 0x65C7AE1: emae_refresh_authtype (em-account-editor.c:1529)
==2764==    by 0x65C8D58: emae_service_provider_changed (em-account-editor.c:1290)
==2764==    by 0x65C97E3: emae_setup_service (em-account-editor.c:1681)
==2764==    by 0x65C9B2E: emae_receive_page (em-account-editor.c:1809)
==2764==    by 0x41E6B30: ec_rebuild (e-config.c:568)
==2764==    by 0x41E76F0: ec_idle_handler_for_rebuild (e-config.c:932)
==2764==    by 0x4E811A0: g_idle_dispatch (gmain.c:3796)
==2764==    by 0x4E82B5C: g_main_context_dispatch (gmain.c:1916)
==2764==    by 0x4E85DDE: g_main_context_iterate (gmain.c:2547)
==2764==    by 0x4E86188: g_main_loop_run (gmain.c:2751)

gailcombobox.c:244 is the g_free() here:

  combo_box = GTK_COMBO_BOX (widget);
  gail_combo_box = GAIL_COMBO_BOX (obj);
  g_free (gail_combo_box->name);
  if (gtk_combo_box_get_active_iter (combo_box, &iter))

Is that correct?
Comment 3 Vincent Untz 2006-03-11 17:45:34 UTC
(In reply to comment #1)
> "    Note that we alloc details, and it's never freed after we're done
> notifying
>     the event. This leaks the details structure, but also
> details->source_name."
> 
> I think you'll find that we do free the details after notifying the event; see
> bridge.c line 697, at the end of spi_atk_emit_eventv.

Oh, right. I'm happy that I warned that I might be wrong ;-)

Does this also free details->source_name or just details? I get the impression it just frees details.

> Any extra dups in args spi_init_any_nil in registryd.c do need to be fixed of
> course.  
> 
> Are you using CVS HEAD or at least 1.7.6+?  There was indeed a huge leak like
> this in 1.7.5.  See bug 332924.

I have 1.7.6 now, and I'm 75% sure I already had it when I saw my huge leak. I'll watch the issue again.
Comment 4 Vincent Untz 2006-03-11 17:53:27 UTC
Created attachment 61097 [details] [review]
Patch removing some CORBA_string_dup() causing leaks
Comment 5 bill.haneman 2006-03-11 20:01:16 UTC
Thanks Vincent.  I was talking to Michael Meeks about the CORBA_free, he said it's a deep free and should free details->source_name OK.  But I would not be surprised if there are other nasties lurking, thanks for checking!

Removing unnecessary dup's is always treacherous, I'll have to check the code paths carefully to make sure there isn't some condition that can cause an un-dup'ed string to get freed...
Comment 6 bill.haneman 2006-03-13 13:34:47 UTC
Vincent, I checked your patch and I believe it's correct.  But I wonder if the leak you're seeing could be accounted for by just leaking an empty string...

Kjartan, the g_free in gailcombobox in your comment #2 looks very suspicious.  I'll have a look at that as well.
Comment 7 bill.haneman 2006-03-13 13:54:21 UTC
Kjartan: The gtkcombobox free now is tracked as its own bugid, bug 334438.
Comment 8 bill.haneman 2006-03-14 17:42:08 UTC
Closing this bug for now; if (ahem, when) you find another at-spi leak, possibly a new bug should be opened with the relevant trace info.  Thanks Vincent!