Bug 446277 – gnome-panel is leaking memory if orca is running

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 446277 - gnome-panel is leaking memory if orca is running


Summary:	gnome-panel is leaking memory if orca is running


Status:	RESOLVED FIXED

Product:	pyatspi2
Classification:	Applications
Component:	general
Version:	unspecified
Hardware:	Other Linux

Importance:	Urgent critical
Target Milestone:	---
Assigned To:	Eitan Isaacson
QA Contact:	Li Yuan

URL:
Whiteboard:

Duplicates:	491845 (view as bug list)
Depends on:
Blocks:	491756

Reported:	2007-06-11 08:51 UTC by Mario Lang
Modified:	2019-03-27 20:11 UTC

See Also:
GNOME target:	---
GNOME version:	2.17/2.18

Attachments
Standalone test case (8.51 KB, text/plain) 2007-06-13 19:12 UTC, Willie Walker		Details
Revised test case to make memory grow faster (8.66 KB, text/plain) 2007-06-13 19:33 UTC, Willie Walker		Details
at-spi patch (769 bytes, patch) 2007-07-09 10:59 UTC, Li Yuan	committed	Details \| Review
Test script (1.66 KB, text/plain) 2007-10-30 20:09 UTC, Eitan Isaacson		Details
Proposed pyatspi fix (1.28 KB, patch) 2007-11-01 21:32 UTC, Eitan Isaacson	none	Details \| Review
Proposed pyatspi fix (1.18 KB, patch) 2007-11-01 21:46 UTC, Eitan Isaacson	needs-work	Details \| Review
Test script (1.67 KB, text/x-python) 2007-11-12 20:16 UTC, Eitan Isaacson		Details
Fix (1.28 KB, patch) 2007-11-15 18:32 UTC, Eitan Isaacson	committed	Details \| Review

Description Mario Lang 2007-06-11 08:51:55 UTC

I noticed that gnome-panel needs 820MB of RAM on one machine
where I have orca running since some days now.

A little investigation showed that I can reproduce this problem on
another machine of mine easily.  Now gnome-panel is growing about
200KB ram every 10 secunds or so...

How to reproduce:
1. Navigate to the panel clock.
2. Enter clock preferences and enable "show seconds" checkbox.
3. Close clock preferences and wait.

If you now watch gnome-panel in top, you should be able to see
it grow more and more memory consumption steadily.

Comment 1 Willie Walker 2007-06-11 13:21:09 UTC

Please see also bug 435199 - Orca is bloating the swap partition, so the system is no more usable after a short time.  If you set orca.settings.debugMemoryUsage=True in your ~/.orca/orca-customizations.py file, Insert+Ctrl+F8 will dump some brief memory statistics to the console.

For this bug, the only way I can think that Orca would be causing the problem is if it wasn't properly unref'ing CORBA objects so that gnome-panel could free them.  I tried running the test case listed in the opener for this bug, though, and could not reproduce the behavior.  :-(  Using the new memory debug stuff, I also couldn't see any new objects being tracked by the garbage collector.  What is the rate (e.g., how many meg's a minute) at which you are observing memory growth?

Comment 2 Willie Walker 2007-06-12 00:50:51 UTC

(In reply to comment #1)
> Please see also bug 435199 - Orca is bloating the swap partition, so the system
> is no more usable after a short time.  If you set
> orca.settings.debugMemoryUsage=True in your ~/.orca/orca-customizations.py
> file, Insert+Ctrl+F8 will dump some brief memory statistics to the console.

I followed the test case described in this bug and let it run for 15 minutes.  During that time, I monitored the number of objects known to Python via Insert+Ctrl+F8 and the stuff from bug 435199.  I didn't see the overall number of objects in the Orca Python process increase at all.  But, I did observe that the memory usage (via top) of gnome-panel climbed during that time.  In particular, RES grew from 83m to 92m.  When I killed Orca and let the test case continue to run, gnome-panel no longer continued to grow.

So, I'm perplexed.  If Orca isn't growing, I'm not sure what it might be hanging onto that's causing gnome-panel to grow.  This makes me lean towards calling this a bug on the gnome-panel side, either in gnome-panel itself or the atk/gail/libgail-gnome support.

Comment 3 Willie Walker 2007-06-13 18:45:41 UTC

I did some more debugging, which consisted of the following:

1) Running "top -p `ps -elf | grep gnome-panel | grep -v grep |  awk '{ print $4 }'`" in one shell to watch the gnome-panel process

2) Selectively trimming (i.e., just adding 'return' statements to prevent code from being reached) where Orca was processing events and then rerunning Orca to see if gnome-panel grew

It turns out that the check for event.source.state.count(atspi.Accessibility.STATE_DEFUNCT) in focus_tracking_presenter.py:_processObjectEvent was a contributor to the problem.  By returning immediately after this call, gnome-panel grew.  By returning immediately before it, gnome-panel remained the same size.

When I then began trimming atspi.py:__get_state, it turns out that just the call to self.accessible.getState() was a contributor.  Now...it's off to check out why this is causing a leak.  I'm going to start by looking at some AT-SPI bugs: bug 375319, bug 427836, and bug 428007.

Comment 4 Willie Walker 2007-06-13 19:12:54 UTC

Created attachment 89901 [details]
Standalone test case

Here's a standlone test case that shows the problem.  If you comment out line 253 (foo = e.source.getState()), gnome-panel's memory doesn't grow when you run this test case.  If you uncomment the line, gnome-panel's memory slowly grows.  Still digging and will be trying the latest AT-SPI stuff to see if it makes the problem go away.

Comment 5 Willie Walker 2007-06-13 19:33:08 UTC

Created attachment 89903 [details]
Revised test case to make memory grow faster

This test case is the same as the first except it makes the gnome-panel memory grow faster by calling getState() 1000 times for each event.  It also tries to unref and del the result of the call to see if that solved any problems.  It didn't.  In addition, the problem continues to exist even with the latest AT-SPI.  So, I'm guessing there's a leak in the atk somewhere.

Comment 6 Willie Walker 2007-06-13 20:17:17 UTC

If I make the following changes to the event-listener-test of at-spi, I can also reproduce the memory growth when I run the ensuing event-listener-test:

wwalker@wwalker-laptop:~/at-spi/trunk/test$ svn diff
Index: event-listener-test.c
===================================================================
--- event-listener-test.c       (revision 933)
+++ event-listener-test.c       (working copy)
@@ -320,9 +320,12 @@
 void
 report_event (const AccessibleEvent *event, void *user_data)
 {
+  int i;
   static long count = 0;
   char *s = Accessible_getName (event->source);
-  fprintf (stderr, "%s %s\n", event->type, s);
+  fprintf (stderr, "FOO %s %s\n", event->type, s);
+  for (i = 0; i < 1000; i++)
+          Accessible_getStateSet (event->source);
   if (s) SPI_freeString (s);
   if (count == 0) {
          g_timer_reset (timer);

Comment 7 Willie Walker 2007-06-13 20:39:47 UTC

When I investigate this more, and change the call from getState() to getRelationSet() in the python test case, I don't see the memory grow.  I did that as a means to see if somehow the Python CORBA support was causing issues.  So, I'm still leaning towards this being something on the application side.

Li - this seems like there might be a leak in atk or gail.  Do you have any ideas?

Comment 8 Willie Walker 2007-06-13 22:57:23 UTC

BTW, to give an indication of the severity of the problem: I think any call to getState() will cause the application to leak.  

Orca uses the state of Accessible's all over the place as a means to help Orca make decisions about things.  A quick analysis of the Orca code shows about 137 references to ".state" in the code, each of which results in a call to getState().  When I add some debug to atspi.py:__get_state, I see there are typically about 3 calls to getState per event, and typically double that when Orca is actually trying to figure out what to present to the user.

When we consider things such as the wireless network signal strength indicator applet are constantly emitting events, this really is a bad problem, and the most convenient way to recover is by logging out and logging back in.

I'm marking this as "Urgent: This bug blocks usability of a large portion of the product, and really should be fixed before the next planned release." and "Critical: Crashes, causes loss of data, or is a severe memory leak."

Comment 9 Li Yuan 2007-06-14 15:56:02 UTC

Probably in at-spi/libspi/stateset.c, I will verify my patch when I am back to office tomorrow.

Comment 10 Willie Walker 2007-06-14 23:57:06 UTC

(In reply to comment #9)
> Probably in at-spi/libspi/stateset.c, I will verify my patch when I am back to
> office tomorrow.
> 

Yeah Li!  Thanks!  Do you have a patch you can post for future reference?

Comment 11 Li Yuan 2007-06-15 12:31:06 UTC

I'd like to. But I find the patch only solve part of the problem, trying to find other leaks. And I will paste the final patch.

Comment 12 Li Yuan 2007-06-25 10:39:24 UTC

Sorry for update the bug so lately. I tried several ways but still can not exterminate leak. I ran valgrind with gedit, and found this leak is relative  to getState. But I have no idea about how getState generates the leak. Maybe I need a little more time to look into it.

==9845== 354,003 (353,993 direct, 10 indirect) bytes in 8,852 blocks are definitely lost in loss record 234 of 237 
==9845==    at 0x4021765: malloc (vg_replace_malloc.c:149)                             
==9845==    by 0x4B27174: g_malloc (gmem.c:131)
==9845==    by 0x4F3FAFF: ORBit_alloc_string (in /usr/lib/libORBit-2.so.0.1.0)         
==9845==    by 0x4F3F7B8: CORBA_string_dup (in /usr/lib/libORBit-2.so.0.1.0)
==9845==    by 0x4F3F2FA: CORBA_exception_set (in /usr/lib/libORBit-2.so.0.1.0)        
==9845==    by 0x4F3F3A2: CORBA_exception_set_system (in /usr/lib/libORBit-2.so.0.1.0) 
==9845==    by 0x4F4D47D: (within /usr/lib/libORBit-2.so.0.1.0)
==9845==    by 0x4F4E7C1: ORBit_handle_request (in /usr/lib/libORBit-2.so.0.1.0)       
==9845==    by 0x4F379E6: giop_connection_handle_input (in /usr/lib/libORBit-2.so.0.1.0)
==9845==    by 0x4F5542C: (within /usr/lib/libORBit-2.so.0.1.0)
==9845==    by 0x4F5831D: (within /usr/lib/libORBit-2.so.0.1.0)                        
==9845==    by 0x4B1EFA3: g_main_dispatch (gmain.c:2061)

Comment 13 Li Yuan 2007-06-25 11:01:05 UTC

The full trace:

==17068== 335,643 (335,633 direct, 10 indirect) bytes in 8,393 blocks are definitely lost in loss record 230 of 234
==17068==    at 0x4021765: malloc (vg_replace_malloc.c:149)
==17068==    by 0x4B27174: g_malloc (gmem.c:131)
==17068==    by 0x4F44F65: ORBit_alloc_string (allocators.c:228)
==17068==    by 0x4F4488A: CORBA_string_dup (corba-string.c:22)
==17068==    by 0x4F441F0: CORBA_exception_set (corba-env.c:118)
==17068==    by 0x4F4418B: CORBA_exception_set_system (corba-env.c:102)
==17068==    by 0x4F53FAF: ORBit_POA_handle_request (poa.c:1555)
==17068==    by 0x4F5915D: ORBit_handle_request (orbit-adaptor.c:298)
==17068==    by 0x4F3B7E1: giop_connection_handle_input (giop-recv-buffer.c:1308)
==17068==    by 0x4F61148: link_connection_io_handler (linc-connection.c:1387)
==17068==    by 0x4F6389F: link_source_dispatch (linc-source.c:159)
==17068==    by 0x4B1EFA3: g_main_dispatch (gmain.c:2061)

Comment 14 Li Yuan 2007-06-25 11:02:20 UTC

Hi Michael, do you have any idea about this?

Comment 15 Li Yuan 2007-06-26 08:47:07 UTC

Ha, I got an explanation. There is a memory leak in registryd, and a simple patch can fix it:
Index: libspi/accessible.c
===================================================================
--- libspi/accessible.c (revision 933)
+++ libspi/accessible.c (working copy)
@@ -331,6 +331,10 @@
 
   bonobo_return_val_if_fail (object != NULL, NULL, ev);
 
+  old_set = g_object_get_data (object, "atk_state_set");
+  if (old_set)
+    g_object_unref (old_set);
+
   atk_set = atk_object_ref_state_set (object);
   
   set = spi_state_set_new (atk_set);
@@ -338,6 +342,7 @@
                                  BONOBO_OBJREF(set),
                                  ev);
 
+  g_object_set_data (object, "atk_state_set", atk_set);
   return retval;
 }

But after applied the patch, the memory is still growing when I ran the python script. Thus I found the trace I pasted yesterday. It seems like a CORBA momery leak when there is an exception (maybe it is not a leak, I'm not very familiar with that section of code in CORBA). I looked into the script, it tries to unref the state until there is an exception. So I changed the script to:

--- bug_446277.py       2007-06-26 16:38:24.000000000 +0800
+++ bug_446277_modified.py      2007-06-26 16:22:39.000000000 +0800
@@ -252,11 +252,8 @@
 def notifyEvent(e):
     for i in range(0, 100):
         foo = e.source.getState()
-        while True:
-            try:
-                foo.unref()
-            except:
-                break
+        foo.unref()
+        foo.unref()
         del foo
     pass

I unref the state twice because I know the ref_count will be 2.
After this, the memory just stop growing. So I think my original patch can fix the leak.

Comment 16 Willie Walker 2007-06-26 11:51:38 UTC

(In reply to comment #15)
> Ha, I got an explanation. There is a memory leak in registryd, and a simple
> patch can fix it:

Yeah!  Thanks Li!

> --- bug_446277.py       2007-06-26 16:38:24.000000000 +0800
> +++ bug_446277_modified.py      2007-06-26 16:22:39.000000000 +0800
> @@ -252,11 +252,8 @@
>  def notifyEvent(e):
>      for i in range(0, 100):
>          foo = e.source.getState()
> -        while True:
> -            try:
> -                foo.unref()
> -            except:
> -                break
> +        foo.unref()
> +        foo.unref()
>          del foo
>      pass
> 
> I unref the state twice because I know the ref_count will be 2.
> After this, the memory just stop growing. So I think my original patch can fix
> the leak.

With your patch, if you remove the unref()'s altogether, does the leak still occur?  I'm trying to figure out when we have to call unref --- I'm assuming we shouldn't have to.

Comment 17 Li Yuan 2007-06-26 13:22:09 UTC

I think the best way is to unref it on the Orca side. Because at-spi doesn't know when Orca doesn't use the state anymore.

Comment 18 Li Yuan 2007-06-26 14:13:45 UTC

Oh sorry, I answer your question in a wrong order :)
Yes, we need to unref it to avoid leak.
If Orca doesn't want to unref the state, at-spi can unref it when the next time Orca calls getState on the same object. Does this sound OK?

The patch will be:

Index: libspi/accessible.c
===================================================================
--- libspi/accessible.c (revision 933)
+++ libspi/accessible.c (working copy)
@@ -325,12 +325,21 @@
                                         CORBA_Environment     *ev)
 {
   AtkObject *object = get_atkobject_from_servant (servant);
-  AtkStateSet *atk_set;
+  AtkStateSet *atk_set, *old_set;
   SpiStateSet *set;
-  Accessibility_StateSet retval;
+  Accessibility_StateSet retval, old_retval;
 
   bonobo_return_val_if_fail (object != NULL, NULL, ev);
 
+  old_set = g_object_get_data (object, "spi_state_set");
+  if (old_set)
+    g_object_unref (old_set);
+
+  old_retval = g_object_get_data (object, "return_state");
+  if (old_retval) {
+    Bonobo_Unknown_unref (old_retval, ev);
+    Bonobo_Unknown_unref (old_retval, ev);
+  }
   atk_set = atk_object_ref_state_set (object);
   
   set = spi_state_set_new (atk_set);
@@ -338,6 +347,8 @@
                                  BONOBO_OBJREF(set),
                                  ev);
 
+  g_object_set_data (object, "spi_state_set", atk_set);
+  g_object_set_data (object, "return_state", retval);
   return retval;
 }

Comment 19 Willie Walker 2007-06-26 21:18:17 UTC

I'm not sure of the right thing to do here.  Maybe Michael Meeks can help us? 
Michael?

Comment 20 Li Yuan 2007-07-01 15:29:46 UTC

What about at-spi reduces the ref_count to 1, and Orca unref it once? I think it is reasonable because only Orca itself knows when it doesn't use the state anymore.

Comment 21 Michael Meeks 2007-07-04 16:04:42 UTC

So - yes, this is a rather simple resource lifecycle issue. If you call 'getStateSet' then you have to unref the handle later - period :-)

It is of course, unfortunate that the StateSet is an object reference instead of a blob of data we can throw over in one chunk, but ... an unfortunate mistake of API design that can't easily be helped now [ burns valuable round-trips too ]

Comment 22 Willie Walker 2007-07-08 16:45:24 UTC

(In reply to comment #21)
> So - yes, this is a rather simple resource lifecycle issue. If you call
> 'getStateSet' then you have to unref the handle later - period :-)
> 
> It is of course, unfortunate that the StateSet is an object reference instead
> of a blob of data we can throw over in one chunk, but ... an unfortunate
> mistake of API design that can't easily be helped now [ burns valuable
> round-trips too ]
> 

Yikes!  I'm going to get getRelationSet has the same problem.  Is the unref'ing of this kind of thing something that can be done by PyORBit (e.g., in the destructor for these objects)?

Comment 23 Michael Meeks 2007-07-09 08:36:40 UTC

Willie - basically no; the ORB has no knowledge of such reference counts: indeed CORBA has no lifecycle awareness at all.
IIWY I would wrap the 'getRelationSet' call - and return an 'OrcaRelationSet' (or whatever) that does unref the object on dispose.
Of course - the same problem holds true for ~all handles to remote Accessible objects, so ... this question is a little concerning ;-) - it cuts to the heart of the a11y / lifecycle debate :-) [ in which I contend that using reference counting was a "really bad idea" (TM) ;-].

Comment 24 Li Yuan 2007-07-09 10:59:47 UTC

Created attachment 91486 [details] [review]
at-spi patch

Comment 25 Willie Walker 2007-07-09 11:48:24 UTC

(In reply to comment #24)
> Created an attachment (id=91486) [edit]
> at-spi patch
> 

Thanks! Do we need something similar for impl_accessibility_accessible_get_relation_set as well?

Comment 26 Michael Meeks 2007-07-09 12:27:52 UTC

yes - you need to unref each member of the sequence.

Comment 27 Willie Walker 2007-07-09 12:59:01 UTC

The StateSet code for Orca does this:

    def __get_state(self):
        """Returns the Accessible StateSeq of this object, which is a
        sequence of Accessible StateTypes.
        """

        try:
            stateSet = self.accessible.getState()
        except:
            stateSet = None
        if stateSet:
            try:
                state = stateSet._narrow(Accessibility.StateSet).getStates()
            except:
                state = []
        else:
            state = []

        # [[[WDW - we don't seem to always get appropriate state changed
        # information, so we will not cache state information.]]]
        #
        #if state and settings.cacheValues:
        #    self.state = state

        return state

I'm guessing we could just do a "stateSet.unref()" right before the "return state".  Does that seem right?  Do we also need to unref the 'state' thing, too?

In addition, the RelationSet code for Orca does this:

    def __get_relations(self):
        """Returns the Accessible RelationSet of this object as a list.
        """

        relations = []

        relationSet = self.accessible.getRelationSet()

        for relation in relationSet:
            try:
                relations.append(relation._narrow(Accessibility.Relation))
            except:
                pass

        return relations

I'm guessing the same thing applies -- we could just do a "relationSet.unref()" right before the "return relations".  Does this seem right?  Do we also need to unref each element of relations after we've used it?

Comment 28 Michael Meeks 2007-07-09 16:09:17 UTC

The stateset unref code looks fine - yes.
Unfortunately the relationSet is itself a sequence; you can't unref that. The sequence however contains a set of 'Relation' objects - each of which need to be unreffed when 'relations' is itself freed: so if you can handle the lifecycle of 'relations' and unref each item in that on dispose you'll have it nailed I think.

The answers to your questions though all live in at-spi/libspi/accessible.c and friends - and a few moments thought from the IDL :-)

Comment 29 Willie Walker 2007-07-10 12:15:58 UTC

Reassigning this back to Orca for the rest of the work.

Comment 30 Willie Walker 2007-07-21 23:20:46 UTC

Michael Meeks was very generous with his time at GUADEC and we had a good chat about this. The result is a summary of considerations I think we might need to make to reduce/eliminate leaks we might be causing in applications. It all tends to revolve around us not doing unref()'s on objects when we need to.

We should only do ref() on a non-primitive type (e.g., a CORBA Accessible) if we got it via an event and we expect it to persist after we handle the event. If we get non-primitive types in other ways (e.g., method calls and field references), the ref() is implicitly done for us. This also includes the result of calls to queryInterface.

We always need to unref() non-primitive types that we get via method calls that we make (e.g., obj.getChildAtIndex(i)). We also need to unref() fields that are non-primitive types (e.g., obj.parent) in objects we get via method calls.

So...this, includes anything where we might get an Accessible. This includes the following method calls, and I suspect we can tackle most of this issue via our 'makeAccessible' method: add a incrementRefCount=False parameter to this method and only set it to True if we are calling this with an object we get via an Event. The Accessible.__del__ method will take care of the unref() for us.

Accessible accessible.parent
Accessible accessible.getChildAtIndex
Accessible accessible.component.getAccessibleAtPoint
Accessible accessible.hyperlink.getObject
Accessible accessible.selection.getSelectedChild
Accessible accessible.table.caption
Accessible accessible.table.summary
Accessible accessible.table.getAccessibleAt
Accessible accessible.table.getRowHeader
Accessible accessible.table.getColumnHeader
Application accessible.getApplication

We also need to free up non-primitive types we get directly. These include:

The objects returned by all calls to queryInterface. We probably
can check for these in the accessible's dictionary in the
Accessible.__del__ method. The specific fields are:
action, component, hyperlink, hypertext, image, selection, document,
table, text, value

StateSet accessible.getState
-> Probably can fix this by calling stateSet.unref() just before
returning from atspi.py:__get_state.

BoundingBox accessible.component.getExtents
-> We might consider copying the BoundingBox fields to a local
object and then unref'ing the original. The problem is that
this is a call on the component interface and not the
accessible itself. We don't have a wrapper for a component.

BoundingBox accessible.image.getImageExtents
-> We might consider copying the BoundingBox fields to a local
object and then unref'ing the original. The problem is that
this is a call on the image interface and not the
accessible itself. We don't have a wrapper for an image.

Desktop registry.getDesktop
-> Probably can fix this by creating a __del__ for atspi.py:Registry
and doing an unref() of registry.desktop in it.

DevEvtCtrlr registry.getDeviceEventController
-> This is used in atspi.py:KeystrokeListener. Probably should save
a single reference to the result of
"d = self.registry.getDeviceEventController"
and then do an unref() on it in the __del__ method for
atspi.py:KeystrokeListener.

RelationSet accessible.getRelationSet
-> We don't need to unref() the result of the call (it's an array).
Instead, we need to unref each Relation. I'm not sure of the
best way to do this. We might consider just providing an
atspi.py:deleteRelationSet or something like that that does
the unref()'ing for us.

DesktopSeq registry.getDesktopList
-> Same command as RelationSet (need to unref each Desktop)

RangeList accessible.text.getBoundedRanges
-> Same comment as RelationSet (need to unref each Range), though we
might copy each of the Range fields to a local object and then
unref() the original. The problem, however, is similar to that
of component.getExtents and image.getImageExtents -- we're working
with the text interface and not a wrapped accessible.

Comment 31 Willie Walker 2007-08-13 19:06:44 UTC

We are going to postpone working on this bug until after the pyatspi integration is done.  Then, Eitan will look at this for us.  :-)

Comment 32 Eitan Isaacson 2007-10-30 20:09:54 UTC

Created attachment 98203 [details]
Test script

This will probably only work in an English locale. The script takes an accessible object in gedit's "open files" dialog which has a relation and hammers it with getState() and getRelation() 100,000 times each.

To use this test script:
1. Run gedit
2. Do ctrl+o to get "Open Files" dialog
3. To see gedit's memory usage run in a terminal: top -p `ps -elf | grep gedit | grep -v grep |  awk '{ print $4 }'`
4. Execute given script.
5. Watch gedit's memory usage quadruple.

Comment 33 Eitan Isaacson 2007-10-30 20:18:33 UTC

See bug 491845 for a fix that solves the test case above. Through trial and error I found out that everything needs to be unreffed twice. Any ideas why?

Comment 34 Willie Walker 2007-10-31 14:39:51 UTC

(In reply to comment #33)
> See bug 491845 for a fix that solves the test case above. Through trial and
> error I found out that everything needs to be unreffed twice. Any ideas why?

In comment #15, Li also noticed the need to unref twice.  Comment #30 might contain some information about this.  In particular: 

"We should only do ref() on a non-primitive type (e.g., a CORBA Accessible) if
we got it via an event and we expect it to persist after we handle the event. 
If we get non-primitive types in other ways (e.g., method calls and field
references), the ref() is implicitly done for us.  This also includes the
result of calls to queryInterface.

We always need to unref() non-primitive types that we get via method calls that
we make (e.g., obj.getChildAtIndex(i)).  We also need to unref() fields that
are non-primitive types (e.g., obj.parent) in objects we get via method calls."

So...if this is true, maybe pyatspi is doing one too many refs?  However, maybe some debugging using the new ref count feature provided by bug 471391 will shed some insight on the accuracy of when things are already ref'd for us and when they are not.

Comment 35 Eitan Isaacson 2007-11-01 21:28:43 UTC

Yep, pyatspi was doing one too many refs, it was a result of my erroneous patch in bug #490202. So StateSet only needs to be unreffed once now. For some reason Relations still need two unrefs(), anyone know why?

Comment 36 Eitan Isaacson 2007-11-01 21:29:18 UTC

*** Bug 491845 has been marked as a duplicate of this bug. ***

Comment 37 Eitan Isaacson 2007-11-01 21:32:30 UTC

Created attachment 98345 [details] [review]
Proposed pyatspi fix

Just copied this over from bug #491845.

Comment 38 Eitan Isaacson 2007-11-01 21:46:02 UTC

Created attachment 98346 [details] [review]
Proposed pyatspi fix

The previous patch's changes were commented out :P
Ok, last spam for today.
The next post will be interesting.

Comment 39 Willie Walker 2007-11-06 17:36:14 UTC

The leakage existed prior to the pyatpi migration, so I'm removing this as a blocker for pyatspi migration bug 448848.  It's still a blocker for performance bug 491756.

Comment 40 Willie Walker 2007-11-06 18:23:42 UTC

(In reply to comment #38)
> Created an attachment (id=98346) [edit]
> Proposed pyatspi fix

This patch does a double unref() of the relation, which seems strange.  Eitan and I discussed this today, and we're perplexed about where the extra ref is coming from.

Michael and Li, we could use your expertise on this one.  Li fixed an AccessibleState reference problem with the patch at http://bugzilla.gnome.org/attachment.cgi?id=91486.  I'm wondering if something similar might be needed for AccessibleRelation stuff.  

Michael and Li, here's a link to accessible.c.  The AccessibleState code starts at line 322 and the AccessibleRelation code starts at line 349.  Would you be able to take a look and offer us some guidance?

http://svn.gnome.org/viewvc/at-spi/trunk/libspi/accessible.c?view=annotate

In particular, I'm wondering if anything glaring might be (or not be) in impl_accessibility_accessible_get_relation_set that might the source of the double ref count.

Thanks!

Will

Comment 41 Willie Walker 2007-11-07 15:07:46 UTC

From private e-mail -- This Member's Excellent Engineering Kickbutt Summary helps a lot.  I'm withholding the name in the event this member doesn't want to be identified as the goto guy for speaking into the Bonobo/ORBit Michaelrophone:

"The bug looks like it is in at-spi/libspi/accessible.c:

    retval->_buffer[i] =
        bonobo_object_dup_ref (
          BONOBO_OBJREF (
            spi_relation_new (atk_relation_set_get_relation(relation_set, i))),
	  ev);

Is broken - we create the relation with 1 ref; so this incs that to two; instead we should use:

	CORBA_Object_duplicate (BONOBO_OBJREF (spi_relation_new...),
				NULL);

but not object_dup_ref."

Thanks, Mysterious Enquiring Enigma of Keen Smarts!  :-)

Eitan, can you give this a shot with your test environment?

Comment 42 Li Yuan 2007-11-12 08:40:27 UTC

Hi Will, I made the patch. But I found no matter if I apply the patch, when I ran the test script, memory grows when getState, and stop growing when getRelationSet. I am a little confused about the bug, are we leaking memory from getState but not getRelationset? Or am I doing something wrong?

Comment 43 Eitan Isaacson 2007-11-12 20:16:08 UTC

Created attachment 98989 [details]
Test script

This test script uses a better sample accessible for the test "Character Coding:". The "Location:" label is not always visible, so it has no relations.

Comment 44 Eitan Isaacson 2007-11-15 18:32:02 UTC

Created attachment 99155 [details] [review]
Fix

This fixes the unref() issue for both StateSet and Relation

Comment 45 Willie Walker 2007-11-27 15:29:19 UTC

It seems as though this bug probably can be closed now.  Thoughts?

Comment 46 Eitan Isaacson 2007-11-27 20:55:17 UTC

This is fixed in at-spi head (both at-spi proper, and in pyatspi). We should see this in the next at-spi dev release.