Bug 112222 - poll problem (busy loop)
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: general
Version: 2.2.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Duplicates: 145225
Depends on:
Blocks: 51157
 
 
Reported: 2003-05-04 17:42 UTC by Christian Krause
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
small hack to solve the issue (1.34 KB, patch)
2003-05-04 17:45 UTC, Christian Krause
Attempt at a fix (5.04 KB, patch)
2003-06-02 19:25 UTC, Owen Taylor
New version of patch with a bug-fix (5.15 KB, patch)
2003-06-03 23:23 UTC, Owen Taylor

Description Christian Krause 2003-05-04 17:42:35 UTC
While debugging a strange problem in galeon
(http://bugzilla.gnome.org/show_bug.cgi?id=51157) we have found a bug in glib:

In short: galeon takes nearly 100% CPU time when showing modal dialogs.

Galeon starts the main event processing loop. This loop is part of
the glib library and monitors a set of so-called "sources" for events. Most
sources are represented by one or more file descriptors. The events are sent
by other threads or processes and are handled by event handlers.

Waiting for events on file descriptors is done by calling poll() on the
file descriptor set. If data is available on a file descriptor, the
event handler of the source that owns this file descriptor is
called.
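
The poll-and-dispatch step described above can be sketched in plain POSIX C.
This is only an illustration, not glib code: struct source, handler_fn,
iterate_once and read_one_byte are hypothetical names made up for this sketch.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_SOURCES 16

/* Hypothetical handler type: returns the number of bytes it consumed. */
typedef int (*handler_fn)(int fd);

/* Hypothetical stand-in for a "source": one fd plus its event handler. */
struct source {
    int fd;
    handler_fn handler;
};

/* One loop iteration: poll every source's fd, then call the handler of
 * each fd that has data. Returns the number of handlers called, or -1
 * on error. */
int iterate_once(struct source *sources, int n, int timeout_ms)
{
    struct pollfd fds[MAX_SOURCES];
    int dispatched = 0;

    if (n < 0 || n > MAX_SOURCES)
        return -1;

    for (int i = 0; i < n; i++) {
        fds[i].fd = sources[i].fd;
        fds[i].events = POLLIN;
        fds[i].revents = 0;
    }

    if (poll(fds, (nfds_t)n, timeout_ms) < 0)
        return -1;

    for (int i = 0; i < n; i++) {
        if (fds[i].revents & POLLIN) {
            sources[i].handler(sources[i].fd);
            dispatched++;
        }
    }
    return dispatched;
}

/* Example handler: consume a single byte from the fd. */
int read_one_byte(int fd)
{
    char c;
    return (int)read(fd, &c, 1);
}
```

As long as every handler consumes the data that woke up poll(), the next
iteration blocks again until new data arrives.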

Some event handlers are reentrant and may be called recursively; others are
not. This is indicated by the flag G_SOURCE_CAN_RECURSE.

The problem is that it is possible (and it occurs in galeon) for a second,
recursive main loop to be started (e.g. via gtk_dialog_run) during the
execution of an event handler that is not reentrant. If more data becomes
available on the corresponding file descriptor while this second, recursive
main loop is running, the event handler cannot be started because its source
has not set G_SOURCE_CAN_RECURSE. The file descriptor is still in the
poll set, however, and this has a very nasty effect: the next call to poll()
returns immediately, since the waiting data is still there. As poll() is called
in a loop, we now have a busy loop that ends only when the second main loop
finishes (i.e. the dialog gets closed).
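
The busy loop can be reproduced outside glib with a few lines of POSIX C.
The sketch below is an assumption-laden illustration: count_wakeups is a
hypothetical helper, and "never reading" stands in for a handler that may
not be re-entered.

```c
#include <poll.h>

/* Poll `fd` `iterations` times without ever reading from it and return
 * how many of those calls reported readable data. */
int count_wakeups(int fd, int iterations, int timeout_ms)
{
    int wakeups = 0;

    for (int i = 0; i < iterations; i++) {
        struct pollfd p = { .fd = fd, .events = POLLIN, .revents = 0 };

        if (poll(&p, 1, timeout_ms) > 0 && (p.revents & POLLIN))
            wakeups++;

        /* Deliberately no read(): the pending data is never consumed,
         * so the next poll() has no reason to block. */
    }
    return wakeups;
}
```

With one unread byte waiting on the descriptor, every call returns at once
regardless of the timeout, so the loop spins instead of sleeping.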

The fix for this is rather simple: exclude from the poll set all file
descriptors of monitored sources whose event handler is currently executing
and may not be called recursively. The included patch for
glib-2.2.1 is not very efficient but works. We think that a more efficient
solution is possible by modifying some data structures of glib, but we
leave this part to the glib programmers.
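
The idea of the fix, reduced to its core, might look like the following.
This is not the attached patch; struct source, its can_recurse and
in_dispatch fields, and build_pollset are hypothetical names used only to
illustrate filtering the poll set.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical, simplified source with a single file descriptor. */
struct source {
    int  fd;
    bool can_recurse;   /* models G_SOURCE_CAN_RECURSE */
    bool in_dispatch;   /* its handler is currently executing */
};

/* Fill `out` (room for at least n entries) with the descriptors that are
 * safe to poll and return how many were added. Sources whose handler is
 * running and may not recurse are left out. */
int build_pollset(const struct source *sources, int n, struct pollfd *out)
{
    int count = 0;

    for (int i = 0; i < n; i++) {
        if (sources[i].in_dispatch && !sources[i].can_recurse)
            continue;   /* handler cannot run now, so do not wait on it */

        out[count].fd = sources[i].fd;
        out[count].events = POLLIN;
        out[count].revents = 0;
        count++;
    }
    return count;
}
```

A nested loop that polls only the filtered set sleeps until one of the
remaining descriptors becomes ready, even if data is pending on the
excluded one.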


Christian & Ron
Comment 1 Christian Krause 2003-05-04 17:45:21 UTC
Created attachment 16254
small hack to solve the issue
Comment 2 Owen Taylor 2003-06-02 19:25:32 UTC
Created attachment 17067
Attempt at a fix
Comment 3 Owen Taylor 2003-06-02 19:31:36 UTC
Could you test the patch I've attached; it takes a somewhat
different approach.
 
 - Your patch: for each file descriptor, search for it in the
   blocked sources before adding it to poll()

 - Your suggestion: keep a reverse mapping from file descriptor
   to source.

 - My patch: when blocking a source, remove its file descriptors
   from the global file descriptor list; add them back afterwards.

Your suggestion would be more time efficient, but would require
considerably more complexity in the list of poll descriptors,
since we can't add a field to GPollFD, so I think I like this
approach better.

(The patch also contains a fix for bug 114274 which I noticed
when writing the patch) 
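
A rough sketch of the third approach, removing a blocked source's
descriptor from the global list for the duration of the dispatch, could
look like this in plain POSIX C. It is not the code in glib/gmain.c;
struct context, struct source, context_add_fd, context_remove_fd, dispatch
and poll_nested are hypothetical names, and each source has just one
descriptor here.

```c
#include <poll.h>
#include <stdbool.h>

#define MAX_FDS 16

/* Hypothetical stand-in for the main context's global descriptor list. */
struct context {
    struct pollfd fds[MAX_FDS];
    int nfds;
};

struct source {
    int  fd;
    bool can_recurse;   /* models G_SOURCE_CAN_RECURSE */
};

bool context_add_fd(struct context *ctx, int fd)
{
    if (ctx->nfds >= MAX_FDS)
        return false;
    ctx->fds[ctx->nfds].fd = fd;
    ctx->fds[ctx->nfds].events = POLLIN;
    ctx->fds[ctx->nfds].revents = 0;
    ctx->nfds++;
    return true;
}

void context_remove_fd(struct context *ctx, int fd)
{
    for (int i = 0; i < ctx->nfds; i++) {
        if (ctx->fds[i].fd == fd) {
            /* Order does not matter: move the last entry into the gap. */
            ctx->fds[i] = ctx->fds[ctx->nfds - 1];
            ctx->nfds--;
            return;
        }
    }
}

/* Run `handler` for `src`. If the source may not recurse, its descriptor
 * is taken out of the list while the handler runs and put back after. */
int dispatch(struct context *ctx, struct source *src,
             int (*handler)(struct context *))
{
    bool blocked = !src->can_recurse;
    int result;

    if (blocked)
        context_remove_fd(ctx, src->fd);
    result = handler(ctx);
    if (blocked)
        context_add_fd(ctx, src->fd);
    return result;
}

/* Example handler standing in for one that runs a nested main loop:
 * it does one non-blocking poll() over whatever the list contains. */
int poll_nested(struct context *ctx)
{
    return poll(ctx->fds, (nfds_t)ctx->nfds, 0);
}
```

The cost is paid once per dispatch rather than once per descriptor on
every poll(), and the entries in the list need no extra field.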
Comment 4 Owen Taylor 2003-06-03 23:23:31 UTC
Created attachment 17116
New version of patch with a bug-fix
Comment 5 Owen Taylor 2003-06-06 03:55:59 UTC
Thu Jun  5 23:40:31 2003  Owen Taylor  <otaylor@redhat.com>
 
        * glib/gmain.c: When dispatching a source that is
        !CAN_RECURSE, temporarily remove any file descriptors
        that that source has registered from the main loop, to keep
        recursive main loops from busy-waiting if input
        becomes available on one of those file descriptors.
        (#112222, Christian Krause)

[ and Ron who? ]
Comment 6 Christian Persch 2004-03-19 16:28:44 UTC
This bug is back (see Epiphany bug 137617).

It looks like the checkin from bug 50296 (gmain.c rev 1.101 -> 1.102)
effectively backed out the patch from this bug.
http://bugzilla.gnome.org/showattachment.cgi?attach_id=18402 from
50296 didn't revert this, the next patch
http://bugzilla.gnome.org/showattachment.cgi?attach_id=23330 does -- I
don't know if this was intentional or not.
Comment 7 Owen Taylor 2004-03-19 20:26:12 UTC
Thanks for catching it - completely not intentional

Fri Mar 19 15:21:09 2004  Owen Taylor  <otaylor@redhat.com>

        * glib/gmain.c: Fix the accidental revert of the
        fixes from #112222 that happened when the GChildWatch
        code was added. (Caught by Christian Persch)
Comment 8 Crispin Flowerday (not receiving bugmail) 2004-07-01 15:49:38 UTC
*** Bug 145225 has been marked as a duplicate of this bug. ***