Bug 661201 - Looping in poll() in every application using GPollFDs.
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: gthread
Version: unspecified
Hardware: Other OpenBSD
Importance: Normal major
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Depends on:
Blocks:
Reported: 2011-10-07 16:14 UTC by Robert Nagy
Modified: 2018-05-24 13:26 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Robert Nagy 2011-10-07 16:14:14 UTC
Hi

We have been suffering from a problem on OpenBSD where glib causes several applications to hang if they are using GPollFDs.
One of these applications is clutter.

(gdb) bt
  • #0 poll
    from /usr/lib/libc.so.60.1
  • #1 poll
    at /usr/src/lib/libpthread/uthread/uthread_poll.c line 80
  • #2 g_main_context_check
    from /home/pobj/clutter-1.8.0/clutter-1.8.0/tests/micro-bench/.libs/libglib-2.0.so.2992.0
  • #3 g_main_loop_run
    from /home/pobj/clutter-1.8.0/clutter-1.8.0/tests/micro-bench/.libs/libglib-2.0.so.2992.0
  • #4 clutter_main
    at ./clutter-main.c line 675
  • #5 main
    at test-text.c line 116

Attaching to the process and taking a backtrace after SIGINT shows the above output.

Tracing the process reveals that the process is looping in poll():

 31557 test-text CALL  poll(0x20dc3c9d0,0x2,0)
 31557 test-text RET   poll 0
 31557 test-text CALL  read(0x5,0x20286202c,0x1000)
 31557 test-text RET   read -1 errno 35 Resource temporarily unavailable
 31557 test-text CALL  poll(0x20dc3c9d0,0x2,0)
 31557 test-text RET   poll 0
 31557 test-text CALL  read(0x5,0x20286202c,0x1000)
 31557 test-text RET   read -1 errno 35 Resource temporarily unavailable

read() returns EAGAIN all the time and the application's UI is totally frozen.
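
The pattern in the trace above (a zero-timeout poll() that reports nothing ready, followed by a non-blocking read() that fails with EAGAIN) can be reproduced in isolation. What follows is a minimal POSIX sketch, not code from glib or clutter: the helper names `probe_idle_fd` and `make_idle_pipe` are made up for illustration, and a pipe stands in for descriptor 5 from the trace.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: poll a descriptor with a zero timeout, then try to
 * read from it regardless of what poll() said.
 * Stores poll()'s return value in *poll_ret and returns the errno left by
 * read(), or 0 if read() did not fail. */
int probe_idle_fd(int fd, int *poll_ret)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    char buf[4096];

    *poll_ret = poll(&pfd, 1, 0);   /* timeout 0: return immediately */

    errno = 0;
    if (read(fd, buf, sizeof buf) < 0)
        return errno;
    return 0;
}

/* Hypothetical helper: create a pipe whose read end is non-blocking and has
 * no data pending. Returns 0 on success, -1 on failure. */
int make_idle_pipe(int fds[2])
{
    if (pipe(fds) < 0)
        return -1;

    int flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return 0;
}
```

Calling `probe_idle_fd()` on the read end of such a pipe should give a poll() result of 0 and a read() error of EAGAIN, which matches each iteration in the trace. By itself that result is expected for an idle descriptor; the open question in this report is why the caller keeps repeating the cycle without ever blocking.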


Running the process with G_MAIN_POLL_DEBUG shows the following:

created context=0x20ffbc400
default context=0x20ffbc400
polling context=0x20ffbc400 n=2 timeout=0
g_main_poll(2) timeout: 0 - elapsed 0.0000710000 seconds [5 :i]
polling context=0x20ffbc400 n=2 timeout=0
g_main_poll(2) timeout: 0 - elapsed 0.0000030000 seconds
polling context=0x20ffbc400 n=2 timeout=0
g_main_poll(2) timeout: 0 - elapsed 0.0000030000 seconds
polling context=0x20ffbc400 n=2 timeout=0
g_main_poll(2) timeout: 0 - elapsed 0.0000040000 seconds

And so on ... until the process gets killed.
Comment 1 Robert Nagy 2011-10-07 16:28:37 UTC
Since this might be a pthread issue, I've been running the regression tests in gio to see if everything works, but I've also run into issues there:

#1
/tls-interaction/ask-password/invoke-without-loop/async-implementation-success: 
GThread-ERROR **: file gthread-posix.c: line 175 (g_mutex_free_posix_impl): error 'Device busy' during 'pthread_mutex_destroy ((pthread_mutex_t *) mutex)'

#2
/gdbus/method-calls-in-thread:                                       AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA**
ERROR:gdbus-threading.c:445:test_method_calls_on_proxy: assertion failed (elapsed_msec < 6000): (6010 < 6000)

** WARNING **: unknown command from parent ''
FAIL

And so on...a lot of tests are failing for different reasons.
Comment 2 Robert Nagy 2011-10-07 16:44:11 UTC
Oh, and by the way, I would happily give access to an OpenBSD machine to any glib developer who has the time and interest to look at these issues, so we can figure out what's going on and get glib working 100% on OpenBSD.
Comment 3 Antoine Jacoutot 2011-10-07 17:17:02 UTC
I'm getting the exact same trace (G_MAIN_POLL_DEBUG enabled) when trying to unmount an FTP volume with nautilus (gvfs).
The console just gets filled with
    g_main_poll(2) timeout: 0 - elapsed XXX seconds
and nautilus process goes into a loop (100% cpu).
Comment 4 Dan Winship 2011-10-07 21:27:21 UTC
If it's polling with a 0 second timeout, then the problem isn't with the GPollFDs. The problem is that there's some *other* source that is claiming to already be ready, which then makes glib poll with a 0-second timeout. But then, apparently, this other source is not getting removed (or else is getting immediately re-triggered), and so glib keeps doing 0-second polls because it always thinks this source is ready.

You'd want to break/printf in g_main_dispatch() to see what source it is.
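
To illustrate the mechanism described in this comment, here is a toy model in plain C of how a main loop might derive its poll() timeout from its sources. This is not GLib code: `struct toy_source`, the `prepare_*` callbacks, and `toy_poll_timeout` are all hypothetical names, and returning early on the first ready source is a simplification. The point is only that a single source that always claims to be ready forces every poll to use a timeout of 0.

```c
#include <stddef.h>

/* Toy stand-in for an event source; NOT GLib's GSource. */
struct toy_source {
    /* Returns nonzero if the source is ready right now. Otherwise writes
     * the longest it is willing to wait to *timeout_ms (-1 = forever). */
    int (*prepare)(int *timeout_ms);
};

/* A source with nothing to do: happy to wait forever. */
int prepare_idle(int *timeout_ms)  { *timeout_ms = -1;  return 0; }

/* A timer-like source that wants to be woken in 250 ms. */
int prepare_timer(int *timeout_ms) { *timeout_ms = 250; return 0; }

/* A misbehaving source that claims to be ready on every iteration. */
int prepare_stuck(int *timeout_ms) { *timeout_ms = 0;   return 1; }

/* Compute the timeout to hand to poll(): 0 if any source is already
 * ready, else the smallest non-negative timeout requested, else -1. */
int toy_poll_timeout(const struct toy_source *sources, size_t n)
{
    int timeout = -1;

    for (size_t i = 0; i < n; i++) {
        int wanted = -1;
        if (sources[i].prepare(&wanted))
            return 0;               /* something is ready: do not block */
        if (wanted >= 0 && (timeout < 0 || wanted < timeout))
            timeout = wanted;
    }
    return timeout;
}
```

With only the idle and timer sources, the model yields a 250 ms timeout; adding the stuck source makes it yield 0 every time, which is the kind of busy loop seen in the G_MAIN_POLL_DEBUG output.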
Comment 5 Robert Nagy 2011-10-07 22:29:39 UTC
Dan,

I've set a breakpoint in g_main_dispatch() but it seems I never end up in that function.

The process is stuck in g_main_context_poll(), where the actual poll happens, because errno is set to EINTR:
if ((*poll_func) (fds, n_fds, timeout) < 0 && errno != EINTR)

This is another bt while running one of the tests. I've checked xcb_in and it seems that there is a poll there which gets EINTR all the time too.

(gdb) bt
  • #0 read
    from /usr/lib/libc.so.60.1
  • #1 read
    at /usr/src/lib/libpthread/uthread/uthread_read.c line 72
  • #2 _xcb_in_read
    at /home/xenocara/lib/libxcb/libxcb/../../../dist/libxcb/src/xcb_in.c line 666
  • #3 xcb_poll_for_event
    at /home/xenocara/lib/libxcb/libxcb/../../../dist/libxcb/src/xcb_in.c line 551
  • #4 poll_for_event
    from /home/ports/pobj/clutter-1.8.0/clutter-1.8.0/tests/micro-bench/.libs/libX11.so.15.0
  • #5 poll_for_response
    from /home/ports/pobj/clutter-1.8.0/clutter-1.8.0/tests/micro-bench/.libs/libX11.so.15.0
  • #6 _XEventsQueued
    from /home/ports/pobj/clutter-1.8.0/clutter-1.8.0/tests/micro-bench/.libs/libX11.so.15.0
  • #7 XPending
    from /home/ports/pobj/clutter-1.8.0/clutter-1.8.0/tests/micro-bench/.libs/libX11.so.15.0
  • #8 clutter_event_prepare
    at ./x11/clutter-event-x11.c line 366
  • #9 g_main_context_prepare
    at gmain.c line 2766
  • #10 g_main_context_iterate
    at gmain.c line 3073
  • #11 g_main_loop_run
    at gmain.c line 3301
  • #12 clutter_main
    at ./clutter-main.c line 675
  • #13 main
    at test-picking.c line 133
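
One way to tell whether poll() is really failing with EINTR or simply returning 0 is to record the two cases separately. The following is a hedged sketch in plain POSIX C, not a patch for glib: `enum poll_outcome` and `poll_once` are hypothetical names, and the function only mirrors the distinction made by the `errno != EINTR` check quoted in this comment.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical classification of a single poll() call. */
enum poll_outcome {
    OUTCOME_RETURNED,     /* poll() returned >= 0 (0 means nothing ready) */
    OUTCOME_INTERRUPTED,  /* poll() failed with EINTR */
    OUTCOME_FAILED        /* poll() failed with any other error */
};

/* Call poll() once and report which of the three cases occurred.
 * *n_ready receives the number of ready descriptors, or 0 on failure. */
enum poll_outcome poll_once(struct pollfd *fds, nfds_t n_fds,
                            int timeout_ms, int *n_ready)
{
    int ret = poll(fds, n_fds, timeout_ms);

    if (ret >= 0) {
        *n_ready = ret;
        return OUTCOME_RETURNED;
    }

    *n_ready = 0;
    return errno == EINTR ? OUTCOME_INTERRUPTED : OUTCOME_FAILED;
}
```

Counting how often each outcome occurs in a loop would show whether the EINTR path or the "returned 0" path dominates.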

Comment 6 Dan Winship 2011-10-08 15:48:24 UTC
(In reply to comment #5)
> The process is being stuck in g_main_context_poll() where the actual poll
> happens because errno is set to EINTR:
> if ((*poll_func) (fds, n_fds, timeout) < 0 && errno != EINTR)

hm. then that's different from the ktrace output in comment 0, which shows poll() returning 0, right?
Comment 7 Robert Nagy 2011-10-08 15:50:56 UTC
Yes it is, but I am pretty sure that this is still valid. ktrace might be wrong.
Comment 8 Dan Winship 2011-10-08 15:53:04 UTC
but the G_MAIN_POLL_DEBUG output also shows it returning 0. so, there are two bugs here (or at least, two separate manifestations of the same underlying bug)
Comment 9 GNOME Infrastructure Team 2018-05-24 13:26:38 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/462.