GNOME Bugzilla – Bug 645746
[gstpoll] Regression causes 100% cpu usage in multifdsink
Last modified: 2011-04-07 11:47:13 UTC
When I see the following messages in the GStreamer log:

2:46:50.535363829 7547 0x155d2d0 WARN GST_POLL gstpoll.c:1029:gst_poll_fd_has_closed: 0x1399540: couldn't find fd !
2:46:50.535392395 7547 0x155d2d0 WARN GST_POLL gstpoll.c:1078:gst_poll_fd_has_error: 0x1399540: couldn't find fd !
2:46:50.535403779 7547 0x155d2d0 WARN GST_POLL gstpoll.c:1106:gst_poll_fd_can_read_unlocked: 0x1399540: couldn't find fd !
2:46:50.535414046 7547 0x155d2d0 WARN GST_POLL gstpoll.c:1178:gst_poll_fd_can_write: 0x1399540: couldn't find fd !

I get 100% CPU usage; probably a client disconnect causes this behaviour. Setting the pipeline to the NULL state makes the CPU usage return to normal. This seems to be an infinite loop. My pipeline is something like this:

rtspsrc .. ! *pay ! matroskamux ! multifdsink
I can confirm this issue. I have also seen 100% CPU usage in multifdsink because of GstPoll waking up constantly, even without clients connected, being woken up by the control socket. I'm not sure whether the two are related or not. It was introduced by this commit in gstpoll.c: http://cgit.freedesktop.org/gstreamer/gstreamer/commit/?id=22fa4470e2497a1b766322497e285ca199709fcc
Something very weird happens. I can only reproduce it if multifdsink is after a muxer and if audio and video are muxed.

Usage around 2%:
gst-launch audiotestsrc ! vorbisenc ! queue ! matroskamux ! tcpserversink port=8888

100% usage:
gst-launch audiotestsrc ! vorbisenc ! queue ! matroskamux name=mux ! tcpserversink port=8888 videotestsrc ! vp8enc ! queue ! mux.
It looks like this is only triggered when lots of small buffers are pushed into the sink.
In my case it's reproducible with video only, after several client-added/client-removed sequences.
In my case it's reproducible without any client, just by running the pipeline, and I don't get any of the warnings you get.
Forget the above example, the CPU usage there is due to the encoder :P But anyway, I can confirm the 100% CPU usage of multifdsink with Flumotion's streamer, and that it was introduced by the commit in comment #1.
Created attachment 185080 [details] [review]
retry reading the control socket until we succeed

After several hours of debugging this issue I realized that the process started eating 100% CPU when RELEASE_EVENT(set) returned FALSE in release_all_wakeups(). When this happens, we have already set set->control_pending = 0, but we haven't successfully read the control socket. In the next _wait(), the control socket will wake us up again, but this time set->control_pending is already 0, so release_all_wakeups() returns at the first "if", causing an infinite loop. The proposed patch seems to fix at least my issue.
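Just to illustrate the idea of the patch (a simplified sketch with assumed names, not the actual gstpoll.c code): keep retrying the read of the control socket until the wake-up byte is really consumed, so the socket can never stay readable while control_pending has already been reset to 0.

#include <unistd.h>
#include <errno.h>

/* Illustrative sketch only, NOT the real gstpoll.c: drain the single
 * wake-up byte from the control socket, retrying on transient failures,
 * so the byte cannot be left behind once the pending counter is 0. */
static int
release_wakeup_sketch (int control_read_fd)
{
  char c;
  ssize_t res;

  do {
    res = read (control_read_fd, &c, 1);
  } while (res < 0 && (errno == EINTR || errno == EAGAIN));

  /* Only report success once the byte was actually consumed. */
  return res == 1;
}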
I think what happens is that raise_wakeup() managed to increment the control_pending variable before it could write a byte, and that release_wakeup() then decrements it but fails to read. I can't think of a better way to fix this than the patch you propose.
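To make the suspected interleaving concrete, here is a hedged sketch (simplified, assumed structure and names; not the real GstPoll implementation) of how the counter and the socket can momentarily disagree before the patch:

#include <unistd.h>

/* Sketch only: the pending counter and the control socket are updated in
 * two separate steps, so another thread can observe them out of sync. */
typedef struct {
  int control_pending;   /* atomic counter in the real code */
  int control_write_fd;  /* write end of the control socketpair */
  int control_read_fd;   /* read end, watched by the poll call */
} PollSketch;

static void
raise_wakeup_sketch (PollSketch *set)
{
  /* Step 1: the counter goes from 0 to 1 ... */
  if (__sync_fetch_and_add (&set->control_pending, 1) == 0) {
    char c = 'W';
    /* Step 2: ... but the byte may not be written yet when the other
     * thread already runs release_wakeup(). */
    write (set->control_write_fd, &c, 1);
  }
}

static void
release_wakeup_racy (PollSketch *set)
{
  if (__sync_fetch_and_sub (&set->control_pending, 1) == 1) {
    char c;
    /* If this read fails (the byte was not written yet), the counter is
     * already back to 0: when the byte finally arrives, nothing reads it,
     * the poll keeps waking up, and we busy-loop.  Retrying the read until
     * it succeeds, as the attached patch does, closes that window. */
    read (set->control_read_fd, &c, 1);
  }
}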
commit 7c6d9c2725a9632fbd96fd41b60ed814649d3bec
Author: Andoni Morales Alastruey <ylatuya@gmail.com>
Date:   Mon Apr 4 03:33:46 2011 +0200

    gstpoll: retry reading the control socket to release properly all wakeups

    If set->control_pending is set to 0 but we did not succeed in reading the
    control socket, future calls to gst_poll_wait() will be awakened by the
    control socket, which will not be released properly because
    set->control_pending is already 0, causing an infinite loop.
*** Bug 645877 has been marked as a duplicate of this bug. ***