GNOME Bugzilla – Bug 567857
[udpsrc] loop on gst_poll_wait when POLLERR because of icmp
Last modified: 2009-02-24 20:10:01 UTC
I am using the same fd to send and recv rtp. This fd is used in gstudpsink and gstudpsrc. udpsrc tries to read from fd while udpsink sends RTP. When distant is unreachable, udpsink sendto will lead to a icmp packet sent back. Therefore, ppoll in gst_poll_wait called from udpsrc will return 1 with POLLERR set on rtp fd. udpsrc then loops on poll because FIONREAD returns 0 (icmp packet). When next RTP packet is sent (20ms later in ALAW there), sendto returns an error and the udpsink post a message on bus : Connection refused. Meanwhile (20ms) udpsrc looped about a hundred times on poll, consuming all cpu. The comment before retry in gstudpsrc.c is right : "We know someone will also do something with the socket so that we don't go into an infinite loop in the select()" Problem is when ? We can choose to stop the udpsrc when the message 'Connection refused' is received on bus. This means that all cpu will be consumed for 20 ms. Moreover, the icmp error may only be temporary. It may happen at the beginning of the process, just let the distant time to bind its ports (It should have done it before, right, but who knows ? : linphone for instance has such bug) So the problem is about icmp errors on RTP sockets. Are they considered to be fatal (therefore, why loop on udpsrc ?) or not and we have a performance issue. fd has POLLERR revent anytime the poll is done, this is why loop is so fast. But pollerr is set only when icmp is received because of a send. POLLERR should be flushed before loop on poll. This can be done using recvmsg : patch attached. side effect : udpsink does not see error anymore and does not return GST_FLOW_ERROR
Created attachment 126501 [details] [review] Flush POLLERR in udpsrc before retry poll
I can't reproduce. Are you allocating the socket yourself or using the one from udpsrc?
I am allocating the socket, passing the fd with the sockfd property of udpsrc. I use gstreamer to handle media in sip calls. The distant did not bind it's socket though provided RTP endpoints when sending its SIP INVITE. The distant was sipp. I'll try to build a simple application which is able to reproduce this issue. There is a problem becasuse both udpsrc and udpsink shares the same fd. So icmp packet caused by udpsink influence udpsrc behavior.
Here is server-PCMA.c to reproduce the issue. This happens because the socket is connected to remote. If not connected, everything is ok. GST_DEBUG=udpsrc:5 ./server-PCMA
Created attachment 126535 [details] test case
Why connect an udp socket ? Well remote is well identified and seen with lsof or netstat. Udp packets from wrong source are not delivered. You can gain performance by using send and recv instead of sendto and recvfrom. We plan to add property to use connected udp socket on udpsrc / udpsink. http://bugzilla.gnome.org/show_bug.cgi?id=563323
I see know and I'm asking myself the following questions: 1) should we not disable the POLLERR in the udpsrc? we don't really want to receive errors from the sender here. This requires us to add a method to the GstPoll to unset the POLLERR flag on the sockets. 2) we should likely read the errors on the udp socket in multiudpsink instead. We don't need to add GstPoll to multiudpsink for that, I think. We can simply read/clear the error queue when the last sendto returned an error. Would this work too?
You cannot disable pollerr ... man poll : The bits returned in revents can include any of those specified in events, or one of the values POLLERR, POLLHUP, or POLLNVAL. (These three bits are meaningless in the events field, and will be set in the revents field whenever the corresponding condition is true.) for point 2, the sendto is not on error the first time. It is on error because of ICMP return. so if errors are read from udpsink, you'll still have the gst_poll burst. maybe half less, but still.
Yes, I figured that out experimenting a bit. So it seems your patch is pretty much the only thing that can be done.
commit 969622b439cc7232e8889cd5c7a5579cba6b665f Author: Aurelien Grimaud <gstelzz at yahoo dot fr> Date: Mon Feb 23 20:49:37 2009 +0100 Read ICMP error messages instead of looping When we are dealing with connected sockets shared between a udpsrc and a udpsink we might receive ICMP connection refused error messages in udpsrc that will cause it to go into a bursty loop because the poll returns right away without a message to read. Instead of looping, read the error message from the error queue in udpsrc. Fixes #567857.
should we post a WARNING message on the bus when the source receives an error message? this way the application could at least get some idea about errors.
Well, it sure would be better for the application to be aware of ICMP problems instead of silently hiding them. Application will _have to_ deal with it, since there will have a bus post each 20ms (for pcma). Can one bus post each 20ms lead to major performances problem ? If so, a property "warn for icmp" could be added. This could be a way to detect runaway calls from the caller and take signaling decisions (Hang Up) based on media failures (icmp errors) (We use gstreamer for telephony applications)