Bug 781218 – rtsp-stream: issue when getting EADDRINUSE on bind and only one address is in the pool (alloc_ports_one_family)

Status:	RESOLVED INCOMPLETE

Product:	GStreamer
Classification:	Platform
Component:	gst-rtsp-server
Version:	1.11.90
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	git master
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

Reported:	2017-04-12 13:19 UTC by Jonathan Karlsson
Modified:	2018-05-06 12:53 UTC

Attachments
Patch to fix issue when there is only one address in the pool and bind fails. (2.36 KB, patch) 2017-04-12 13:19 UTC, Jonathan Karlsson
Patch to fix issue when there is only one address in the pool and bind fails. (3.31 KB, patch) 2017-04-12 15:48 UTC, Jonathan Karlsson

Description Jonathan Karlsson 2017-04-12 13:19:59 UTC

Created attachment 349728
Patch to fix issue when there is only one address in the pool and bind fails.

There is an issue when bind fails in alloc_ports_one_family and the address pool contains only one address.
The code jumps back to the 'again' label, and on that second pass gst_rtsp_address_pool_acquire_address returns NULL, so the function returns without having allocated any ports. With this patch, if we get EADDRINUSE, we simply try to bind to another port instead of going all the way back to the 'again' label. We use the existing 'count' variable to limit this to a maximum of 20 tries.
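
The retry idea described above can be sketched outside of gst-rtsp-server with plain POSIX sockets. This is only an illustration, not the attached patch: the function name bind_udp_with_retry, the use of INADDR_ANY, and the step of 2 between candidate ports (keeping the next port free for RTCP) are assumptions made for this example. The limits of 65534 and 20 tries are taken from the patch excerpt and the description.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_RTP_PORT   65534  /* highest RTP port allowed in the patch excerpt */
#define MAX_BIND_TRIES 20     /* retry limit mentioned in the description */

/* Hypothetical helper (not part of gst-rtsp-server): bind a UDP socket to
 * INADDR_ANY, starting at start_port. On EADDRINUSE, close the socket and
 * try the port two above (assumed RTP/RTCP pairing), up to MAX_BIND_TRIES
 * attempts. Returns the bound fd and stores the port in *bound_port, or
 * returns -1 on any other error, when the port leaves the allowed range,
 * or when the tries are exhausted. */
int bind_udp_with_retry(unsigned start_port, unsigned *bound_port)
{
    unsigned port = start_port;
    int count = 0;

    while (port <= MAX_RTP_PORT && count < MAX_BIND_TRIES) {
        struct sockaddr_in sa;
        int fd, err;

        fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0)
            return -1;

        memset(&sa, 0, sizeof sa);
        sa.sin_family = AF_INET;
        sa.sin_addr.s_addr = htonl(INADDR_ANY);
        sa.sin_port = htons((unsigned short)port);

        if (bind(fd, (struct sockaddr *)&sa, sizeof sa) == 0) {
            *bound_port = port;
            return fd;
        }

        err = errno;           /* save errno before close() can change it */
        close(fd);
        if (err != EADDRINUSE)
            return -1;         /* only "address in use" is worth retrying */

        port += 2;             /* try the next candidate port */
        count++;
    }
    return -1;
}
```

The point of the sketch is that the retry stays local to the bind step: the address itself is never released and re-acquired, so a pool holding a single address is not a problem.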

Comment 1 Nicolas Dufresne (ndufresne) 2017-04-12 13:38:15 UTC

Review of attachment 349728:

::: gst/rtsp-server/rtsp-stream.c
@@ +1314,3 @@
+      if (tmp_rtp > 65534 || ++count > 20) {
+        /* port outside of range or failed 20 times */
+        goto port_error;

This code block is duplicated; maybe add a helper?

@@ +1318,3 @@
+      /* rtp_socket is already bound to a port, close and allocate another */
+      g_clear_object (&rtp_socket);
+      if (rtp_socket == NULL) {

This check is a no-op, since g_clear_object will always set rtp_socket to NULL.

Comment 2 Jonathan Karlsson 2017-04-12 15:48:00 UTC

Created attachment 349741
Patch to fix issue when there is only one address in the pool and bind fails.

Updated the patch with a helper function 'next_port'.
Also moved the part where the rtp_socket is being created into the 'bind_again' label.
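
For readers without access to the attachment, a helper of this kind might look roughly like the following. Only the name 'next_port' and the two limits (65534 and 20) come from this report; the signature, the step of 2 between ports, and the decision to leave the port untouched on failure are assumptions for illustration and may differ from the actual patch.

```c
#include <stdbool.h>

#define MAX_RTP_PORT 65534  /* upper bound from the reviewed patch excerpt */
#define MAX_TRIES    20     /* retry limit from the reviewed patch excerpt */

/* Hypothetical signature: compute the next candidate port and count the
 * attempt. Returns false, leaving *port unchanged, when the candidate
 * would exceed MAX_RTP_PORT or when more than MAX_TRIES attempts have
 * been made. As in the original condition, the counter is only
 * incremented if the range check passes (short-circuit evaluation). */
bool next_port(unsigned *port, int *count)
{
    unsigned candidate = *port + 2;  /* assumed step: skip the RTCP port */

    if (candidate > MAX_RTP_PORT || ++(*count) > MAX_TRIES)
        return false;

    *port = candidate;
    return true;
}
```

A caller would replace each duplicated range-and-count check with something like "if (!next_port (&tmp_rtp, &count)) goto port_error;", which addresses the duplication noted in the review.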

Comment 3 Jonathan Karlsson 2017-04-18 12:17:13 UTC

Let's put this on hold for now. We are investigating whether our test case takes a correct approach here. It uses UDP sockets and a network share, and it fails only very rarely.
This patch might just be a workaround for some other underlying problem.

Comment 4 Tim-Philipp Müller 2017-04-18 12:22:49 UTC

I don't know if this is what you're running into, but for what it's worth, I've seen issues with 'make check' where, depending on the timing of the test runs, some tests would try to use the same ports as other tests (and would always work fine when run individually, of course). I fixed a bunch of those a while back, but I'm sure there are more such cases (IIRC).

Comment 5 Jonathan Karlsson 2017-04-18 12:44:38 UTC

I don't know if that is related to this. We have a Jenkins host, and bind() is run on the target. Maybe the test that runs just before happens to use the same port, and the port is then somehow unavailable. But that should not be the case for UDP anyway. Investigation continues.

Comment 6 Sebastian Dröge (slomo) 2017-04-19 09:27:46 UTC

I take it that this is not a regression then? Please let us know what your investigation results in, and ideally provide a testcase for the behaviour you see. Maybe that also helps to understand if there really is a bug or if things are working as intended and whatever you're doing should be solved differently.

Comment 7 Vivia Nikolaidou 2018-05-06 12:53:43 UTC

Closing this bug report as no further information has been provided. Please feel free to reopen this bug report if you can provide the information that was asked for in a previous comment.
Thanks!