After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 736365 - Loosing context after dhcp renew
Loosing context after dhcp renew
Status: RESOLVED FIXED
Product: GUPnP
Classification: Other
Component: gupnp
0.20.x
Other Linux
: Normal normal
: ---
Assigned To: GUPnP Maintainers
GUPnP Maintainers
Depends on:
Blocks:
 
 
Reported: 2014-09-09 22:31 UTC by Andrew
Modified: 2016-01-01 05:17 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
server: Avoid circular reference (1.99 KB, patch)
2015-01-31 00:56 UTC, Jens Georg
none Details | Review
This patch addresses my issue with the Context disappearing (6.26 KB, patch)
2015-10-12 16:12 UTC, Nick Glencross
needs-work Details | Review
Small perl script that displays contexts being created/removed (1.20 KB, text/plain)
2015-10-12 17:07 UTC, Nick Glencross
  Details
Small perl script that displays contexts being created/removed (2.28 KB, text/plain)
2015-10-12 17:23 UTC, Nick Glencross
  Details
Only create context when we don't have one (1.22 KB, patch)
2015-11-09 08:04 UTC, Andrew
none Details | Review
nl-dump utility (3.56 KB, text/x-csrc)
2015-11-21 17:25 UTC, Jens Georg
  Details
linux-cm: No need to look up device name (5.72 KB, patch)
2015-12-08 20:33 UTC, Jens Georg
none Details | Review
linux-cm: Add netlink debugging (10.54 KB, patch)
2015-12-08 20:33 UTC, Jens Georg
none Details | Review
linux-cm: Fix various issues with DHCP (9.38 KB, patch)
2015-12-08 20:33 UTC, Jens Georg
none Details | Review

Description Andrew 2014-09-09 22:31:11 UTC
While playing videos in rygel, I'm getting random dropped connections. If rygel is setup with a random port, the server is able to recover and can be used again albeit with the stream interruption. If the server is setup with a static port, rygel needs to be restarted before the server can be used again.

$ G_MESSAGES_DEBUG=all rygel -g 5
...
(rygel:4655): Rygel-DEBUG: rygel-main.vala:182: Network 192.168.1.0 (eth0) context now unavailable. IP: 192.168.1.65
(rygel:4655): GLib-GObject-CRITICAL **: object SoupServer 0x1a81200 finalized while still in-construction
(rygel:4655): GLib-GObject-CRITICAL **: Custom constructor for class SoupServer returned NULL (which is invalid). Please use GInitable instead. 
** (rygel:4655): WARNING **: Error creating GUPnP context: Could not create HTTP server on port 31982
(rygel:4655): Rygel-DEBUG: rygel-http-server.vala:136: HTTP GET request for URI 'http://192.168.1.65:31982/MediaExport/i/MjdmN2RmNjVkZDk5MTNjNzQzYmYyOGQ3MmNjNjRhZTY=.mp4' handled.
(rygel:4655): Rygel-DEBUG: rygel-http-server.vala:167: HTTP client aborted GET request for URI 'http://192.168.1.65:31982/MediaExport/i/MjdmN2RmNjVkZDk5MTNjNzQzYmYyOGQ3MmNjNjRhZTY=.mp4'.

I've managed to track the problem down to something relating to renewing my dhcp ipv4 address. After some investigation, I'm seeing a RTM_NEWADDR on the socket when a dhcp renew occurs. GUPnP then proceeds to kill and recreate the context (and fails if using a static port). I don't know if this process was suppose to be transparent with regard to the stream, or if socket message is not suppose to bubble up on dhcp renew.

Using a manually configured IP on the interface appears to fix the problem, but isn't really a solution.

Rygel: 0.22.3
GUPnP: 0.20.12

$ uname -a
Linux debian 3.14-2-amd64 #1 SMP Debian 3.14.15-2 (2014-08-09) x86_64 GNU/Linux
Comment 1 Andrew 2014-09-11 07:53:36 UTC
I've been able to reproduce this on another machine in the same network with a recent net installation of a Debian Jessie.
Comment 2 Jens Georg 2014-09-13 19:37:53 UTC
This is odd, I've never seen this on my sid server.
Comment 3 Andrew 2014-09-14 07:42:12 UTC
This is a very strange bug indeed. I've been seeing it for a few months now and thought I tracked it down, but may not have the root cause (a byproduct maybe?). I use a static port for rygel so there's always a manual step to fix it when it occurs. Its difficult to see on the client with the vast range of upnp implementations out there, so I'm basing the problem off the server logs as opposed to the stream interruption. The client that I usually use (a samsung tv) just happens to get stream interruptions.

I've been able to reproduce this when connecting same machine that showed the problem to a different router in an isolated network. The problem happens almost immediately because the dhcp renewal interval is quite short (30s). It must have something to do with rygel, not the network.

The problem must be with my system is setup if you cannot reproduce it, but I hesitate to conclude that since it happens on a new installation of debian as well (configuration is all default except for a static port). Rygel and related packages appear to have the same versions in testing and sid though, so I'm at a loss.
Comment 4 Jens Georg 2014-09-14 08:17:53 UTC
Same here, static port, although its a static lease. I'll have to check in more detail. maybe the DELADDR message is missing in your case...

As for the samsung: You might also be hitting

https://bugzilla.gnome.org/show_bug.cgi?id=726840
Comment 5 Andrew 2014-09-17 07:02:14 UTC
I don't think I'm hitting the same issue, at least not with the evidence below.

I've taken some packet capture logs at the server and am seeing RST packets (from the tv) when a disconnection happens. Now when it tries to reconnect to the server, it fails because there's no server listening anymore (due to the context issue described above with a static port). A dynamic port would still cause an interruption since the port is likely to change.

The logs show 2 RST packets (both from the tv). The first one was able to recover successfully without interruption, the second failed as above. There was over an hour of successful packets between the two. The only FIN packets were during menu navigation.

There's only a switch between the server and tv so there shouldn't be any interference from other devices.
Comment 6 Andrew 2014-09-21 07:44:49 UTC
The same log message occurs when changing the status of unrelated interfaces both with and without dynamic interface configuration. It appears to have the same result as above. In my case, I was playing with a vpn interface.

More testing shows that server is not actually down. Using telnet to connect to the server still works. Using one of the http urls in the server logs (such as the one in the description) also works as expected. But the server is no longer advertised. Checked with a few different upnp clients: the samsung tv still sees the server for several minutes after this (and can connect), but it eventually disappears (maybe 30m). Grilo in totem doesn't disappear, but can never be found again when restarting the app. Some android upnp app acts as totem/grilo. Packet logs from the server confirm this---no more multicast packets. This would be why I thought the server died. I have not checked to see if the gupnp context is still alive, but I suspect not. I only see the following rygel log entry when using a dynamic port:

(rygel:2689): Rygel-DEBUG : rygel-main.vala:149:Ne network 192.168.1.0 (eth0) context available. IP: 192.168.1.65

When using a dynamic port, the server remains advertised, albeit attached to a different port when this occurs. Confirmed with packet logs and upnp clients.

I'll need to spend a bit more time to figure out why the samsung tv disconnects a while after this happens though. It only disconnects when this happens.
Comment 7 Jens Georg 2014-09-22 07:31:05 UTC
There's actually two issues here:

1) The DCHP renewal should probably not lead to a context destruction
2) There is a reference count issue on the GUPnPContext that causes it not to be destroyed properly, so when the new one is created, the port is already bound, leading to the warnings/criticals you see.

I still have some trouble reproducing 1). Which dhcp server and client are you using?

I can reproduce 2) quite fine, though.
Comment 8 Andrew 2014-09-23 07:31:52 UTC
dhcp server: ISC 4.2.6
dhcp client: default installed with gnome on Debian, Network Manager 0.9.10.0 I believe

I don't believe it depends on the dhcp server as I've reproduced it with the manufacturer firmware of a home router (ar680w). Both computers I tested the server were Debian with gnome and network manager. Dhcp renewals happened as part of the daemon, no user interaction.
Comment 9 Jens Georg 2014-09-28 11:58:40 UTC
Provided
Comment 10 Jens Georg 2014-09-28 11:59:25 UTC
for your special situation, recompiling gupnp with netork-manager context manager might  be a feasible work-around.
Comment 11 Jens Georg 2015-01-04 12:10:59 UTC
2) was fixed in bug 742075
Comment 12 Andrew 2015-01-05 23:49:02 UTC
Tested with 0.20.13 (untagged revision 6b3ddff) and (1) still appears to be a problem on this system. Attaching trace for completeness in case it helps others who see this problem. Comment 10 may provide a work-around, but I haven't tried that yet. Tested by changing dhcp-lease-time in dhclient.conf to a low value and waiting for the renew. No clients connected to gupnp at the time.

(2) appears to be fixed from the trace though.

$ G_MESSAGES_DEBUG=all rygel -g 5
...
** (rygel:12972): WARNING **: GUPnPContext: Unable to listen on 192.168.1.65:31982 Could not listen on address 192.168.1.65, port 31982
(rygel:12972): libsoup-CRITICAL **: soup_uri_copy: assertion 'uri != NULL' failed
(rygel:12972): libsoup-CRITICAL **: soup_uri_set_path: assertion 'uri != NULL' failed
(rygel:12972): libsoup-CRITICAL **: soup_uri_to_string_internal: assertion 'uri != NULL' failed
(rygel:12972): libsoup-CRITICAL **: soup_uri_free: assertion 'uri != NULL' failed
(rygel:12972): libsoup-WARNING **: (soup-uri.c:604):soup_uri_copy: runtime check failed: (SOUP_URI_IS_VALID (uri))
(rygel:12972): libsoup-WARNING **: (soup-uri.c:604):soup_uri_copy: runtime check failed: (SOUP_URI_IS_VALID (uri))
(rygel:12972): libsoup-WARNING **: (soup-uri.c:604):soup_uri_copy: runtime check failed: (SOUP_URI_IS_VALID (uri))
** (rygel:12972): CRITICAL **: gupnp_resource_factory_create_service: assertion 'location != NULL' failed
** (rygel:12972): CRITICAL **: gupnp_resource_factory_create_service: assertion 'location != NULL' failed
** (rygel:12972): CRITICAL **: gupnp_resource_factory_create_service: assertion 'location != NULL' failed
(rygel:12972): Rygel-DEBUG: rygel-main.vala:182: Network 192.168.1.0 (eth0) context now unavailable. IP: 192.168.1.65
(rygel:12972): Rygel-DEBUG: rygel-main.vala:149: New network 192.168.1.0 (eth0) context available. IP: 192.168.1.65
(rygel:12972): Rygel-DEBUG: rygel-root-device-factory.vala:180: No icon provided by plugin 'MediaExport'. Using Rygel logo.
(rygel:12972): Rygel-DEBUG: rygel-root-device-factory.vala:180: No icon provided by plugin 'Playbin'. Using Rygel logo.
(rygel:12972): libsoup-CRITICAL **: soup_uri_copy: assertion 'uri != NULL' failed
(rygel:12972): libsoup-CRITICAL **: soup_uri_set_path: assertion 'uri != NULL' failed
(rygel:12972): libsoup-CRITICAL **: soup_uri_to_string_internal: assertion 'uri != NULL' failed
(rygel:12972): libsoup-CRITICAL **: soup_uri_free: assertion 'uri != NULL' failed
(rygel:12972): libsoup-WARNING **: (soup-uri.c:604):soup_uri_copy: runtime check failed: (SOUP_URI_IS_VALID (uri))
(rygel:12972): libsoup-WARNING **: (soup-uri.c:604):soup_uri_copy: runtime check failed: (SOUP_URI_IS_VALID (uri))
(rygel:12972): libsoup-WARNING **: (soup-uri.c:604):soup_uri_copy: runtime check failed: (SOUP_URI_IS_VALID (uri))
** (rygel:12972): CRITICAL **: gupnp_resource_factory_create_service: assertion 'location != NULL' failed
** (rygel:12972): CRITICAL **: gupnp_resource_factory_create_service: assertion 'location != NULL' failed
** (rygel:12972): CRITICAL **: gupnp_resource_factory_create_service: assertion 'location != NULL' failed
Comment 13 Jens Georg 2015-01-29 19:37:29 UTC
Ok, I've seen this now as well
Comment 14 Jens Georg 2015-01-30 20:20:24 UTC
** (lt-rygel:5786): WARNING **: GUPnPContext: Unable to listen on 192.168.34.116:50273 Could not listen on address 192.168.34.116, port 50273
(lt-rygel:5786): Rygel-DEBUG: rygel-main.vala:155: New network NotTheWifiYoureLookingFor (wlan0) context available. IP: 192.168.34.116
(lt-rygel:5786): RygelCore-DEBUG: rygel-root-device-factory.vala:180: No icon provided by plugin 'MediaExport'. Using Rygel logo.

Breakpoint 1, g_logv (log_domain=0x7ffff5e41bff "libsoup", log_level=G_LOG_LEVEL_CRITICAL, format=0x7ffff709223d "%s: assertion '%s' failed", args=args@entry=0x7fffffffcbd0)
    at gmessages.c:927
927	{
(gdb) bt
  • #0 g_logv
    at gmessages.c line 927
  • #1 g_log
    at gmessages.c line 1079
  • #2 g_return_if_fail_warning
    at gmessages.c line 1088
  • #3 soup_uri_copy
    at soup-uri.c line 603
  • #4 gupnp_root_device_constructor
    at gupnp-root-device.c line 424
  • #5 g_object_new_with_custom_constructor
    at gobject.c line 1692
  • #6 g_object_new_internal
    at gobject.c line 1772
  • #7 g_object_new_valist
    at gobject.c line 2034
  • #8 g_object_new
    at gobject.c line 1617
  • #9 rygel_root_device_construct
    at /home/jens/Source/rygel/src/librygel-core/rygel-root-device.vala line 48
  • #10 rygel_root_device_new
    at /home/jens/Source/rygel/src/librygel-core/rygel-root-device.vala line 43
  • #11 rygel_root_device_factory_create
    at /home/jens/Source/rygel/src/librygel-core/rygel-root-device-factory.vala line 111
  • #12 rygel_main_create_device_co
    at /home/jens/Source/rygel/src/rygel/rygel-main.vala line 220
  • #13 _rygel_main_create_device_co_gsource_func
    at rygel-main.c line 1345
  • #14 g_main_dispatch
    at gmain.c line 3122
  • #15 g_main_context_dispatch
    at gmain.c line 3737
  • #16 g_main_context_iterate
    at gmain.c line 3808
  • #17 g_main_loop_run
    at gmain.c line 4002
  • #18 rygel_main_run
    at /home/jens/Source/rygel/src/rygel/rygel-main.vala line 89
  • #19 rygel_main_main
    at /home/jens/Source/rygel/src/rygel/rygel-main.vala line 304
  • #20 main
    at /home/jens/Source/rygel/src/rygel/rygel-main.vala line 277

Comment 15 Jens Georg 2015-01-30 22:18:22 UTC
Issue is that the server fails to listen on the address but is valid so the constructor of the context is created but in fact is incomplete.

Could be that the linux cm is signalling the context creation too early.
Comment 16 Jens Georg 2015-01-31 00:29:40 UTC
Looks like another reference leak on GUPnPContext - not sure where that comes from.
Comment 17 Jens Georg 2015-01-31 00:56:39 UTC
Created attachment 295842 [details] [review]
server: Avoid circular reference

Signed-off-by: Jens Georg <mail@jensge.org>
Comment 18 Jens Georg 2015-01-31 08:23:20 UTC
Can you please check whether this completely fixes the problem for you? I've pushed the patch to master and rygel-0-24 branch
Comment 19 Andrew 2015-02-13 01:21:37 UTC
The patch you supplied doesn't appear to fix it with gupnp 0.20.12. Testing with the next version of gupnp may be blocked by a different issue (see log of Comment 12). Each rygel version was built with your patch applied on top.

rygel 0.24.2, gupnp 0.20.12 -- failed: Appears to work, but server disappears after some time (seems client impl dependent), ~30min. Stream also interrupts if playing at this time. Clients that have not seen the server are unable to locate it if first looking after the dhcp renew.
rygel 0.25.1, gupnp 0.20.12 -- failed: Same as above.

rygel 0.24.2, gupnp 0.20.13 -- failed: Server dies instantly, possibly suffering/blocked by different bug.
rygel 0.25.1, gupnp 0.20.13 -- failed: Same as above.
Comment 20 Jens Georg 2015-02-13 11:37:11 UTC
Oookay... o.O
Comment 21 Nick Glencross 2015-10-12 05:10:22 UTC
Hello,

I should have looked up the bug before starting an investigation yesterday! The gupnp library is really cool, and the code very easy to read!

I think I have some information that may help the original issue.

As the original poster says, there's an issue with the context disappearing when a connected router is restarted. I've only used the netlink context manager as the device I wish to target doesn't use network manager. I have verified it in 0.20.13

I added some debugging into the code, and was seeing:

  NEWLINK, DELADDR,
  NEWLINK, NEWADDR,
  NEWADDR, DELADDR

So we end on a DELADDR which removes the context that we want. Using 'ip monitor' you can see why:

$ sudo ip monitor link address
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN   
    link/ether b8:27:eb:3d:9b:80 brd ff:ff:ff:ff:ff:ff
Deleted 2: eth0    inet 192.168.1.2/24 brd 192.168.1.255 scope global eth0
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
    link/ether b8:27:eb:3d:9b:80 brd ff:ff:ff:ff:ff:ff
2: eth0    inet 169.254.90.41/16 brd 169.254.255.255 scope global eth0
       valid_lft forever preferred_lft forever
2: eth0    inet 192.168.1.2/24 brd 192.168.1.255 scope global eth0
       valid_lft forever preferred_lft forever
Deleted 2: eth0    inet 169.254.90.41/16 brd 169.254.255.255 scope global eth0
       valid_lft forever preferred_lft forever

As you'll see, both 169.254.90.41 and 192.168.1.2 coexist on the interface before 169.254.90.41 is deleted, but the context hash table is just using the interface name 'eth0' and clears the context.

I suspect the solution is to use the interface name & ip address to identify the context. I shall have a go at trying this.

Cheers,

Nick
Comment 22 Jens Georg 2015-10-12 15:56:24 UTC
ah, great analysis. We do have problems when devices have more than one ip adress
Comment 23 Nick Glencross 2015-10-12 16:05:14 UTC
Hi again,

I'm not a very proficient C coder, but the patch that I'm about to attach seems to fix the problem I've been seeing.

However, I've not been able to convince myself that it handles the case of preconfigured network interfaces, such as entering network_device_create_context from network_device_up.

I basically use the IP address in the key to the device->contexts hash.

(I've left in a check for a NULL address, which I've not seen triggered in my experiments)

Probably best not to apply the patch as is as I've not been too careful with white space etc., but you're welcome to use the gist of it if I'm on the right track.

Cheers,

Nick
Comment 24 Nick Glencross 2015-10-12 16:12:31 UTC
Created attachment 313127 [details] [review]
This patch addresses my issue with the Context disappearing

I'm not proficient in C, but this patch has resolved my issue with the Context disappearing when a network interfaces loses its IP address and gets reassigned by DHCP.

Ideally it needs reviewing by an expert!
Comment 25 Jens Georg 2015-10-12 16:22:19 UTC
Thanks. I will have a look. One more question: How did you reproduce this?
Comment 26 Nick Glencross 2015-10-12 17:07:46 UTC
Created attachment 313128 [details]
Small perl script that displays contexts being created/removed

This is a Perl script that I use (which relies on Glib::Object::Introspection). I'm sure you can have a C version if this won't easily run for you.

To be able to see the issue you need to be on a wired network as it's the DHCP after getting a Link-local address (169.254.x.y) which seems to be the factor.

I see this consistently on a Debian Pi, but I see that Ubuntu doesn't seem to assign 169.254.x addresses when it loses connectivity. (But I can see in the settings how you'd enable it)

As restarting my router only takes a minute, that's the usual way I've been testing it, and see something like this, with the last 'unavailable' being on the 169.254 address.

UPnP Context becomes available GUPnP::Context=HASH(0x113bb8) (192.168.1.2)
Control Point GUPnP::ControlPoint=HASH(0x248078) associated with GUPnP::Context=HASH(0x113bb8)
<reboot router>
UPnP Context becomes unavailable GUPnP::Context=HASH(0x113bb8) (192.168.1.2)
UPnP Context becomes available GUPnP::Context=HASH(0x247df0) (169.254.90.41)
Control Point GUPnP::ControlPoint=HASH(0x248048) associated with GUPnP::Context=HASH(0x247df0)
UPnP Context becomes available GUPnP::Context=HASH(0x247dc0) (192.168.1.2)
Control Point GUPnP::ControlPoint=HASH(0x24ac58) associated with GUPnP::Context=HASH(0x247dc0)
UPnP Context becomes unavailable GUPnP::Context=HASH(0x247df0) (169.254.90.41)

One other thought is whether gupnp needs to work with 169.265.x.y addresses. (I know bonjour does, but maybe not upnp?).
If not, perhaps these IP address changes can just be ignored?

Cheers,

Nick
Comment 27 Jens Georg 2015-10-12 17:15:34 UTC
No, it has to be able to handle link-local addresses.
Comment 28 Nick Glencross 2015-10-12 17:23:58 UTC
Created attachment 313130 [details]
Small perl script that displays contexts being created/removed

Updated script which shows devices becoming available/unavailable, which makes it clearer when the context has gone missing. (Because all devices are unavailable when they shouldn't be)

Cheers,

Nick
Comment 29 Jens Georg 2015-11-06 08:10:40 UTC
No more need for info. Can reproduce here as well with ip addr add/ip addr del sequence adding link-local v4 address to an interface.

Thanks for debugging this :)
Comment 30 Andrew 2015-11-09 08:04:38 UTC
Created attachment 315105 [details] [review]
Only create context when we don't have one

Did a quick test with Nick's patch and no dice. My system dependencies have changed a bit mind you, but I still see the same symptoms.

I created a patch that fixes my issue. It probably breaks some other use case I haven't considered otherwise you would have thought of it already.

Patched from gupnp 0.20.14.
Comment 31 Jens Georg 2015-11-09 12:32:15 UTC
(In reply to Andrew from comment #30)
> Created attachment 315105 [details] [review] [review]
> Only create context when we don't have one
> 
> Did a quick test with Nick's patch and no dice. My system dependencies have
> changed a bit mind you, but I still see the same symptoms.
> 
> I created a patch that fixes my issue. It probably breaks some other use
> case I haven't considered otherwise you would have thought of it already.
> 
> Patched from gupnp 0.20.14.

Ok, then we have another issue :-/ Can you track the netlink messages that you get with nick's script?

I'm not sure about your patch, it might break alias interfaces.
Comment 32 Andrew 2015-11-09 20:21:30 UTC
Here's the output of Nick's script after for 2 dhcp renews. Each time the renew occurs, I receive a netlink message of type RTM_NEWADDR only.

UPnP Context becomes available GUPnP::Context=HASH(0x15b1660) (127.0.0.1)
Control Point GUPnP::ControlPoint=HASH(0x15b1750) associated with GUPnP::Context=HASH(0x15b1660)
UPnP Context becomes available GUPnP::Context=HASH(0x15b1498) (192.168.1.65)
Control Point GUPnP::ControlPoint=HASH(0x15c9188) associated with GUPnP::Context=HASH(0x15b1498)
UPnP Context becomes unavailable GUPnP::Context=HASH(0x15b1498) (192.168.1.65)
UPnP Context becomes available GUPnP::Context=HASH(0x13c0e78) (192.168.1.65)
Control Point GUPnP::ControlPoint=HASH(0x15c9518) associated with GUPnP::Context=HASH(0x13c0e78)
UPnP Context becomes unavailable GUPnP::Context=HASH(0x13c0e78) (192.168.1.65)
UPnP Context becomes available GUPnP::Context=HASH(0x15b1498) (192.168.1.65)
Control Point GUPnP::ControlPoint=HASH(0x1550dc0) associated with GUPnP::Context=HASH(0x15b1498)

The output is the same with a configured static IP (where the server is listening) and bringing up or down a separate vpn interface. For some odd reason I get RTM_NEWADDR for the interface that didn't change as well as RTM_DELADDR/RTM_DELLINK for the one that did (possibly because I have the vpn setup as the default route when active). Note that I have set a static port and explicit interface in rygel.conf.

UPnP Context becomes available GUPnP::Context=HASH(0x25de718) (127.0.0.1)
Control Point GUPnP::ControlPoint=HASH(0x25d5658) associated with GUPnP::Context=HASH(0x25de718)
UPnP Context becomes available GUPnP::Context=HASH(0x25c6478) (192.168.1.10)
Control Point GUPnP::ControlPoint=HASH(0x25ce078) associated with GUPnP::Context=HASH(0x25c6478)
UPnP Context becomes unavailable GUPnP::Context=HASH(0x25c6478) (192.168.1.10)
UPnP Context becomes available GUPnP::Context=HASH(0x23f3ef0) (192.168.1.10)
Control Point GUPnP::ControlPoint=HASH(0x25ce090) associated with GUPnP::Context=HASH(0x23f3ef0)


I'm almost leaning towards some sort of misconfiguration on this system since no one else appears to be having this issue.
Comment 33 Jens Georg 2015-11-21 17:25:51 UTC
Created attachment 316025 [details]
nl-dump utility

Can you compile and run the attached nl-dump.c program and attach its output?
Comment 34 Andrew 2015-12-01 23:45:30 UTC
Packet 0 is the initial dhcp call when eth0 is brought up. Packets 1 and 2 are instances of dhcp renews. Rygel is started between packets 0 and 1 with packet 1 coinciding with the bug.

=========> 00000000 <==========
0000: 58 00 00 00 14 00 00 00 08 0d 5e 56 34 03 00 00   X.........^V4...
0010: 02 18 00 00 02 00 00 00 08 00 01 00 c0 a8 07 1b   ................
0020: 08 00 02 00 c0 a8 07 1b 08 00 04 00 c0 a8 07 ff   ................
0030: 09 00 03 00 65 74 68 30 00 00 00 00 08 00 08 00   ....eth0........
0040: 00 00 00 00 14 00 06 00 31 01 00 00 31 01 00 00   ........1...1...
0050: d6 0d 07 00 d6 0d 07 00                              ........
=========> 00000001 <==========
0000: 58 00 00 00 14 00 00 00 14 0d 5e 56 34 03 00 00   X.........^V4...
0010: 02 18 00 00 02 00 00 00 08 00 01 00 c0 a8 07 1b   ................
0020: 08 00 02 00 c0 a8 07 1b 08 00 04 00 c0 a8 07 ff   ................
0030: 09 00 03 00 65 74 68 30 00 00 00 00 08 00 08 00   ....eth0........
0040: 00 00 00 00 14 00 06 00 31 01 00 00 31 01 00 00   ........1...1...
0050: d6 0d 07 00 d9 48 07 00                              .....H..
=========> 00000002 <==========
0000: 58 00 00 00 14 00 00 00 17 0d 5e 56 34 03 00 00   X.........^V4...
0010: 02 18 00 00 02 00 00 00 08 00 01 00 c0 a8 07 1b   ................
0020: 08 00 02 00 c0 a8 07 1b 08 00 04 00 c0 a8 07 ff   ................
0030: 09 00 03 00 65 74 68 30 00 00 00 00 08 00 08 00   ....eth0........
0040: 00 00 00 00 14 00 06 00 31 01 00 00 31 01 00 00   ........1...1...
0050: d6 0d 07 00 37 7d 07 00                              ....7}..
Comment 35 Jens Georg 2015-12-02 09:33:00 UTC
Info received. Thank you very much for all your effort :)
Comment 36 Jens Georg 2015-12-04 20:36:39 UTC
This is odd. Works fine here. Which versions of GUPnP, GSSDP and Rygel are you using now?
Comment 37 Andrew 2015-12-05 09:18:45 UTC
I finally updated a handful of (debian) packages the day after I sent the trace and am now getting slightly different results:

(rygel:10798): Rygel-DEBUG: rygel-main.vala:188: Network 192.168.1.0 (eth0) context now unavailable. IP: 192.168.1.65
(rygel:10798): Rygel-DEBUG: rygel-main.vala:155: New network 192.168.1.0 (eth0) context available. IP: 192.168.1.65

when no video stream is playing in rygel, and a trace similar to the one in #12 (libsoup/gupnp errors) when a video is playing. Digging a bit, it looks like this logic changed between rygel 0.24.2 and 0.26.1. I'm not too concerned about these messages when no stream is active as the server is still discoverable, but if a stream is active I get the issues discussed before (connection interruption, server discoverability failure). nl-dump with the new version (and an active video stream) don't appear to be much different either (packet 0-the initial dhcp call when the interface is brought up, packet 1-a dhcp renew).

Packages untouched:
gupnp: 0.20.14-1
gssdp: 0.14.10-1
linux: 4.2.5-1

Packages updated:
rygel: 0.24.2-1 to 0.28.1-1


=========> 00000000 <==========
0000: 58 00 00 00 14 00 00 00 63 4c 62 56 30 03 00 00   X.......cLbV0...
0010: 02 18 00 00 02 00 00 00 08 00 01 00 c0 a8 07 1b   ................
0020: 08 00 02 00 c0 a8 07 1b 08 00 04 00 c0 a8 07 ff   ................
0030: 09 00 03 00 65 74 68 30 00 00 00 00 08 00 08 00   ....eth0........
0040: 00 00 00 00 14 00 06 00 31 01 00 00 31 01 00 00   ........1...1...
0050: c5 96 22 00 c5 96 22 00                              .."...".
=========> 00000001 <==========
0000: 58 00 00 00 14 00 00 00 6f 4c 62 56 30 03 00 00   X.......oLbV0...
0010: 02 18 00 00 02 00 00 00 08 00 01 00 c0 a8 07 1b   ................
0020: 08 00 02 00 c0 a8 07 1b 08 00 04 00 c0 a8 07 ff   ................
0030: 09 00 03 00 65 74 68 30 00 00 00 00 08 00 08 00   ....eth0........
0040: 00 00 00 00 14 00 06 00 31 01 00 00 31 01 00 00   ........1...1...
0050: c5 96 22 00 36 ca 22 00                              ..".6.".
Comment 38 Jens Georg 2015-12-05 11:27:54 UTC
Ah ok. Thanks. I think with the combination of all fixes, your suggestion and Nick's changes we should have covered it. I'll prepare a patch to test soon
Comment 39 Jens Georg 2015-12-08 20:33:22 UTC
Created attachment 316979 [details] [review]
linux-cm: No need to look up device name

We get it "for free" with the ILF_* attributes.

Signed-off-by: Jens Georg <mail@jensge.org>
Comment 40 Jens Georg 2015-12-08 20:33:30 UTC
Created attachment 316980 [details] [review]
linux-cm: Add netlink debugging

Add the possibility to dump the received netlink packets.

Signed-off-by: Jens Georg <mail@jensge.org>
Comment 41 Jens Georg 2015-12-08 20:33:35 UTC
Created attachment 316981 [details] [review]
linux-cm: Fix various issues with DHCP

 - Some DHCP clients assign a link-local address and then switch back to the
   new address, causing loss of context
 - Some DHCP renews cause RTM_NEWADDR with the current address, causing
   needless loss of context

Signed-off-by: Jens Georg <mail@jensge.org>
Comment 42 Jens Georg 2015-12-08 20:34:47 UTC
Can you check the three patches? It works here with the simulated netlink packets from your dump.

Please note you need GSSDP master as well.
Comment 43 Jens Georg 2015-12-31 11:01:45 UTC
This problem has been fixed in our software repository. The fix will go into the next software release. Once that release is available, you may want to check for a software upgrade provided by your Linux distribution.
Comment 44 Andrew 2016-01-01 05:17:15 UTC
I wasn't able to test under the same conditions (videos in rygel) as I ran into other issues (likely unrelated to this bug), but audio files work fine. Output shows the following without symptoms described above.

(rygel:8110): GUPnP-ContextManager-Linux-DEBUG: Received RTM_NEWADDR
(rygel:8110): GUPnP-ContextManager-Linux-DEBUG: Context for address 192.168.1.79 on enp3s0 already exists

I believe this issue to be fixed with gupnp 0.20.15 and gssdp 0.14.3.