GNOME Bugzilla – Bug 510354
post GST_MESSAGE_STRUCTURE_CHANGE when linking and unlinking pads
Last modified: 2008-10-06 16:16:09 UTC
The patch posts GST_MESSAGE_STRUCTURE_CHANGE when pads are linked or unlinked. The message is handled by GstBin to make GstBinSortIterator resync when linking or unlinking happens.
Created attachment 103130 [details] [review] Test case and patch
yeps, perfect.
the patch is not working, the pad is linked before the message is posted so that the bin could still end up doing the wrong thing. Perhaps a start/stop message sequence would help.
Created attachment 103689 [details] [review] Updated patch. I ran into the issue described by this bug and tried the original patch, but still hit the same issue. After debugging I came to the conclusion described by Wim. This patch builds on the original patch but uses a start/stop sequence before/after the changes are made. I was able to reproduce the race quite reliably in less than 20 full-duplex a/v calls with rtpmanager before this patch (all inside one process using the loopback interface, so things get quite racy). Have not yet been able to reproduce with 100 call-setups. Will keep pounding and report back if I hit it again. Would it be preferable to have GST_MESSAGE_STRUCTURE_CHANGE_{START,DONE} instead of one message for both by the way? (To follow the same pattern as the other messages that are sequenced.) Or alternatively something shorter like GST_MESSAGE_RESTRUCTURE_{START,DONE}, which as an added bonus means not having to re-indent all the other entries' equal-signs and make for a huge diff. :)
I'm not very comfortable with a message taking an object lock and another message releasing it again. What about making the parent ignore the link to a pad that posted the STRUCTURE_CHANGE start message and making it restart the state change when the stop message incremented the counter? This would not require any permanent locks.
*** Bug 544440 has been marked as a duplicate of this bug. ***
Created attachment 115124 [details] More aggressive test casee I'm adding a much more aggressive test case..
Looking at the code, I guess the GstBin has to keep a list of pads in transition. But I'm afraid its not that simple, since the element that has the pad could be removed from the bin in the mean time. So maybe instead the element should listen to the "linked" and "unlinked" signals and emit the messages (and also emit the done message on the bin when the parent is changed.. if thats even doable...)
Actually, I'm stupid. The bin knows when elements are removed... so I guess it can remove the element/pads from the in-transition list when they are removed.
Also, the patch seems to look directly at the parent field of the gstobject without taking the object lock
I guess removing an element from a bin will first unlink all pads that are still linked, so its a non-issue?
Created attachment 115214 [details] [review] Further updated patch I'm adding that patch that adds a GPtrArray of the currently changing pads to GstBin and checks if the pads in there in the iterator update_degree function. I see only one problem with the current patch, the update_degree() function seems to be sometimes called with the GstBin lock held (on lock creation) and sometimes without (other times). So we can't do locking properly inside there, maybe there should be another lock just for that list ?
Created attachment 117098 [details] [review] Possibly complete patch? I've changed the patch to use the bin->messages to store messages. I also changed it to update the structure_cookie at the start and at the end of the structure change (otherwise it still failed in some cases).
Created attachment 117247 [details] [review] updated again oops, the last patch had one case where it could try to unref a NULL..
Created attachment 117248 [details] [review] updated again oops, the last patch had one case where it could try to unref a NULL..
Last patch starts to look good to me. There is one last thing, when we remove the element from the bin we would need to remove all the pads that still have pending messages of this element. While the pad always posts both start/done message pairs, it's possible that the element is removed from the bin before the done message is posted, leaing the start message in the bin still. It'll cause a leak because the pad cannot be finalized because the message has a ref to it. I would propose to add the element to the message structure for this purpose. Update to patch welcome or wait till I take a look at this after the freeze.
Commited with some updates. Based on Patch by: Olivier Crete <tester at tester dot ca> * gst/gstbin.c: (gst_bin_init), (gst_bin_add_func), (gst_bin_remove_func), (update_degree), (gst_bin_sort_iterator_new), (gst_bin_handle_message_func): Keep track of pads that are being linked/unlinked and resync the state changes. * gst/gstpad.c: (gst_pad_get_direction), (gst_pad_set_chain_function), (gst_pad_set_getrange_function), (gst_pad_set_checkgetrange_function), (gst_pad_unlink), (gst_pad_link_prepare), (gst_pad_link), (gst_pad_event_default_dispatch), (gst_pad_chain), (gst_pad_push), (gst_pad_check_pull_range), (gst_pad_get_range), (gst_pad_pull_range): Some code cleanups, use macros to check pad direction. Don't need to take the lock on the pad direction. Post structure change when pads are linked/unlinked. Change some checks into _return_if_fail(). * tests/check/gst/gstbin.c: (test_link_structure_change_state_changed_sync_cb), (GST_START_TEST), (gst_bin_suite): Add testcase for pad link/unlinke resync during a state change. Fixes #510354. * docs/gst/gstreamer-sections.txt: * gst/gstmessage.c: (gst_message_new_structure_change), (gst_message_parse_structure_change): * gst/gstmessage.h: Implement STRUCTURE_CHANGED messages. These messages will be used to signal the parent bin of link/unlink operations that could require a resync when doing a state change. See ##510354. API: gst_message_new_structure_change() API: gst_message_parse_structure_change() * gst/gstquark.c: * gst/gstquark.h: Add some more quarks for new message. See #510354.