After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 701110 - videomixer: sinkpads GSList is not protected for multi-threading
videomixer: sinkpads GSList is not protected for multi-threading
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
git master
Other Linux
: Normal normal
: 1.3.90
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2013-05-28 02:52 UTC by Nicolas Dufresne (ndufresne)
Modified: 2014-07-02 12:37 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
example that causes segmentation fault on the videomixer (8.08 KB, text/x-csrc)
2014-01-23 14:50 UTC, [laucha]
  Details
workaround for the segfault and a patch for the freezing bug. Both are for videomixer (653 bytes, application/x-gzip; charset=binary)
2014-03-07 13:35 UTC, [laucha]
  Details
Fix a locking order issue on videomixer (876 bytes, patch)
2014-03-07 21:14 UTC, [laucha]
reviewed Details | Review
Patch to protect a critical variable values change (1.21 KB, patch)
2014-03-13 10:01 UTC, David Fernandez
reviewed Details | Review
Patch to avoid a segmentation fault when src caps are configured (1.30 KB, patch)
2014-06-25 13:41 UTC, David Fernandez
committed Details | Review

Description Nicolas Dufresne (ndufresne) 2013-05-28 02:52:24 UTC
While fixing a z-order issue, I realized that the mixer sinkpads GSList is not protected. If a pad is created, a zorder is changed, or a pad is removed while the streaming thread is iterating the list, it may crash due to pointer being modified in parallel. Considering this, there might be many other threading issues in this element.
Comment 1 Tim-Philipp Müller 2013-08-25 11:50:12 UTC
Is this still the case?

The sinkpads list seems to be protected by GST_VIDEO_MIXER2_LOCK (mix) everywhere afaict.
Comment 2 [laucha] 2014-01-23 14:50:11 UTC
Created attachment 267052 [details]
example that causes segmentation fault on the videomixer

Talking with Sebastian on the list: http://lists.freedesktop.org/archives/gstreamer-devel/2014-January/045768.html 

This example have a pipe with videotestsrc ! capsfilter ! videomxier ! xvimagesink.
Then on PLAYING_STATE I do this:
while (TRUE){
    sleep(1);
    add_mixer_input(); //adds a new bin to the mixer
    sleep(1);
    remove_mixer_input(); // removes the previously added bin.
}
The program causes segfault, and always crashed in the same point.
If I run the program with GST_DEBUG=4 the program crashes more quickly.

One more thing, if I don't call release_request_pad on mixer, the program seems to doesn't blow up.
Comment 3 [laucha] 2014-02-17 19:55:36 UTC
I've seen that the videomixer blows up beacuse the function ptr to the  function that does the blending is NULL. This pointer was reseted when the caps are reseted.
I can try to fix the bug if you give me some help :)

I've already made an ugly fix to the videomixer. Basically, if the function's pointer is NULL I reject the buffer. This produces a green frame on the output but the pipeline doesn't blow up. But now the pipeline freezes beacuse a deadlock(I think that is another bug).

The deadlock was produced by the chain function of one sink and the relase_request_pad on another sink. Here is the sequence of the deadlock:

1) gst_collect_pads_chain function acquires GST_COLLECT_PADS_STREAM_LOCK.
2) gst_videomixer2_release_pad function acquires GST_VIDEO_MIXER2_LOCK.
2 )gst_collect_pads_chain function calls to gst_videomixer2_collected() function.
3) gst_videomixer2_collected() function tries to acquire GST_VIDEO_MIXER2_LOCK.
4) gst_videomixer2_release_pad() function tries to acquire GST_COLLECT_PADS_STREAM_LOCK.

So, gst_videomixer2_release_pad waits on COLLECT_PADS_STREAM_LOCK (acquired by gst_videomixer2_collected) and gst_videomixer2_collected waits on GST_VIDEO_MIXER2_LOCK(acquired by gst_videomixer2_release_pad).
Comment 4 Nicolas Dufresne (ndufresne) 2014-02-17 20:14:42 UTC
If it's what you have described, it's a locking order issue. The correct order is STREAM_LOCK follow by OBJECT_LOCK. If you have OBJECT_LOCK already and need STREAM_LOCK, you need to unlock the OBJECT_LOCK first, then take the two locks in the right order (in this case you'll need to check your state for changes).

Hopes that will help you make a fix, would be really appreciated. gst_videomixer2_release_pad() would be the one in fault.
Comment 5 [laucha] 2014-03-07 13:35:05 UTC
Created attachment 271223 [details]
workaround for the segfault and a patch for the freezing bug. Both are for videomixer

Nicolas, here is the patch. I've tested that works in the following way:

1) If I run the example without the patch it freezes after one hour of running aproximately. If I run 3  instances of the example at the same time, one of them freezes after 5 minutes at most.

2) I ran 4 instances of the example with the patch applied and it was running for 5 days without freezing.

Apparently seems that the patch works. Please confirm if you want me to change something.

If you want to test the patch, first at all you will have to apply another patch that is a workaround for the segfault that I describe on comment 2 and 3. The real patch is for the freezing case.

Until now I've not had the time to investigate why the ptr to the mix function was seted to NULL (I think it is because a caps's renegotiation when I remove the pad).
Comment 6 Nicolas Dufresne (ndufresne) 2014-03-07 16:13:29 UTC
(In reply to comment #5)
> Created an attachment (id=271223) [details] [review]
> workaround for the segfault and a patch for the freezing bug. Both are for
> videomixer

Ok, I found why I couldn't review this patch. Patch need to be attach with the patch checkbox, and in raw text, no zip file.

In order to produce a patch in the format we expect, so this, assuming 1 patch:

git format-patch -1

This will create a 00001-Description.path file, just attach this file. We have a tool on Bugzilla to review and comment about the code. Thanks for your time !
Comment 7 Nicolas Dufresne (ndufresne) 2014-03-07 16:18:39 UTC
For the reference:

>+  GST_VIDEO_MIXER2_UNLOCK (mix);
>   GST_COLLECT_PADS_STREAM_LOCK (mix->collect);
>+  GST_VIDEO_MIXER2_LOCK (mix);

This looks appropriate, and match the description I've made. The other patch seems unrelated, maybe file another bug, or just explain why you need this null checking (usually we try and find the source, and avoid NULL check if possible).
Comment 8 Nicolas Dufresne (ndufresne) 2014-03-07 16:19:57 UTC
Note though, it might not be fixing what is described in this bug, which is about the GSList protection, so don't close until the locking has been reviewed.
Comment 9 [laucha] 2014-03-07 19:19:28 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > Created an attachment (id=271223) [details] [details] [review]
> > workaround for the segfault and a patch for the freezing bug. Both are for
> > videomixer
> 
> Ok, I found why I couldn't review this patch. Patch need to be attach with the
> patch checkbox, and in raw text, no zip file.
> 
> In order to produce a patch in the format we expect, so this, assuming 1 patch:
> 
> git format-patch -1
> 
> This will create a 00001-Description.path file, just attach this file. We have
> a tool on Bugzilla to review and comment about the code. Thanks for your time !

Sorry, I didn't know that. Do you want me to upload again the patch in the correct way?

(In reply to comment #7)
> For the reference:
> 
> >+  GST_VIDEO_MIXER2_UNLOCK (mix);
> >   GST_COLLECT_PADS_STREAM_LOCK (mix->collect);
> >+  GST_VIDEO_MIXER2_LOCK (mix);
> 
> This looks appropriate, and match the description I've made. The other patch
> seems unrelated, maybe file another bug, or just explain why you need this null
> checking (usually we try and find the source, and avoid NULL check if
> possible).

I am completly agree with you. If I find the problem's source I will ask you how to continue :).
Comment 10 Nicolas Dufresne (ndufresne) 2014-03-07 19:55:03 UTC
(In reply to comment #9)
> I am completly agree with you. If I find the problem's source I will ask you
> how to continue :).

Yes please, so I don't have to craft the Author line myself, and while doing do, add a link to this bug on the last line of your commit message.
Comment 11 [laucha] 2014-03-07 21:14:33 UTC
Created attachment 271268 [details] [review]
Fix a locking order issue on videomixer

I've attached the patch. If you have any comments please tell me.
Comment 12 Nicolas Dufresne (ndufresne) 2014-03-10 14:10:55 UTC
Just a small update, yesterday I've ran the test script over night to make sure I could reproduce the crash. That is done, I'll apply the workaround and let it run to see if I can get the deadlock. Then apply the locking fix to see if it fixes it, or at least move the deadlock somewhere else.
Comment 13 David Fernandez 2014-03-13 10:01:51 UTC
Created attachment 271694 [details] [review]
Patch to protect a critical variable values change

I think that this patch solves the race condition setting src_caps. Sometimes, the function blend_buffers uses fill_color and its value is null because the function src_setcaps set its to NULL out the lock.
Comment 14 [laucha] 2014-04-03 14:10:03 UTC
(In reply to comment #12)
> Just a small update, yesterday I've ran the test script over night to make sure
> I could reproduce the crash. That is done, I'll apply the workaround and let it
> run to see if I can get the deadlock. Then apply the locking fix to see if it
> fixes it, or at least move the deadlock somewhere else.

Nicolas, could you get the deadlock issue? if you did, did the patch fix it?
Comment 15 Nicolas Dufresne (ndufresne) 2014-06-04 13:34:01 UTC
Meanwhile a new mixer has been introduced, called compositor iirc. I wonder what we should do with this ?
Comment 16 Sebastian Dröge (slomo) 2014-06-04 13:41:29 UTC
compositor won't be in 1.4, so would be nice to still fix this.
Comment 17 Nicolas Dufresne (ndufresne) 2014-06-04 14:33:06 UTC
Review of attachment 271268 [details] [review]:

::: gst/videomixer/videomixer2.c
@@ +2160,2 @@
   GST_COLLECT_PADS_STREAM_LOCK (mix->collect);
+  GST_VIDEO_MIXER2_LOCK (mix);

Normally, because we have unlocked, we should be validating the state before continuing. Is there any other calls that should induce a different behaviour here ?
Comment 18 Nicolas Dufresne (ndufresne) 2014-06-04 14:36:24 UTC
Review of attachment 271694 [details] [review]:

Looks good, but the commit message should be more descriptive. Think of someone that hits the same issue as you did and then try to write a message that would help that person understanding this patch will fix what he see.
Comment 19 David Fernandez 2014-06-25 13:41:24 UTC
Created attachment 279223 [details] [review]
Patch to avoid a segmentation fault when src caps are configured

Hi Nicolas,

Thank you for your comment. I have uploaded the patch changing the commit message. I also have added a short description.
Comment 20 Sebastian Dröge (slomo) 2014-06-25 14:45:18 UTC
Comment on attachment 279223 [details] [review]
Patch to avoid a segmentation fault when src caps are configured

commit 4ed74d3ab0ffd474d40728922f6277a1d90f69ad
Author: David Fernandez <d.fernandezlop@gmail.com>
Date:   Thu Mar 13 10:35:30 2014 +0100

    videomixer2: Solve segmentation fault when src caps are configured
    
    Change function pointers to NULL while holding the lock to avoid
    race conditions
    
    https://bugzilla.gnome.org/show_bug.cgi?id=701110
Comment 21 Sebastian Dröge (slomo) 2014-06-25 14:45:41 UTC
Thanks for the patch! Does this fix all the problems in this bug already?
Comment 22 David Fernandez 2014-06-26 09:01:23 UTC
I think so. I'm using the videomixer2 in a dynamic pipeline and it is more robust using this patch and this one http://lists.freedesktop.org/archives/gstreamer-commits/2014-March/077911.html
Comment 23 Thibault Saunier 2014-07-01 13:29:30 UTC
@David Fernandez: You should give a try to the new GstAggregator based videomixer called "compositor" in -plugins-bad. It fixes quite a few limitation of GstCollectPads and is going to replace videomixer in the mid term.
Comment 24 David Fernandez 2014-07-02 12:37:06 UTC
@Thibault Saunier: Thank you Thibault. I have taken a look and it seems work fine. The action to add or remove pads is faster than in the videomixer.