After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 788025 - vaapicontext: don't keep the context in pipeline when encoding
vaapicontext: don't keep the context in pipeline when encoding
Status: RESOLVED INVALID
Product: GStreamer
Classification: Platform
Component: gstreamer-vaapi
1.12.2
Other Linux
: Normal enhancement
: git master
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2017-09-21 20:45 UTC by Harvey
Modified: 2018-05-04 09:16 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gstreamer pipeline (17.82 KB, application/msword-template)
2017-09-21 20:45 UTC, Harvey
Details

Description Harvey 2017-09-21 20:45:31 UTC
Created attachment 360228 [details]
gstreamer pipeline

GStreamer 1.12.2
Yocto Pyro 2.3

We have an application using the attached pipeline. We make very frequent recordings and have noticed a rapid leak of memory. Every time we start and stop a recording (encoding of captured video), we lose 1-4 frames of memory. The `pmap` command reports the memory buffers to be frame sized (12MB for a 4K frame) and belonging to /dev/dri/renderD128. After about 500 recordings, over 7 GB of memory is gone. 

I have source for everything, but am unsure of how to proceed.
Comment 1 Hyunjun Ko 2017-09-22 09:21:34 UTC
I've been trying to reproduce this issue since I heard about this kind of leak in IRC and Slack. But I failed to reproduce it on both 1.12 and master.

FYI, I tried to reproduce by my sample using v4l2src, vaapih264enc and fdsink.

If you give any clue to reproduce it, it'd be nice.
Comment 2 Víctor Manuel Jáquez Leal 2017-09-22 10:06:56 UTC
Would be great if you could track the memory leaks using the GStreamer tracer[1]

GST_TRACERS="leaks" GST_DEBUG="GST_TRACER:7" 

Also, a valgrind session would be great


1. https://gstreamer.freedesktop.org/data/events/gstreamer-conference/2016/Guillaume%20Desmottes%20-%20Tracking%20Memory%20Leaks.pdf
Comment 3 Harvey 2017-09-22 16:08:06 UTC
(In reply to Hyunjun Ko from comment #1)
> I've been trying to reproduce this issue since I heard about this kind of
> leak in IRC and Slack. But I failed to reproduce it on both 1.12 and master.
> 
> FYI, I tried to reproduce by my sample using v4l2src, vaapih264enc and
> fdsink.
> 
> If you give any clue to reproduce it, it'd be nice.

1. I'm not sure if I can release our gstreamer test app publicly, but I can give access to the two of you directly.

2. Using the GST_TRACERS didn't print anything useful.

3. I have a custom suppression file where I added every generated suppression. I'll upload that along with the summary of running valgrind. Nothing is classified as a "leak".

4. Our test app continuously captures frames to a tee. One of the tee outputs is to a valve with an h264 encoder pipeline. For testing, we make 20 second recordings with 5 second delays between.
Comment 4 Harvey 2017-09-22 16:09:30 UTC
5. The leak is not a true "leak". If I stop the entire pipeline, all of the "leaked" memory is released.
Comment 5 Víctor Manuel Jáquez Leal 2017-09-22 18:22:13 UTC
(In reply to Harvey from comment #4)
> 5. The leak is not a true "leak". If I stop the entire pipeline, all of the
> "leaked" memory is released.

Thus I'm setting it as an enhancement request.
Comment 6 Harvey 2017-09-22 19:02:38 UTC
(In reply to Víctor Manuel Jáquez Leal from comment #5)
> (In reply to Harvey from comment #4)
> > 5. The leak is not a true "leak". If I stop the entire pipeline, all of the
> > "leaked" memory is released.
> 
> Thus I'm setting it as an enhancement request.

The weird thing is that after we stop encoding, we turn off the valve,
remove the encoding bin from the pipeline, and unref it. But the memory remains allocated.
Comment 7 Hyunjun Ko 2017-09-25 01:19:04 UTC
(In reply to Harvey from comment #6)
> (In reply to Víctor Manuel Jáquez Leal from comment #5)
> > (In reply to Harvey from comment #4)
> > > 5. The leak is not a true "leak". If I stop the entire pipeline, all of the
> > > "leaked" memory is released.
> > 
> > Thus I'm setting it as an enhancement request.
> 
> The weird thing is that after we stop encoding, we turn off the valve,
> remove the encoding bin from the pipeline, and unref it. But the memory
> remains allocated.

Okay. 
So, "stop the entire pipeline" means call of gst_element_set_state (pipeline, GST_STATE_NULL), doesn't it? And all of the leaked memory is released after this call.

If then in "Every time we start and stop a recording (encoding of captured video), we lose 1-4 frames of memory" What does "start and stop a recording" mean?
Comment 8 Harvey 2017-09-25 02:51:11 UTC
(In reply to Hyunjun Ko from comment #7)
> (In reply to Harvey from comment #6)
> > (In reply to Víctor Manuel Jáquez Leal from comment #5)
> > > (In reply to Harvey from comment #4)
> > > > 5. The leak is not a true "leak". If I stop the entire pipeline, all of the
> > > > "leaked" memory is released.
> > > 
> > > Thus I'm setting it as an enhancement request.
> > 
> > The weird thing is that after we stop encoding, we turn off the valve,
> > remove the encoding bin from the pipeline, and unref it. But the memory
> > remains allocated.
> 
> Okay. 
> So, "stop the entire pipeline" means call of gst_element_set_state
> (pipeline, GST_STATE_NULL), doesn't it? And all of the leaked memory is
> released after this call.

Yes, setting state of the entire pipeline to NULL releases all of the leaked memory.

> If then in "Every time we start and stop a recording (encoding of captured
> video), we lose 1-4 frames of memory" What does "start and stop a recording"
> mean?

- Start of app
    - create pipeline: v4l2src -> caps -> valve (drop = true)
    - start recording:
        - create encoder bin: 
             bin[queue, videorate, videoconvert, vaapih264enc, mp4mux, fdsink ]
             - set videorate: skip-to-first=true
             - set queue:
                 max-size-buffers = 10
                 max-size-bytes = 4k*1.5*10
                 max-size-time = 0
                 leaky = GST_QUEUE_LEAK_UPSTREAM
        - add encoder bin to pipeline
        - link valve to encoder bin
        - gst_element_sync_state_to_parent(encoder bin)
        - set valve: drop = false
        - wait for fdsink to receive data
    - stop recording:
        - set valve: drop = true
        - gst_element_send_event(encoder bin, gst_event_new_eos())
        - wait for fdsink to receive EOS
        - gst_element_set_state(encoder bin, GST_STATE_NULL)
        - unlink valve from encoder bin
        - remove encoder bin from pipeline
        - unref encoder bin
Comment 9 Hyunjun Ko 2017-09-25 03:10:36 UTC
(In reply to Harvey from comment #8)
> - Start of app
>     - create pipeline: v4l2src -> caps -> valve (drop = true)
>     - start recording:
>         - create encoder bin: 
>              bin[queue, videorate, videoconvert, vaapih264enc, mp4mux,
> fdsink ]
>              - set videorate: skip-to-first=true
>              - set queue:
>                  max-size-buffers = 10
>                  max-size-bytes = 4k*1.5*10
>                  max-size-time = 0
>                  leaky = GST_QUEUE_LEAK_UPSTREAM
>         - add encoder bin to pipeline
>         - link valve to encoder bin
>         - gst_element_sync_state_to_parent(encoder bin)
>         - set valve: drop = false
>         - wait for fdsink to receive data
>     - stop recording:
>         - set valve: drop = true
>         - gst_element_send_event(encoder bin, gst_event_new_eos())
>         - wait for fdsink to receive EOS
>         - gst_element_set_state(encoder bin, GST_STATE_NULL)
>         - unlink valve from encoder bin
>         - remove encoder bin from pipeline
>         - unref encoder bin

Thanks for the information.
The pipeline looks more complicated than I thought :)
Carefully speaking, I suspect other plugins but I'm going to look into it considering this kind of pipeline.
Comment 10 Víctor Manuel Jáquez Leal 2017-09-25 08:18:07 UTC
Keep in mind that appsink, queue and tee may retain some buffers, and if zero-copy is negotiated that memory would be from VA surfaces held in those elements.
Comment 11 Harvey 2017-09-25 10:54:44 UTC
(In reply to Víctor Manuel Jáquez Leal from comment #10)
> Keep in mind that appsink, queue and tee may retain some buffers, and if
> zero-copy is negotiated that memory would be from VA surfaces held in those
> elements.

I simplified our pipeline to the one I just described to eliminate that possibility. I removed the tee and the other paths.
Comment 12 Hyunjun Ko 2017-09-26 01:48:53 UTC
(In reply to Harvey from comment #8)
> 
> - Start of app
>     - create pipeline: v4l2src -> caps -> valve (drop = true)
>     - start recording:
>         - create encoder bin: 
>              bin[queue, videorate, videoconvert, vaapih264enc, mp4mux,
> fdsink ]
>              - set videorate: skip-to-first=true
>              - set queue:
>                  max-size-buffers = 10
>                  max-size-bytes = 4k*1.5*10
>                  max-size-time = 0
>                  leaky = GST_QUEUE_LEAK_UPSTREAM
>         - add encoder bin to pipeline
>         - link valve to encoder bin
>         - gst_element_sync_state_to_parent(encoder bin)
>         - set valve: drop = false
>         - wait for fdsink to receive data
>     - stop recording:
>         - set valve: drop = true
>         - gst_element_send_event(encoder bin, gst_event_new_eos())
>         - wait for fdsink to receive EOS
>         - gst_element_set_state(encoder bin, GST_STATE_NULL)
>         - unlink valve from encoder bin
>         - remove encoder bin from pipeline
>         - unref encoder bin

I found some leaks but not sure it's the same leak.

Could you try the following:
1. Just once to create encode bin.
2. When start recording, do link, gst_element_sync_state_to_parent, set valve.
2. when stop recording, set valve, eos event, set state to NULL, unlink. (not remove encode bin)

I found this removed leaks that I found. But still not sure if this would affect your case.
Comment 13 Harvey 2017-09-26 14:35:29 UTC
(In reply to Hyunjun Ko from comment #12)
> 
> I found some leaks but not sure it's the same leak.
> 
> Could you try the following:
> 1. Just once to create encode bin.
> 2. When start recording, do link, gst_element_sync_state_to_parent, set
> valve.
> 2. when stop recording, set valve, eos event, set state to NULL, unlink.
> (not remove encode bin)
> 
> I found this removed leaks that I found. But still not sure if this would
> affect your case.

This doesn't seem to have had any effect (or perhaps this was another much, much smaller leak). I'm trying to test our code on Ubuntu 16.04 as well as Yocto 2.3, but I don't see any allocations at all labeled "renderD128" on Ubuntu. I'm not sure why. Can you help me to understand the difference? It's hard to step through the code on Yocto, but I can do it in Ubuntu.
Comment 14 Harvey 2017-09-29 18:56:52 UTC
We've installed a work-around:

```
static void unrefVaapiDisplayContext(PrivateData *priv) {
    GST_OBJECT_LOCK (priv->pipeline);
    GstElement* element = GST_ELEMENT_CAST(priv->pipeline);
    GList *l;
    for (l = element->contexts; l;) {
      GstContext *context = (GstContext*)l->data;

      if (!gst_context_is_persistent (context)) {
        if(g_strcmp0(gst_context_get_context_type(context), "gst.vaapi.Display") == 0) {
            GList *next;
            gst_context_unref (context);
            next = l->next;
            element->contexts = g_list_delete_link (element->contexts, l);
            l = next;
        } else {
            l = l->next;
        }
      } else {
        l = l->next;
      }
    }
    GST_OBJECT_UNLOCK (priv->pipeline);
}```
Comment 15 Víctor Manuel Jáquez Leal 2017-09-30 08:17:43 UTC
(In reply to Harvey from comment #14)
> We've installed a work-around:
> 

Could you provide a patch? Thanks.
Comment 16 Víctor Manuel Jáquez Leal 2017-09-30 08:20:16 UTC
Ah, that's in your application. No patch is required then.

When do you call that function in your app?
Comment 17 pauldotknopf 2017-12-06 06:34:18 UTC
We call this function after 20-or-so recordings to free up the memory. Directly after recording, we release the context that vaapih264enc stores in the pipeline.

This is what GST_STATE_NULL does, and why you don't typically see the memory leak, but it's there. It just isn't exposed in most pipelines. Our pipeline is continuously running (for rendering and such), and we dynamically add/remove the encoding piece without stopping/starting the pipeline.
Comment 18 Víctor Manuel Jáquez Leal 2018-05-04 09:16:13 UTC
This is a custom workaround for a specific problem. It is not a bug per-se, but a design issue in this use-case.

If you can come with a re-design idea to improve encoders sharing context, please reopen.