Bug 783086 – vpxenc: memory usage grows when dropframe-threshold is enabled

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 783086 - vpxenc: memory usage grows when dropframe-threshold is enabled


Summary:	vpxenc: memory usage grows when dropframe-threshold is enabled


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-good
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	1.12.3
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2017-05-25 10:57 UTC by George Kiagiadakis
Modified:	2017-08-17 10:42 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
vpxenc: discard frames that have been dropped by libvpx (3.90 KB, patch) 2017-05-25 10:57 UTC, George Kiagiadakis	none	Details \| Review
vpxenc: discard frames that have been dropped by libvpx (2.34 KB, patch) 2017-07-27 14:25 UTC, George Kiagiadakis	committed	Details \| Review

Description George Kiagiadakis 2017-05-25 10:57:32 UTC

Created attachment 352566 [details] [review]
vpxenc: discard frames that have been dropped by libvpx

When dropframe-threshold has been set, libvpx may output less frames than the input ones, which causes some GstVideoCodecFrames to queue up in GstVideoEncoder's internal frame queue with no chance of ever being all released. And because the frames keep references to the input buffers, the input buffer pool keeps allocating new buffers and memory usage grows very fast. For example the following pipeline's memory usage grows at a rate of about 1GB per minute!

videotestsrc ! capsfilter caps=video/x-raw,width=1920,height=1080,framerate=30/1,format=I420 ! vp8enc target-bitrate=1000000 end-usage=cbr dropframe-threshold=95 ! fakesink

The attached patch attempts to solve the issue.
I'm not very fond of it, though. The patch assumes that an input frame will either result immediately in exactly 1 output libvpx packet or no packet at all, but libvpx gives no such guarantees and it's quite unclear what happens when gst_vpx_enc_process() is being called from gst_vpx_enc_drain(). Can there be more packets at the end?

It works, though... and solves the issue.

Comment 1 Sebastian Dröge (slomo) 2017-05-25 11:16:27 UTC

Don't we get any indication from libvpx when a frame is being dropped?

Comment 2 Tim-Philipp Müller 2017-05-25 11:23:27 UTC

Here's another question: aiui a VPx encoder may actually output more frames than there were input frames (e.g. golden frames) - do we handle that case?

Comment 3 George Kiagiadakis 2017-05-25 12:10:38 UTC

(In reply to Sebastian Dröge (slomo) from comment #1)
> Don't we get any indication from libvpx when a frame is being dropped?

No. Or at least, I couldn't find it.

(In reply to Tim-Philipp Müller from comment #2)
> Here's another question: aiui a VPx encoder may actually output more frames
> than there were input frames (e.g. golden frames) - do we handle that case?

No. There must be an equal number of frames on the input and output at the moment.

Comment 4 Sebastian Dröge (slomo) 2017-05-25 12:18:38 UTC

(In reply to George Kiagiadakis from comment #3)

> (In reply to Tim-Philipp Müller from comment #2)
> > Here's another question: aiui a VPx encoder may actually output more frames
> > than there were input frames (e.g. golden frames) - do we handle that case?
> 
> No. There must be an equal number of frames on the input and output at the
> moment.

That would be a regression then, maybe from moving to the vpxenc base class. I implemented that a long time ago

Comment 5 George Kiagiadakis 2017-05-26 14:42:16 UTC

(In reply to Sebastian Dröge (slomo) from comment #4)
> (In reply to George Kiagiadakis from comment #3)
> > No. There must be an equal number of frames on the input and output at the
> > moment.
> 
> That would be a regression then, maybe from moving to the vpxenc base class.
> I implemented that a long time ago

Maybe I misunderstand something, but I fail to see how this would have worked in the past. GstVideoEncoder allocates a GstVideoCodecFrame internally in its chain() function. It then passes it down to the handle_frame() vfunc, while at the same time it stores it in an internal queue (priv->frames). GstVpxEnc afterwards in handle_frame() uses that GstVideoCodecFrame to call the encode function and then unrefs it (but without removing it from the queue). Later, in the gst_vpx_enc_processs() function, GstVideoCodecFrames are retrieved from the queue, filled with an output buffer and sent to the finish_frame() function, which properly releases the GstVideoCodecFrame and removes it from the queue as well.

Now, aiui, this is a 1-to-1 match. There is no way for GstVpxEnc (or any other encoder) to create a new GstVideoCodecFrame, since gst_video_encoder_new_frame() is not public API, therefore if you get more frames on the output than how many you have on the input, you are going to run out of GstVideoCodecFrames.

I don't fully understand why GstVideoEncoder is designed this way, but it looks like it would have been simpler to separate the input from the output frame objects and just release the input as soon as it is not needed, while at the same time create new objects on demand at the output. This would have avoided this bug, and possibly others.

Comment 6 Tim-Philipp Müller 2017-05-26 14:55:26 UTC

Theoretically the subclass could ref the GstVideoCodecFrame when it sees a golden frame, and call _finish() once with the first output buffer, and then (knowing that it has been pushed) set the output buffer in the codec frame to the next frame and call _finish() again. If that works in practice I don't know :)

Comment 7 Sebastian Dröge (slomo) 2017-05-26 15:04:09 UTC

You can just directly push a buffer downstream if there's no corresponding GstVideoCodecFrame.

Comment 8 Nicolas Dufresne (ndufresne) 2017-06-02 20:18:12 UTC

(In reply to Sebastian Dröge (slomo) from comment #7)
> You can just directly push a buffer downstream if there's no corresponding
> GstVideoCodecFrame.

Golden frames are still supported. They are passed to the subclass using VpxEnc::handle_invisible_frame_buffer and push using pad_push using VideoEnc::pre_push.

Comment 9 Nicolas Dufresne (ndufresne) 2017-06-05 16:13:41 UTC

Review of attachment 352566 [details] [review]:

::: ext/vpx/gstvpxenc.c
@@ +1697,3 @@
+        frame = gst_video_encoder_get_oldest_frame (video_encoder);
+      } while (expected_frame_number != 0 &&
+          frame->system_frame_number != expected_frame_number);

Interestingly, I have strong doubt that STATS packets replaces an encoded frame. I'm starting to think that this code is totally wrong, that this block should simply be removed.

@@ +1724,3 @@
+      frame = gst_video_encoder_get_oldest_frame (video_encoder);
+    } while (expected_frame_number != 0 &&
+        frame->system_frame_number != expected_frame_number);

I have my doubts about the correctness of using the frame number here. What VPX seems to carry, is temporal information about the frames. It's unfortunate we give the user control over the resolution, but yet, I think it still usable for sorting out which frames are to be removed.

If you look attentively, the pkt contains a timestamp which is carried from the input.

Comment 10 Nicolas Dufresne (ndufresne) 2017-06-05 16:35:30 UTC

(In reply to Nicolas Dufresne (stormer) from comment #9)
> Review of attachment 352566 [details] [review] [review]:
> 
> ::: ext/vpx/gstvpxenc.c
> @@ +1697,3 @@
> +        frame = gst_video_encoder_get_oldest_frame (video_encoder);
> +      } while (expected_frame_number != 0 &&
> +          frame->system_frame_number != expected_frame_number);
> 
> Interestingly, I have strong doubt that STATS packets replaces an encoded
> frame. I'm starting to think that this code is totally wrong, that this
> block should simply be removed.

Ignore this one, in first pass mode, only stats packets are produced. The fake buffer is just to make the flow working and to prevent leaks that same way you do now.

Comment 11 Nicolas Dufresne (ndufresne) 2017-06-26 15:09:14 UTC

Ping, any feedback about using the frame number vs using timestamp ?

Comment 12 George Kiagiadakis 2017-07-24 12:33:08 UTC

(In reply to Nicolas Dufresne (stormer) from comment #11)
> Ping, any feedback about using the frame number vs using timestamp ?

I guess it will work, assuming the encoder doesn't modify timestamps...
I wonder what the timestamp of the STATS packet will be, though. Note that the stats packet uses the 'twopass_stats' member of the packet union instead of the 'frame' one, which has the timestamp. It's all still a bit fuzzy.

Comment 13 George Kiagiadakis 2017-07-27 14:25:49 UTC

Created attachment 356478 [details] [review]
vpxenc: discard frames that have been dropped by libvpx

I've tried using the pts as an experiment. It seems to work perfectly and solves the issue as well. I didn't do this for the STATS packets, as I'm not sure how they should be handled, but I'm thinking that maybe in their case we don't have to do anything. The STATS packet will be matched to an old GstVideoCodecFrame and then we will catch up when a normal packet arrives.

Comment 14 Nicolas Dufresne (ndufresne) 2017-08-10 15:03:43 UTC

Review of attachment 356478 [details] [review]:

Looks good to me, it's also simplier now.

Comment 15 George Kiagiadakis 2017-08-11 11:33:53 UTC

commit 36fc2a747ade50db69221a502d498efa890bf054 (HEAD -> master, origin/master, origin/HEAD)
Author: George Kiagiadakis <george.kiagiadakis@collabora.com>
Date:   Thu Jul 27 17:21:48 2017 +0300

    vpxenc: discard frames that have been dropped by libvpx
    
    This fixes a memory leak. When dropframe-threshold has been set,
    libvpx may output less frames than the input ones, which causes
    some GstVideoCodecFrames to queue up in GstVideoEncoder's internal
    frame queue with no chance of ever being all released. And because
    the frames keep references to the input buffers, the input buffer
    pool keeps allocating new buffers and memory usage grows very fast.
    For example the following pipeline's memory usage grows at a rate
    of about 1GB per minute!
    
    videotestsrc ! capsfilter caps=video/x-raw,width=1920,height=1080,framerate=30/1,format=I420 ! \
      vp8enc target-bitrate=1000000 end-usage=cbr dropframe-threshold=95 ! fakesink
    
    https://bugzilla.gnome.org/show_bug.cgi?id=783086

Comment 16 Sebastian Dröge (slomo) 2017-08-11 12:04:38 UTC

Also for 1.12?

Comment 17 Nicolas Dufresne (ndufresne) 2017-08-11 13:23:44 UTC

(In reply to Sebastian Dröge (slomo) from comment #16)
> Also for 1.12?

Make sense.