Bug 696770 – avdec_h264: Memory leak when processing h.264 field pictures

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 696770 - avdec_h264: Memory leak when processing h.264 field pictures


Summary:	avdec_h264: Memory leak when processing h.264 field pictures


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-libav
Version:	1.0.5
Hardware:	Other Windows

Importance:	Normal normal
Target Milestone:	1.3.1
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2013-03-28 11:52 UTC by Michael Rubinstein
Modified:	2013-12-01 13:41 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Fix the memory leak associated with field picture decoding (2.17 KB, patch) 2013-06-04 12:03 UTC, sreerenj	reviewed	Details \| Review
Fix some comments (1.28 KB, patch) 2013-06-04 13:27 UTC, sreerenj	none	Details \| Review
Edited patches for gstavvidec.c to release frames from field pictures. (3.36 KB, patch) 2013-06-04 17:11 UTC, Michael Rubinstein	none	Details \| Review

Description Michael Rubinstein 2013-03-28 11:52:47 UTC

Our application demuxes, parses and decodes a live h.264 transport stream.
The input consists, mostly, of "field picutres".  These seem to decode properly.

The decoder produces one output frame for each pair of input buffers, as expected.  However, some memory is not free'd.

Three related items are not free'd.

1.  The input buffer created by the parser.
2.  A GstVideoCodecFrame allocated by GstVideoDecoder.  It has a reference to the buffer.
3.  A linked list element used to keep a list of active frames in GstVideoDecoder. The list is in _GstVideoDecoderPrivate.frames.

GstVideoDecoder will release the memory in question if
gst_video_decoder_drop_frame or gst_video_decoder_finish_frame is called.

GstFFMpegVidDec (gst-libav\ext\libav\gstavviddec.c) calls gst_video_decoder_finish_frame for each output frame, but doesn't call anything for the other field picture.

I tried calling gst_video_decoder_finish_frame as well as some other ways of releasing the memory but couldn't find a reliable way to release the memory without releasing memory that the decoder was still using.

As an experiment, I tried releasing old frames when the GstVideoDecoder frame list grew longer than 100 frames.  This fixed the memory leak.  I don't think that this a solution, but it does help validate the above analysis.

Comment 1 Tim-Philipp Müller 2013-03-28 13:10:19 UTC

Could you make a sample stream available by any chance, or point us to one? (Or does it happen with all interlaced H.264 streams?)

Does it happen with fakesink as well?

Comment 2 Michael Rubinstein 2013-03-28 21:24:02 UTC

Here's a link to a file:
https://www.box.com/s/mfiu8vx10bwfno0em9wm

Please note that this file contains a transport stream with two programs, one in h.264, the other is MPEG2.  Please use program 1.

I don't think the problem happens with all interlaced streams.  But I don't know for sure.  We don't have any problem with MPEG2 interlaced streams, but they have interlaced "frame pictures", so two fields per picture, one picture in, one out.  I would expect the same sort of problem if we had MPEG2 interlaced "field pictures".  These would have the "two in, one out", problem.

So, I think the problem is related to "field pictures" rather than interlace or h.264 vs MPEG2.

You want me to put a fakesink in place of the decoder?

Thanks,
Michael

Comment 3 sreerenj 2013-03-28 22:45:52 UTC

I have noticed the same issue with field-encoded mpeg2 streams sometimes before but didn't get time to investigate.

It seems that the following field-encoded sample is causing memory leak. Clearly noticeable with top command.

http://samples.mplayerhq.hu/MPEG-VOB/interlaced/uruseiyatsura.vob

Now it seems that there are some timestamp issue also !

Comment 4 Tim-Philipp Müller 2013-03-28 23:51:57 UTC

Thanks for the samples.

Comment 5 sreerenj 2013-06-04 12:02:23 UTC

For the field encoded pictures (interlaced stream with individual field encoded frames), the get_buffer() method of libav will get invoked only for one field (either top or bottom). Which means the decoder is using the same buffer for rendering both top and bottom field. If I understood correctly we are supposed to call _finish_frame() or drop_frame() for all the frames that reaches in handle_frame().Here we are not calling any of them for alternate fields. 

To fix this mem leak:
1) either we need to drop the alternate fields
2) or set the DECODE_ONLY flag to alternate fields and invoke _finish_frame.

I will attach a quick fix.

Comment 6 sreerenj 2013-06-04 12:03:54 UTC

Created attachment 245998 [details] [review]
Fix the memory leak associated with field picture decoding

Comment 7 sreerenj 2013-06-04 13:27:27 UTC

Created attachment 246005 [details] [review]
Fix some comments

Comment 8 Sebastian Dröge (slomo) 2013-06-04 15:27:07 UTC

Review of attachment 245998 [details] [review]:

::: ext/libav/gstavviddec.c
@@ +1400,3 @@
+     ffmpeg is allocating the output buffer only for complete frames and not
+     for the individual fields. */
+  if (frame && !frame->output_buffer) {

I don't think this is correct. libav is not necessarily allocating an output buffer for the current input frame immediately

Also does this mean that every GstVideoCodecFrame was a field here, not a frame? In that case it would be something that the parser should've fixed, GstVideoCodecFrame is supposed to be one complete frame with everything related to it.

Comment 9 Michael Rubinstein 2013-06-04 17:10:22 UTC

I'm attaching my temporary fix.  It's a diff from 1.0.7.  However, I edited the diff to remove unrelated changes.

The basic approach is to detect when the decoder processes a frame without allocating an output buffer.  I set a flag when the buffer is allocated and check it after the decode call.

Before releasing a frame, I, change the reference count.

I don't think my fix ia really correct, but it does handle the normal case. 
Hopefully, it helps describe the problem.

It would be nice to combine the two field pictures into a single frame at an earlier stage, but the h.264 deccoder would need to handle it as well.

Comment 10 Michael Rubinstein 2013-06-04 17:11:55 UTC

Created attachment 246020 [details] [review]
Edited patches for gstavvidec.c to release frames from field pictures.

Comment 11 Sebastian Dröge (slomo) 2013-06-04 17:33:11 UTC

libav doesn't really care about the framing, so having both fields in a single buffer should be fine.

Comment 12 sreerenj 2013-06-04 19:13:53 UTC

(In reply to comment #8)
> Review of attachment 245998 [details] [review]:
> 
> ::: ext/libav/gstavviddec.c
> @@ +1400,3 @@
> +     ffmpeg is allocating the output buffer only for complete frames and not
> +     for the individual fields. */
> +  if (frame && !frame->output_buffer) {
> 
> I don't think this is correct. libav is not necessarily allocating an output
> buffer for the current input frame immediately

Does the following code path in get_buffer() making sure the latest frame buffer allocation?

picture->reordered_opaque = context->reordered_opaque;
  frame =
      gst_video_decoder_get_frame (GST_VIDEO_DECODER (ffmpegdec),
      picture->reordered_opaque);

> 
> Also does this mean that every GstVideoCodecFrame was a field here, not a
> frame? In that case it would be something that the parser should've fixed,
> GstVideoCodecFrame is supposed to be one complete frame with everything related
> to it.

Yes, for field-encoded stream, every GstVideoCodecFrame is a field. The issue is exactly what I mentioned before.  I don't think it is the duty of parser. Because if I understood correctly some decoders can send fields as frames to renderer directly.  For eg, vaapi has a feature for this. vaPutsurface() api has flags to specify whether the surface has top fields or bottom fields. But AFAIK this is not yet supported in driver. So in current gstreamer-vaapi we are using the same surface to render both top and bottom fields.

I have a fix for this to gstreamer-vaapi also:
https://bugzilla.gnome.org/show_bug.cgi?id=701257

We need similar fixes to mpeg2dec also.

Comment 13 sreerenj 2013-06-05 08:13:23 UTC

(In reply to comment #12)
> (In reply to comment #8)
>
> 
> Yes, for field-encoded stream, every GstVideoCodecFrame is a field. The issue
> is exactly what I mentioned before.  I don't think it is the duty of parser.
> Because if I understood correctly some decoders can send fields as frames to
> renderer directly.  For eg, vaapi has a feature for this. vaPutsurface() api
> has flags to specify whether the surface has top fields or bottom fields. But
> AFAIK this is not yet supported in driver. So in current gstreamer-vaapi we are
> using the same surface to render both top and bottom fields.

Using same surface but rendering each field separately using the flags in vaPutSurface.

> 
> I have a fix for this to gstreamer-vaapi also:
> https://bugzilla.gnome.org/show_bug.cgi?id=701257
> 
> We need similar fixes to mpeg2dec also.

Comment 14 Sebastian Dröge (slomo) 2013-06-05 11:06:55 UTC

(In reply to comment #12)
> (In reply to comment #8)
> > Review of attachment 245998 [details] [review] [details]:
> > 
> > ::: ext/libav/gstavviddec.c
> > @@ +1400,3 @@
> > +     ffmpeg is allocating the output buffer only for complete frames and not
> > +     for the individual fields. */
> > +  if (frame && !frame->output_buffer) {
> > 
> > I don't think this is correct. libav is not necessarily allocating an output
> > buffer for the current input frame immediately
> 
> Does the following code path in get_buffer() making sure the latest frame
> buffer allocation?
> 
> picture->reordered_opaque = context->reordered_opaque;
>   frame =
>       gst_video_decoder_get_frame (GST_VIDEO_DECODER (ffmpegdec),
>       picture->reordered_opaque);

No, it only makes sure that some frames gets a buffer allocation. Might be the latest, might be any other frame.

> > Also does this mean that every GstVideoCodecFrame was a field here, not a
> > frame? In that case it would be something that the parser should've fixed,
> > GstVideoCodecFrame is supposed to be one complete frame with everything related
> > to it.
> 
> Yes, for field-encoded stream, every GstVideoCodecFrame is a field. The issue
> is exactly what I mentioned before.  I don't think it is the duty of parser.
> Because if I understood correctly some decoders can send fields as frames to
> renderer directly.  For eg, vaapi has a feature for this. vaPutsurface() api
> has flags to specify whether the surface has top fields or bottom fields. But
> AFAIK this is not yet supported in driver. So in current gstreamer-vaapi we are
> using the same surface to render both top and bottom fields.
> 
> I have a fix for this to gstreamer-vaapi also:
> https://bugzilla.gnome.org/show_bug.cgi?id=701257
> 
> We need similar fixes to mpeg2dec also.

Then this somehow needs to be handled in the GstVideoDecoder::parse() function to accumulate all fields before claiming something to be a frame. The detection of this could be based on buffer flags, that are put on there by the parser (http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-libs/html/gst-plugins-base-libs-gstvideo.html#GstVideoBufferFlags  ONEFIELD | TFF and ONEFIELD | RFF)

Comment 15 sreerenj 2013-06-05 11:19:01 UTC

But we have to decode the fields individually. Which means considering each of them as frames. Remember, this is only in case of filed-enocoded interlaced content and not for interlaced frames.

Comment 16 Sebastian Dröge (slomo) 2013-06-05 11:27:13 UTC

(In reply to comment #15)
> But we have to decode the fields individually. Which means considering each of
> them as frames. Remember, this is only in case of filed-enocoded interlaced
> content and not for interlaced frames.

Yes, but libav does not care about that. For other decoders that do care see bug #660770

Comment 17 Sebastian Dröge (slomo) 2013-06-05 11:36:07 UTC

(In reply to comment #15)
> But we have to decode the fields individually. Which means considering each of
> them as frames. Remember, this is only in case of filed-enocoded interlaced
> content and not for interlaced frames.

Yes, but libav does not care if you pass two fields at once or just one. For other decoders that do care see bug #660770

Comment 18 sreerenj 2013-06-05 11:52:44 UTC

(In reply to comment #17)
> (In reply to comment #15)
> > But we have to decode the fields individually. Which means considering each of
> > them as frames. Remember, this is only in case of filed-enocoded interlaced
> > content and not for interlaced frames.
> 
> Yes, but libav does not care if you pass two fields at once or just one. For
> other decoders that do care see bug #660770

aha, okay.

So you mean, the temporary fix would be to send only complete frame to handle_frame() from base_decoder. right? But how can we make sure that the frame is complete or not from basedecoder if there is no parser in the pipeline? Only the individual decoders which are doing parsing themselves can identify the frame boundaries. Am i missing something?

Comment 19 Tim-Philipp Müller 2013-06-05 11:55:29 UTC

There should be a parser in the pipeline. We require this for almost all decoders now.

Comment 20 sreerenj 2013-06-05 12:01:29 UTC

(In reply to comment #19)
> There should be a parser in the pipeline. We require this for almost all
> decoders now.

Okay, So the assumption is that the decoder is always receiving data as complete frames(packetized) if there is a parser in the pipeline. right?

Comment 21 Sebastian Dröge (slomo) 2013-06-05 12:15:17 UTC

Yes

Comment 22 sreerenj 2013-06-05 14:26:14 UTC

(In reply to comment #21)
> Yes

Okay ...
so the temporary fix for this mem leak at the moment (with out fixing https://bugzilla.gnome.org/show_bug.cgi?id=660770) is to,

--send only complete frame from videoparser elements.

right?

Comment 23 sreerenj 2013-06-06 07:10:17 UTC

I prefer to raise the severity of this bug (may be this one also: 660770) since it is a huge memory leak and I have many samples like this.

Comment 24 Sebastian Dröge (slomo) 2013-06-06 12:44:24 UTC

Yeah, I think the parsers were supposed to do that anyway. They send everything up to and including a frame... just didn't consider separate fields.

Comment 25 sreerenj 2013-06-06 12:54:30 UTC

(In reply to comment #24)
> Yeah, I think the parsers were supposed to do that anyway. 

Irrespective of the solution of https://bugzilla.gnome.org/show_bug.cgi?id=660770 ??

They send everything
> up to and including a frame... just didn't consider separate fields.

Comment 26 Sebastian Dröge (slomo) 2013-06-06 13:00:35 UTC

There should be some kind of new "stream-format" for that then IMHO.


The only alternative would be that we *require* all decoders that handle a codec with separate-field encoding to combine the fields in GstVideoDecoder::parse() unless they handle partial frames somehow and #660770 is fixed.

Comment 27 Tim-Philipp Müller 2013-06-06 13:05:18 UTC

For me #660770 is more about slices than fields (I don't think raw video slices is something we need to support anytime soon).

Comment 28 Sebastian Dröge (slomo) 2013-06-06 13:22:37 UTC

How would you handle fields differently?

Comment 29 sreerenj 2013-06-06 13:27:02 UTC

(In reply to comment #26)
> There should be some kind of new "stream-format" for that then IMHO.
> 
> 
> The only alternative would be that we *require* all decoders that handle a
> codec with separate-field encoding to combine the fields in
> GstVideoDecoder::parse() unless they handle partial frames somehow and #660770
> is fixed.



The alternative solution might not be good for decoders like gst-libav if they wants to handle the fields separately since it is not overriding the GstVideoDecoder::parse() routine at the moment. And we cannot add it to basedecoder since the decision of handling them as separate fields or not is taken by individual decoders.

So the solution would be to combine them from parser. But somehow i don't feel it as a good solution. :)

Comment 30 Sebastian Dröge (slomo) 2013-06-06 14:10:57 UTC

gst-libav could just override the ::parse vfunc and implement that based on the buffer flags though.

I agree that doing it in the parser does not feel right, however currently everything is expecting to get a complete *frame* out of the parser :(

Comment 31 sreerenj 2013-06-10 13:30:19 UTC

The second patch "Fix some comments" is independent of the issue here. :)

Comment 32 Matej Knopp 2013-07-15 07:13:33 UTC

I don't think libav is very happy if you feed it separate fields in frame based multi threaded mode.

Comment 33 Edward Hervey 2013-07-23 12:14:38 UTC

(the various comments have gone completely astray and are confusing to say the least)

Is this whole bug about people using the libav decoders without a parser ? If so this bug will be closed.

Comment 34 Sebastian Dröge (slomo) 2013-07-23 12:22:32 UTC

No, it's also about h264parse and others providing one field per buffer instead of one frame per buffer. But that's a h264parse bug

Comment 35 Michael Rubinstein 2013-07-23 13:20:48 UTC

I'm using a parser and have the problem.
It's mismatch between the Parser and the Decoder.

IMHO gstavviddec's buffer lifetime handling could be improved.
It also leaks when skipping frames.

Comment 36 Nicolas Dufresne (ndufresne) 2013-11-22 16:45:23 UTC

What happen if the periodic pps/sps code insert stuff in the middle of two fields ? Does the spec allow this ?

Comment 37 Mark Nauwelaerts 2013-11-22 17:56:40 UTC

This seems to belong more to bug #704214, but from vague (quite possibly incomplete) recollection, when it comes to sps/pps, spec is mainly concerned about:
* these starting a new AU, but that is possibly already so for the other field coded picture
* these having to be present when referred to, but as the inserted ones are repeats (with the same data) of previous ones, not necessarily a problem

Comment 38 Matej Knopp 2013-11-22 21:24:09 UTC

Not sure what spec says about it, but I've seen streams with PPS in the second access unit.

Comment 39 Mark Nauwelaerts 2013-12-01 11:18:51 UTC

Testing shows frame list is being cleaned up now due to fixes for bug #693772, so I would say this is fixed as well.

h264(parse) framing might be improved, but being tracked in bug #704214.