After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 751334 - FLAC: memory leak on a specific media file
FLAC: memory leak on a specific media file
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
1.5.1
Other Linux
: Normal major
: 1.5.90
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-06-22 15:54 UTC by Stephane Cerveau
Modified: 2015-08-16 13:39 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
massif report (87.61 KB, text/plain)
2015-06-23 13:26 UTC, Stephane Cerveau
  Details
patch flacdec fails on metadata. (792 bytes, patch)
2015-06-24 09:05 UTC, Stephane Cerveau
rejected Details | Review
remove header parsing in decoder (7.87 KB, patch)
2015-06-25 07:02 UTC, Vineeth
none Details | Review
improve error handling (2.52 KB, patch)
2015-06-26 04:38 UTC, Vineeth
committed Details | Review

Description Stephane Cerveau 2015-06-22 15:54:42 UTC
The specific FLAC media http://bit.ly/1BH1Fkm can not be played and makes a big memory leak(>400MB)

The command line is:

valgrind --tool=massif gst-launch-1.0 playbin uri=file:////tmp/Barclay\ James\ Harvest\ -\ Time\ Honoured\ Ghosts\ -\ 1\ -\ In\ My\ Life.flac


Here is the output when the framework is failing to start the playback.

0:00:00.049510284  5311 0x7f1ed40eded0 ERROR              flacparse gstflacparse.c:891:gst_flac_parse_handle_streaminfo:<flacparse0> Invalid metablock size for STREAMINFO: 4
handleBusMessage: An error has occured with element flacdec0: No valid frames decoded before end of stream.
Comment 1 Vineeth 2015-06-23 08:20:58 UTC
i could not find any memory leak when checking using memcheck option of valgrind.
Can you post more details of the memleak
Comment 2 Stephane Cerveau 2015-06-23 08:32:53 UTC
I dont find any memory leak either with memcheck but i can see that the memory usage is bumping to 400MB with ps or ps_mem.py.

As i described in the bug, it is also point out by massif tool.

Here is a pastebin http://pastebin.com/BRdVJ2Et which show the memory usage with massif tool.
Comment 3 Stephane Cerveau 2015-06-23 09:07:50 UTC
No memory issue with flac tool.
Comment 4 Sebastian Dröge (slomo) 2015-06-23 09:11:05 UTC
So basically all buffers allocated by basesrc are leaked according to massif
Comment 5 Sebastian Dröge (slomo) 2015-06-23 09:12:31 UTC
... which might be baseparse/flacparse forgetting to free invalid frames, or accumulating them all because maybe at a later point it can be considered a valid frame.


So there are probably two bugs here: 1) flacparse not accepting that file, 2) flacparse/baseparse leaking everything because of 1)
Comment 6 Nicolas Dufresne (ndufresne) 2015-06-23 13:07:28 UTC
Massif reports 3MB here. I cannot reproduce the 400MB peak that is being reported. Is there some more detail that should be mention ?
Comment 7 Stephane Cerveau 2015-06-23 13:26:17 UTC
Created attachment 305918 [details]
massif report
Comment 8 Nicolas Dufresne (ndufresne) 2015-06-23 13:33:57 UTC
Ok, sorry, I miss-read the thread, thought the leak was pointed at the parser, while it's in the decoder. I can reproduce now.
Comment 9 Nicolas Dufresne (ndufresne) 2015-06-23 14:23:42 UTC
Ok, did a quick look. FLAC__stream_decoder_process_until_end_of_metadata fails, but we still expect the read/write callback to be called. Then we keep queueing stuff until EOS. I suppose the parsers repeast stuff on each frame, otherwise I cannot see how 52mb can turn into 400Mb. Ignoring that error is most likely what least to this memory waste.
Comment 10 Stephane Cerveau 2015-06-23 14:36:04 UTC
So shall we return false in gst_flac_dec_set_format after GST_WARNING_OBJECT (dec, "process_until_end_of_metadata failed");
According to the doc https://xiph.org/flac/api/group__flac__stream__decoder.html#ga027ffb5b75dc39b3d26f55c5e6b42682
We should stop decoding. 
I had a try and in this case the av_flac is used and the media is correctly decoded without any important leak.
Comment 11 Nicolas Dufresne (ndufresne) 2015-06-23 15:25:40 UTC
To keep things simple, I'd say yes. Flacparse should never return an invalid header though (so flacparse never have to search in the stream for a updated valid header).
Comment 12 Nicolas Dufresne (ndufresne) 2015-06-23 15:27:07 UTC
(In reply to Sebastian Dröge (slomo) from comment #5)
> So there are probably two bugs here: 1) flacparse not accepting that file,
> 2) flacparse/baseparse leaking everything because of 1)

So I'd redo this as:

1) flacparse should set a valid streamheader (or drop)
2) flacdec should fail on invalid stream header
Comment 13 Vineeth 2015-06-24 06:38:35 UTC
(In reply to Nicolas Dufresne (stormer) from comment #12)
> (In reply to Sebastian Dröge (slomo) from comment #5)
> > So there are probably two bugs here: 1) flacparse not accepting that file,
> > 2) flacparse/baseparse leaking everything because of 1)
> 
> So I'd redo this as:
> 
> 1) flacparse should set a valid streamheader (or drop)
> 2) flacdec should fail on invalid stream header

1) flacparse should set a valid streamheader (or drop)

flacparse is changed in
https://bugzilla.gnome.org/show_bug.cgi?id=684701
so that if we already have a valid streaminfo header, we just ignore if there are any other header parsing errors.
This does not work with flacdec, becauase libflac does not handle this.
But libav flacdec is able to process even after ignoring the above errors. So maybe this was added to let libav decode even with header error 

2) flacdec should fail on invalid stream header

for this, the solution suggested by Stephane Cerveau, seems to be proper. because libflac cannot handle this file.
While trying to process till end of metadata for this file , libav changes state to FLAC__STREAM_DECODER_ABORTED. Hence it is not able to proceed further, and all the buffer being pushed to it is not being freed.
Comment 14 Stephane Cerveau 2015-06-24 08:31:19 UTC
Vineeth, you mean there is also a leak in libav with this file?
Comment 15 Vineeth 2015-06-24 08:35:21 UTC
no, with libav there will be no leak.
the leak happens due to libflac not being able to handle the file.

flacparse is able to parse the header and is expecting flacdec(which uses libflac) to decode properly. But libflac has changed to FLAC__STREAM_DECODER_ABORTED state and hence all the buffers being passed to it are not being handled and hence lost.

With your suggested fix, it will not be decoded using libflac, instead it will be using libav. Hence there will be no memory leak.
Comment 16 Stephane Cerveau 2015-06-24 09:01:35 UTC
Ok i agree with your analysis. Do you think we can expect gstflacdec/libflac to decode this file correctly?
So i propose this path in attachment.
Comment 17 Stephane Cerveau 2015-06-24 09:05:43 UTC
Created attachment 305987 [details] [review]
patch flacdec fails on metadata.
Comment 18 Nicolas Dufresne (ndufresne) 2015-06-24 14:23:19 UTC
Before we make a conclusion, can someone identify the cause of failure, and can see if there exist a sequence that could be use to handle it with libflac ? The idea is that while there might be error at start, there is clearly valid data later, hence normal decoder should be able to skip.
Comment 19 Stephane Cerveau 2015-06-24 14:37:24 UTC
flac tool available on ubunutu 12.04 seems to fail to decode this stream too.

flac -d /tmp/BarclayJames__In_My_Life.flac 

flac 1.2.1, Copyright (C) 2000,2001,2002,2003,2004,2005,2006,2007  Josh Coalson
flac comes with ABSOLUTELY NO WARRANTY.  This is free software, and you are
welcome to redistribute it under certain conditions.  Type `flac' for details.



BarclayJames__In_My_Life.flac: ERROR while decoding metadata
                               state = FLAC__STREAM_DECODER_END_OF_STREAM

I tried last libflac ( 1.3.1) but same sentence.
Comment 20 Vineeth 2015-06-25 07:00:34 UTC
Review of attachment 305987 [details] [review]:

hi.
i have found a better solution for this and attaching it here.
This fix is not wrong, but there is no need of set_format function at all.
Comment 21 Vineeth 2015-06-25 07:02:07 UTC
Created attachment 306070 [details] [review]
remove header parsing in decoder

As explained in the patch description, there is no need of set_format function to parse the header again in libflac. It just ignores the same without any effect.
Please check with this.
Comment 22 Sebastian Dröge (slomo) 2015-06-25 09:06:02 UTC
There are cases where you need to pass headers to the decoder, IIRC not in all cases the information from the FLAC frames is enough for decoding.

Also don't we extract metadata (tags) from the headers somewhere?
Comment 23 Vineeth 2015-06-25 10:35:20 UTC
The problem here is libflac is not able to handle this particular file due to wrong header. StreamInfo is there twice in it. And on processing metadata it just says exits by changing the state to FLAC__STREAM_DECODER_ABORTED
Hence even though data is valid, the decoder is not able to handle it.


I saw in the file history that this particular piece of code is added in the below commit which was done almost 4 years back :)
http://cgit.freedesktop.org/gstreamer/gst-plugins-good/commit/ext/flac/gstflacdec.c?id=a49818f876d9112b11def8aae2fdd72930e8b3e3

and the description mentions
flacdec: parse stream headers from caps in set_format function
Not that this seems to be actually needed, libflac happily decodes
stuff even if we just drop all headers and never feed it to the
library.

So libflac has been this way for a long time and does not seem to need header.


and metadata extraction is being taken care by parser i guess.
The code for metadata extraction has been removed in the below commit, again almost 4 years back :)
http://cgit.freedesktop.org/gstreamer/gst-plugins-good/commit/ext/flac/gstflacdec.c?id=ab591b6d536d5ede62d6629dc8ed2a1d58a222ee



One solution i can think of involves, 
not passing to header to decoder only when the header is corrupted. But this might need some value being set in streamheader structure of caps from parser and using that to decide if it needs to be passed to decoder or not.
But not sure if this is a good solution..

Please advice on how to proceed..
Comment 24 Nicolas Dufresne (ndufresne) 2015-06-25 15:21:13 UTC
I have the feeling this bug is going in the wrong direction. First, libflac need to parse the header which are both in the stream and in the caps. If you where to do it from caps and again from the stream, I can understand libflac getting confused. There is benefits of doing from caps, one if that if it fails, you can fail on caps event, and that will let decodebin pick the next flac decoder in queue (if you are lucky to have one).

Now, *if* we want to support this broken file (because we know it's slightly broken), the solution should resides in flacparse. The parser should detect that the stream header has twice the stream info (hence the stream header is like two headers in one). Base on that, the parser could "fix" the header by only keeping the most recent one. Again, this is all about if we care supporting this broken file.
Comment 25 Sebastian Dröge (slomo) 2015-06-25 15:24:31 UTC
Correct, flacparse should fix this up ideally... but in any case situations like this shouldn't cause huge memory leaks.

But those are all things independent of the flacdec patch :)
Comment 26 Vineeth 2015-06-26 04:38:48 UTC
Created attachment 306136 [details] [review]
improve error handling

A less drastic change when compared to previous patch.
In this case, we are still letting libflac to parse the metadata,
and if it fails we just flush the whole thing and start again with the actual data.

PS: I still dont understand the need of metadata_cb when the same can be done anyways in dec_write
Comment 27 Vineeth 2015-06-26 05:01:43 UTC
(In reply to Nicolas Dufresne (stormer) from comment #24)
> I have the feeling this bug is going in the wrong direction. First, libflac
> need to parse the header which are both in the stream and in the caps. If
> you where to do it from caps and again from the stream, I can understand
> libflac getting confused. There is benefits of doing from caps, one if that
> if it fails, you can fail on caps event, and that will let decodebin pick
> the next flac decoder in queue (if you are lucky to have one).
> 

This will work only with decodebin right?
If i still use gst-launch-1.0 ... ! flacparse ! flacdec ! ....
it will still fail..

One thing i dont understand here is, in flacparse, the header is being set to caps, which is being used in set_format, and the same header is again pushed to baseparse, which means it is again being processed by the libflac.
Do we really need to handle the header twice?

> Now, *if* we want to support this broken file (because we know it's slightly
> broken), the solution should resides in flacparse. The parser should detect
> that the stream header has twice the stream info (hence the stream header is
> like two headers in one). Base on that, the parser could "fix" the header by
> only keeping the most recent one. Again, this is all about if we care
> supporting this broken file.

This file does not actually have two valid streaminfo. The first streaminfo is valid. While the second is not valid(it just has type as streaminfo while infact it is not) and it throws errors while parsing.
There is already code in flacparse to check this scenario and just ignore the other parts of the header. This is the one set to streamheader and which libflac is not able to process..
Now i did some simple experiments to determine what does libflac accept as a valid metadata.
In case of a flac header, the end of header is determined by 
is_last = ((map.data[0] & 0x80) == 0x80)
so in the present case, since we are truncating the header and passing it to libflac it fails.
Now when i try not to truncate and append all header till is_last is true, it still fails because, there are two stream data of type streaminfo.

While parsing in flacparse, if i just remove the second part of streaminfo and add everything else till is_last is true, to streamheader, this is being accepted by libflac as a valid metadata. But the size of this stream header is too big. 

But this fails when processing the actual frame :)... i guess this is because the header does not end properly with is_last(0x80) value, and we still keep searching in the actual data until we find something which we think is end of header...


Not exactly sure if i am making any sense :)... too big a message!!!!
But the patch i attached now does not make any drastic change, other than give libflac some more chances to prove itself, being worthy in good plugins :P
Comment 28 Nicolas Dufresne (ndufresne) 2015-06-26 13:15:10 UTC
(In reply to Vineeth from comment #27)
> This will work only with decodebin right?
> If i still use gst-launch-1.0 ... ! flacparse ! flacdec ! ....
> it will still fail..

Obviously.

> 
> One thing i dont understand here is, in flacparse, the header is being set
> to caps, which is being used in set_format, and the same header is again
> pushed to baseparse, which means it is again being processed by the libflac.
> Do we really need to handle the header twice?

Streamheader is "special". Parser will extra stream header as a convenience from a stream of type byte-stream. Though, it won't remove it because otherwise it would make the stream "incompatbile" with what is is supposed to be. Decoder for these format have the option of looking at the stream header, and skipping these from the stream, or ignoring the streamheader and only dealing with the stream. This is just a convenience as decoder API can vary a lot. There is also the benefit that whenever the header changes, caps will change, so renegotiation is made easier (kind of).

> 
> > Now, *if* we want to support this broken file (because we know it's slightly
> > broken), the solution should resides in flacparse. The parser should detect
> > that the stream header has twice the stream info (hence the stream header is
> > like two headers in one). Base on that, the parser could "fix" the header by
> > only keeping the most recent one. Again, this is all about if we care
> > supporting this broken file.
> 
> This file does not actually have two valid streaminfo. The first streaminfo
> is valid. While the second is not valid(it just has type as streaminfo while
> infact it is not) and it throws errors while parsing.

So maybe the workaround for flacparse is to validate the stream info. Then it would figure-out that the second one is incomplete and skip it. The thing with stream errors is that you can't assume that first is always the right one, or second is always the right one.

> There is already code in flacparse to check this scenario and just ignore
> the other parts of the header. This is the one set to streamheader and which
> libflac is not able to process..

I think libflac should try and remove the unusable part (if we want to not say this stream is broken, we don't support it).

> Now i did some simple experiments to determine what does libflac accept as a
> valid metadata.
> In case of a flac header, the end of header is determined by 
> is_last = ((map.data[0] & 0x80) == 0x80)
> so in the present case, since we are truncating the header and passing it to
> libflac it fails.
> Now when i try not to truncate and append all header till is_last is true,
> it still fails because, there are two stream data of type streaminfo.
> 
> While parsing in flacparse, if i just remove the second part of streaminfo
> and add everything else till is_last is true, to streamheader, this is being
> accepted by libflac as a valid metadata. But the size of this stream header
> is too big. 
> 
> But this fails when processing the actual frame :)... i guess this is
> because the header does not end properly with is_last(0x80) value, and we
> still keep searching in the actual data until we find something which we
> think is end of header...
> 
> 
> Not exactly sure if i am making any sense :)... too big a message!!!!
> But the patch i attached now does not make any drastic change, other than
> give libflac some more chances to prove itself, being worthy in good plugins
> :P

Yeah, I think I understand. You endup having to re-encode things, which is falling in the too much wor imho (for a broken stream). Let do first thing first. Ensure the flacdec fails properly and not leak. Initially we said to simply return false, then you endup reverting everything touching the streamheader. I still don't understand, since there must have been good reason to more to using streamheader in the first place.
Comment 29 Vineeth 2015-06-26 23:03:40 UTC
Maybe you missed to check the last patch? 
Now instead of removing all the code related to streamheader, i just flush the decoder in case of failure and let decoder continue with handling frames instead of just returning false and failing..
if this patch is also not correct then probably i should start from scratch and first just return false during failure :)
Comment 30 Nicolas Dufresne (ndufresne) 2015-07-06 13:45:09 UTC
Review of attachment 306136 [details] [review]:

Ah, that one looks good.
Comment 31 Luis de Bethencourt 2015-07-06 13:57:33 UTC
Review of attachment 306136 [details] [review]:

commit 3f6a868f7e7ba732f36078be4f42d9b502b19991
Author: Vineeth TM <vineeth.tm@samsung.com>
Date:   Fri Jun 26 13:24:17 2015 +0900

    flacdec: improve error handling

    for files which have corrupted header, libflac is not able to
    process the metadata properly. We just try to ignore the error
    and continue with the processing, since metadata parsing is not
    making much of a difference to libflac

    https://bugzilla.gnome.org/show_bug.cgi?id=751334
Comment 32 Stephane Cerveau 2015-07-08 14:22:58 UTC
Thx a lot for your support. Its working fine on my side.