GNOME Bugzilla – Bug 751334
FLAC: memory leak on a specific media file
Last modified: 2015-08-16 13:39:53 UTC
The specific FLAC media http://bit.ly/1BH1Fkm can not be played and makes a big memory leak(>400MB) The command line is: valgrind --tool=massif gst-launch-1.0 playbin uri=file:////tmp/Barclay\ James\ Harvest\ -\ Time\ Honoured\ Ghosts\ -\ 1\ -\ In\ My\ Life.flac Here is the output when the framework is failing to start the playback. 0:00:00.049510284 5311 0x7f1ed40eded0 ERROR flacparse gstflacparse.c:891:gst_flac_parse_handle_streaminfo:<flacparse0> Invalid metablock size for STREAMINFO: 4 handleBusMessage: An error has occured with element flacdec0: No valid frames decoded before end of stream.
i could not find any memory leak when checking using memcheck option of valgrind. Can you post more details of the memleak
I dont find any memory leak either with memcheck but i can see that the memory usage is bumping to 400MB with ps or ps_mem.py. As i described in the bug, it is also point out by massif tool. Here is a pastebin http://pastebin.com/BRdVJ2Et which show the memory usage with massif tool.
No memory issue with flac tool.
So basically all buffers allocated by basesrc are leaked according to massif
... which might be baseparse/flacparse forgetting to free invalid frames, or accumulating them all because maybe at a later point it can be considered a valid frame. So there are probably two bugs here: 1) flacparse not accepting that file, 2) flacparse/baseparse leaking everything because of 1)
Massif reports 3MB here. I cannot reproduce the 400MB peak that is being reported. Is there some more detail that should be mention ?
Created attachment 305918 [details] massif report
Ok, sorry, I miss-read the thread, thought the leak was pointed at the parser, while it's in the decoder. I can reproduce now.
Ok, did a quick look. FLAC__stream_decoder_process_until_end_of_metadata fails, but we still expect the read/write callback to be called. Then we keep queueing stuff until EOS. I suppose the parsers repeast stuff on each frame, otherwise I cannot see how 52mb can turn into 400Mb. Ignoring that error is most likely what least to this memory waste.
So shall we return false in gst_flac_dec_set_format after GST_WARNING_OBJECT (dec, "process_until_end_of_metadata failed"); According to the doc https://xiph.org/flac/api/group__flac__stream__decoder.html#ga027ffb5b75dc39b3d26f55c5e6b42682 We should stop decoding. I had a try and in this case the av_flac is used and the media is correctly decoded without any important leak.
To keep things simple, I'd say yes. Flacparse should never return an invalid header though (so flacparse never have to search in the stream for a updated valid header).
(In reply to Sebastian Dröge (slomo) from comment #5) > So there are probably two bugs here: 1) flacparse not accepting that file, > 2) flacparse/baseparse leaking everything because of 1) So I'd redo this as: 1) flacparse should set a valid streamheader (or drop) 2) flacdec should fail on invalid stream header
(In reply to Nicolas Dufresne (stormer) from comment #12) > (In reply to Sebastian Dröge (slomo) from comment #5) > > So there are probably two bugs here: 1) flacparse not accepting that file, > > 2) flacparse/baseparse leaking everything because of 1) > > So I'd redo this as: > > 1) flacparse should set a valid streamheader (or drop) > 2) flacdec should fail on invalid stream header 1) flacparse should set a valid streamheader (or drop) flacparse is changed in https://bugzilla.gnome.org/show_bug.cgi?id=684701 so that if we already have a valid streaminfo header, we just ignore if there are any other header parsing errors. This does not work with flacdec, becauase libflac does not handle this. But libav flacdec is able to process even after ignoring the above errors. So maybe this was added to let libav decode even with header error 2) flacdec should fail on invalid stream header for this, the solution suggested by Stephane Cerveau, seems to be proper. because libflac cannot handle this file. While trying to process till end of metadata for this file , libav changes state to FLAC__STREAM_DECODER_ABORTED. Hence it is not able to proceed further, and all the buffer being pushed to it is not being freed.
Vineeth, you mean there is also a leak in libav with this file?
no, with libav there will be no leak. the leak happens due to libflac not being able to handle the file. flacparse is able to parse the header and is expecting flacdec(which uses libflac) to decode properly. But libflac has changed to FLAC__STREAM_DECODER_ABORTED state and hence all the buffers being passed to it are not being handled and hence lost. With your suggested fix, it will not be decoded using libflac, instead it will be using libav. Hence there will be no memory leak.
Ok i agree with your analysis. Do you think we can expect gstflacdec/libflac to decode this file correctly? So i propose this path in attachment.
Created attachment 305987 [details] [review] patch flacdec fails on metadata.
Before we make a conclusion, can someone identify the cause of failure, and can see if there exist a sequence that could be use to handle it with libflac ? The idea is that while there might be error at start, there is clearly valid data later, hence normal decoder should be able to skip.
flac tool available on ubunutu 12.04 seems to fail to decode this stream too. flac -d /tmp/BarclayJames__In_My_Life.flac flac 1.2.1, Copyright (C) 2000,2001,2002,2003,2004,2005,2006,2007 Josh Coalson flac comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions. Type `flac' for details. BarclayJames__In_My_Life.flac: ERROR while decoding metadata state = FLAC__STREAM_DECODER_END_OF_STREAM I tried last libflac ( 1.3.1) but same sentence.
Review of attachment 305987 [details] [review]: hi. i have found a better solution for this and attaching it here. This fix is not wrong, but there is no need of set_format function at all.
Created attachment 306070 [details] [review] remove header parsing in decoder As explained in the patch description, there is no need of set_format function to parse the header again in libflac. It just ignores the same without any effect. Please check with this.
There are cases where you need to pass headers to the decoder, IIRC not in all cases the information from the FLAC frames is enough for decoding. Also don't we extract metadata (tags) from the headers somewhere?
The problem here is libflac is not able to handle this particular file due to wrong header. StreamInfo is there twice in it. And on processing metadata it just says exits by changing the state to FLAC__STREAM_DECODER_ABORTED Hence even though data is valid, the decoder is not able to handle it. I saw in the file history that this particular piece of code is added in the below commit which was done almost 4 years back :) http://cgit.freedesktop.org/gstreamer/gst-plugins-good/commit/ext/flac/gstflacdec.c?id=a49818f876d9112b11def8aae2fdd72930e8b3e3 and the description mentions flacdec: parse stream headers from caps in set_format function Not that this seems to be actually needed, libflac happily decodes stuff even if we just drop all headers and never feed it to the library. So libflac has been this way for a long time and does not seem to need header. and metadata extraction is being taken care by parser i guess. The code for metadata extraction has been removed in the below commit, again almost 4 years back :) http://cgit.freedesktop.org/gstreamer/gst-plugins-good/commit/ext/flac/gstflacdec.c?id=ab591b6d536d5ede62d6629dc8ed2a1d58a222ee One solution i can think of involves, not passing to header to decoder only when the header is corrupted. But this might need some value being set in streamheader structure of caps from parser and using that to decide if it needs to be passed to decoder or not. But not sure if this is a good solution.. Please advice on how to proceed..
I have the feeling this bug is going in the wrong direction. First, libflac need to parse the header which are both in the stream and in the caps. If you where to do it from caps and again from the stream, I can understand libflac getting confused. There is benefits of doing from caps, one if that if it fails, you can fail on caps event, and that will let decodebin pick the next flac decoder in queue (if you are lucky to have one). Now, *if* we want to support this broken file (because we know it's slightly broken), the solution should resides in flacparse. The parser should detect that the stream header has twice the stream info (hence the stream header is like two headers in one). Base on that, the parser could "fix" the header by only keeping the most recent one. Again, this is all about if we care supporting this broken file.
Correct, flacparse should fix this up ideally... but in any case situations like this shouldn't cause huge memory leaks. But those are all things independent of the flacdec patch :)
Created attachment 306136 [details] [review] improve error handling A less drastic change when compared to previous patch. In this case, we are still letting libflac to parse the metadata, and if it fails we just flush the whole thing and start again with the actual data. PS: I still dont understand the need of metadata_cb when the same can be done anyways in dec_write
(In reply to Nicolas Dufresne (stormer) from comment #24) > I have the feeling this bug is going in the wrong direction. First, libflac > need to parse the header which are both in the stream and in the caps. If > you where to do it from caps and again from the stream, I can understand > libflac getting confused. There is benefits of doing from caps, one if that > if it fails, you can fail on caps event, and that will let decodebin pick > the next flac decoder in queue (if you are lucky to have one). > This will work only with decodebin right? If i still use gst-launch-1.0 ... ! flacparse ! flacdec ! .... it will still fail.. One thing i dont understand here is, in flacparse, the header is being set to caps, which is being used in set_format, and the same header is again pushed to baseparse, which means it is again being processed by the libflac. Do we really need to handle the header twice? > Now, *if* we want to support this broken file (because we know it's slightly > broken), the solution should resides in flacparse. The parser should detect > that the stream header has twice the stream info (hence the stream header is > like two headers in one). Base on that, the parser could "fix" the header by > only keeping the most recent one. Again, this is all about if we care > supporting this broken file. This file does not actually have two valid streaminfo. The first streaminfo is valid. While the second is not valid(it just has type as streaminfo while infact it is not) and it throws errors while parsing. There is already code in flacparse to check this scenario and just ignore the other parts of the header. This is the one set to streamheader and which libflac is not able to process.. Now i did some simple experiments to determine what does libflac accept as a valid metadata. In case of a flac header, the end of header is determined by is_last = ((map.data[0] & 0x80) == 0x80) so in the present case, since we are truncating the header and passing it to libflac it fails. Now when i try not to truncate and append all header till is_last is true, it still fails because, there are two stream data of type streaminfo. While parsing in flacparse, if i just remove the second part of streaminfo and add everything else till is_last is true, to streamheader, this is being accepted by libflac as a valid metadata. But the size of this stream header is too big. But this fails when processing the actual frame :)... i guess this is because the header does not end properly with is_last(0x80) value, and we still keep searching in the actual data until we find something which we think is end of header... Not exactly sure if i am making any sense :)... too big a message!!!! But the patch i attached now does not make any drastic change, other than give libflac some more chances to prove itself, being worthy in good plugins :P
(In reply to Vineeth from comment #27) > This will work only with decodebin right? > If i still use gst-launch-1.0 ... ! flacparse ! flacdec ! .... > it will still fail.. Obviously. > > One thing i dont understand here is, in flacparse, the header is being set > to caps, which is being used in set_format, and the same header is again > pushed to baseparse, which means it is again being processed by the libflac. > Do we really need to handle the header twice? Streamheader is "special". Parser will extra stream header as a convenience from a stream of type byte-stream. Though, it won't remove it because otherwise it would make the stream "incompatbile" with what is is supposed to be. Decoder for these format have the option of looking at the stream header, and skipping these from the stream, or ignoring the streamheader and only dealing with the stream. This is just a convenience as decoder API can vary a lot. There is also the benefit that whenever the header changes, caps will change, so renegotiation is made easier (kind of). > > > Now, *if* we want to support this broken file (because we know it's slightly > > broken), the solution should resides in flacparse. The parser should detect > > that the stream header has twice the stream info (hence the stream header is > > like two headers in one). Base on that, the parser could "fix" the header by > > only keeping the most recent one. Again, this is all about if we care > > supporting this broken file. > > This file does not actually have two valid streaminfo. The first streaminfo > is valid. While the second is not valid(it just has type as streaminfo while > infact it is not) and it throws errors while parsing. So maybe the workaround for flacparse is to validate the stream info. Then it would figure-out that the second one is incomplete and skip it. The thing with stream errors is that you can't assume that first is always the right one, or second is always the right one. > There is already code in flacparse to check this scenario and just ignore > the other parts of the header. This is the one set to streamheader and which > libflac is not able to process.. I think libflac should try and remove the unusable part (if we want to not say this stream is broken, we don't support it). > Now i did some simple experiments to determine what does libflac accept as a > valid metadata. > In case of a flac header, the end of header is determined by > is_last = ((map.data[0] & 0x80) == 0x80) > so in the present case, since we are truncating the header and passing it to > libflac it fails. > Now when i try not to truncate and append all header till is_last is true, > it still fails because, there are two stream data of type streaminfo. > > While parsing in flacparse, if i just remove the second part of streaminfo > and add everything else till is_last is true, to streamheader, this is being > accepted by libflac as a valid metadata. But the size of this stream header > is too big. > > But this fails when processing the actual frame :)... i guess this is > because the header does not end properly with is_last(0x80) value, and we > still keep searching in the actual data until we find something which we > think is end of header... > > > Not exactly sure if i am making any sense :)... too big a message!!!! > But the patch i attached now does not make any drastic change, other than > give libflac some more chances to prove itself, being worthy in good plugins > :P Yeah, I think I understand. You endup having to re-encode things, which is falling in the too much wor imho (for a broken stream). Let do first thing first. Ensure the flacdec fails properly and not leak. Initially we said to simply return false, then you endup reverting everything touching the streamheader. I still don't understand, since there must have been good reason to more to using streamheader in the first place.
Maybe you missed to check the last patch? Now instead of removing all the code related to streamheader, i just flush the decoder in case of failure and let decoder continue with handling frames instead of just returning false and failing.. if this patch is also not correct then probably i should start from scratch and first just return false during failure :)
Review of attachment 306136 [details] [review]: Ah, that one looks good.
Review of attachment 306136 [details] [review]: commit 3f6a868f7e7ba732f36078be4f42d9b502b19991 Author: Vineeth TM <vineeth.tm@samsung.com> Date: Fri Jun 26 13:24:17 2015 +0900 flacdec: improve error handling for files which have corrupted header, libflac is not able to process the metadata properly. We just try to ignore the error and continue with the processing, since metadata parsing is not making much of a difference to libflac https://bugzilla.gnome.org/show_bug.cgi?id=751334
Thx a lot for your support. Its working fine on my side.