GNOME Bugzilla – Bug 774566
matroskaparse: error out on last buffer
Last modified: 2016-11-21 11:48:20 UTC
Please try this file: http://195.250.34.59/temp/broken.mkv this pipeline works fine: filesrc ! matroskademux ! fakesink but this one fails: filesrc ! matroskaparse ! fakesink this is the relevant error: ebmlread ebml-read.c:141:gst_ebml_peek_id_length:<matroskaparse0> Invalid EBML ID size tag (0x0) at position 7892158 (0x786cbe) this seems to happen on the last data in the file, so both matroskademux and matroskaparse reads until timestamp is 50:52:02.540000000 but matroskademux exits with eos while matroskaparse exits with error, any suggestions to modify matroskaparse so that it exits without errors?
the difference between parser and muxer is this: https://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/matroska/matroska-demux.c#n4832 applying the same code as the muxer to the parser here: https://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/matroska/matroska-parse.c#n3014 seems to solve the issue
Created attachment 340086 [details] [review] fix parsing error at the end of the file please review
commit 28995f3527c286d7414074e26180ef4445e1419a Author: Sebastian Dröge <sebastian@centricular.com> Date: Thu Nov 17 10:24:28 2016 +0200 matroskaparse: Add remaining relevant parts from a3a55305 to the parser https://bugzilla.gnome.org/show_bug.cgi?id=774566 commit 7627171566aa6388166a8f02c410a33462b158a4 Author: Nicola Murino <nicola.murino@gmail.com> Date: Wed Nov 16 22:39:01 2016 +0100 matroskaparse: ignore parsing errors at the end of the file This is the same change as a3a55305 for the parser. https://bugzilla.gnome.org/show_bug.cgi?id=774566
While this part is correct, there is a remaining problem. It does not work yet on your file (also not with matroskademux in push mode, just add a queue after filesrc for example). The problem here is that after the last valid EBML item, there is an invalid one. For pull mode we can handle that by skipping to the next cluster that would follow, and try to recover from there (in this case actually EOS because the next cluster is not there). For push mode we can't do that currently. Some logic for that has to be added to the demuxer and parser for that.
What your patch without my additions did btw, was just to ignore *all* errors and convert them to EOS, if the segment had "infinite" (aka 0) size.
Created attachment 340128 [details] [review] proof of concept patch thanks for the review and explaination, in my use case the worst thing that can happen is that a recording process that use matroskamux is stopped while writing, for example for a server reboot and/or unclean shutdown, so the invalid data should be always at the end of the file. This is a proof of concept patch to handle invalid data in the middle of the file, with this patch the pipeline filesrc ! queue ! matroskademux works as expected with my test file, I think this patch should be improved, do you have suggestions and/or do you want to provide a proper solution?
I don't have time to work on a proper solution here, no. But to ignore errors at the EOS (not just in the middle), more work is also needed. Your pipeline still errors out with the changes that are merged.
That is, just having a fix that makes them be ignored at the end would be a good start already.
In my proof of concept patch after a parse error I just search for next cluster in the data available on the adapter and if found (does not happen with my test file) I update the demuxer offset and flush the adapter until the next cluster pos, is this an acceptable solution? I can apply similar code for the parser too. In my use case I have only streamable files
It goes into the right direction, yes. You would have to search not only in the adapter but also in the data that follows in the next buffers... until a specific threshold. I think there's code for "resyncing" on a cluster already.
(In reply to Sebastian Dröge (slomo) from comment #10) > It goes into the right direction, yes. You would have to search not only in > the adapter but also in the data that follows in the next buffers... yes this is already done until a > specific threshold. an upper threshold is missing in my patch, I can take a look on this in my free time
I meant the cluster scanning that is already there, see the GST_MATROSKA_READ_STATE_SCANNING case (same change is needed in the demuxer btw). It should be possible to fall back to that case here on parsing errors without having to implement any new scanning.
Created attachment 340161 [details] [review] tentative patch please take a look at this attempt, probably there is still something to improve, please note that gst_matroska_parse_parse_id and gst_matroska_demux_parse_id does not works in the same way: https://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/matroska/matroska-demux.c#n4324 read is defined as guint64, but while skipping is passed to gst_matroska_demux_flush https://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/matroska/matroska-demux.c#n4563 but gst_matroska_demux_flush want a guint, in the error case the value overflows
once the patch for the parser is ok I can send a similar one for the demuxer too
oops ... just noted that start_resync_offset is never resetted to -1, I'll fix this if the patch is acceptable, please advise
Review of attachment 340161 [details] [review]: Goes in the right direction I think ::: gst/matroska/matroska-parse.c @@ +3021,2 @@ ret = GST_FLOW_EOS; + return ret; Don't you always want to return EOS here in that case? @@ +3031,3 @@ + gint64 bytes_scanned; + gint read_state; + const guint max_threshold = 64 * 1024; // same value used in gst_matroska_demux_search_cluster No C99 comments @@ +3034,3 @@ + length = 0; + if (parse->common.start_resync_offset == -1) { + parse->common.start_resync_offset = parse->common.offset; Needs the reset, yes @@ +3046,3 @@ + parse->common.state = GST_MATROSKA_READ_STATE_SCANNING; + ret = gst_matroska_parse_parse_id (parse, id, length, needed); + parse->common.state = read_state; I don't understand this :) Why don't you just set state=SCANNING, return from everywhere here and let the outer code that calls gst_matroska_parse_parse_id() take care of calling all these things again? Also why the custom scanning at all here?
Created attachment 340257 [details] [review] tentative patch v2 please check this one
and please also note that GST_MATROSKA_READ_STATE_SCANNING was never setted before this patch so the change in case GST_MATROSKA_READ_STATE_SCANNING should be safe
gboolean is_resyncing = (parse->common.state == GST_MATROSKA_READ_STATE_SCANNING && parse->common.start_resync_offset > 0); must be changed to gboolean is_resyncing = (parse->common.state == GST_MATROSKA_READ_STATE_SCANNING && parse->common.start_resync_offset != -1); please let me known if other modifications are required, works fine with my test file, for now is untested an mkv file with invalid data in the middle (between two clusters)
It also was not set for the demuxer it seems. Checking the middle-case should be relatively easy, just get a valid mkv file and destroy some EBML elements (like a cluster maybe?) in the middle and see if it recovers from that.
Created attachment 340295 [details] [review] updated patch new patch, works fine with the original file and tested with a file with broken data in the middle generated this way: dd if=/tmp/broken.mkv of=/tmp/1.mkv bs=208748 count=1 dd if=/dev/zero of=/tmp/random bs=19 count=1 dd if=/tmp/broken.mkv of=/tmp/2.mkv skip=208748 bs=1 cat /tmp/random >> /tmp/1.mkv cat /tmp/2.mkv >> /tmp/1.mkv the original file has a cluster at offset 208749 matroskaparse matroska-parse.c:3059:gst_matroska_parse_chain:<matroskaparse0> Offset 208749, Element id 0x1f43b675, size 18446744073709551615, needed 12, available 147 if I change bs here: dd if=/dev/zero of=/tmp/random bs=19 count=1 to a value > 65536 we exceed the threshold and we get an unrecoverable error, please review
Comment on attachment 340295 [details] [review] updated patch Thanks, this looks great and is what I had in mind :) Can you also do the same thing for the demuxer? I'll do a more thorough review later, but the way how it's done now is good. Just have to check all the little conditions in more detail in case something is missing there.
Thanks, I'll do the same for the demuxer during the week end
Sebastian, regarding the demux the things are a bit more complicated, STATE_SCANNING is already setted while seeking, so if invalid data is found while seeking the state should not be setted to READ_DATA but should remain to SCANNING, a possible solution could be to save the state in a new private class member when invalid data is found and restore this state when a new cluster is found, probably this should be done for the parser too, what do you think about? Do you have other suggestions? The state should be restored only when start_resync_offset != -1 (scanning after invalid data is found)
Sounds like a possible approach, yes. Not sure how this should be handled otherwise :)
Review of attachment 340295 [details] [review]: Looks all good except for: ::: gst/matroska/matroska-parse.c @@ +3035,3 @@ + */ + gint64 bytes_scanned; + const guint max_threshold = 64 * 1024; Why 64kbyte? Looks a bit too small. I would've said something like ~2MB maybe
Created attachment 340406 [details] [review] updated patch
Created attachment 340407 [details] [review] demuxer patch please review
commit 0ef3a71b89c7448da1ad0a19c60f99e0e8be74bd Author: Nicola Murino <nicola.murino@gmail.com> Date: Mon Nov 21 11:48:58 2016 +0100 matroskademux: add support for skipping invalid data in push mode https://bugzilla.gnome.org/show_bug.cgi?id=774566 commit 20ed9e823731cc93f8583440d2718385f40258b0 Author: Nicola Murino <nicola.murino@gmail.com> Date: Mon Nov 21 11:48:29 2016 +0100 matroskaparse: add support for skipping invalid data https://bugzilla.gnome.org/show_bug.cgi?id=774566