Bug 774566 – matroskaparse: error out on last buffer

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 774566 - matroskaparse: error out on last buffer


Summary:	matroskaparse: error out on last buffer


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-good
Version:	1.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	1.11.1
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2016-11-16 16:10 UTC by Nicola
Modified:	2016-11-21 11:48 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
fix parsing error at the end of the file (1.19 KB, patch) 2016-11-16 21:42 UTC, Nicola	committed	Details \| Review
proof of concept patch (1.77 KB, patch) 2016-11-17 11:13 UTC, Nicola	none	Details \| Review
tentative patch (3.56 KB, patch) 2016-11-17 18:02 UTC, Nicola	none	Details \| Review
tentative patch v2 (4.54 KB, patch) 2016-11-18 17:13 UTC, Nicola	none	Details \| Review
updated patch (4.40 KB, patch) 2016-11-19 10:19 UTC, Nicola	none	Details \| Review
updated patch (4.90 KB, patch) 2016-11-21 10:50 UTC, Nicola	committed	Details \| Review
demuxer patch (3.60 KB, patch) 2016-11-21 10:50 UTC, Nicola	committed	Details \| Review

Description Nicola 2016-11-16 16:10:39 UTC

Please try this file:

http://195.250.34.59/temp/broken.mkv

this pipeline works fine:

filesrc ! matroskademux ! fakesink

but this one fails:

filesrc ! matroskaparse ! fakesink

this is the relevant error:

ebmlread ebml-read.c:141:gst_ebml_peek_id_length:<matroskaparse0> Invalid EBML ID size tag (0x0) at position 7892158 (0x786cbe)

this seems to happen on the last data in the file, so both matroskademux and matroskaparse reads until timestamp is 50:52:02.540000000 but matroskademux exits with eos while matroskaparse exits with error, 

any suggestions to modify matroskaparse so that it exits without errors?

Comment 1 Nicola 2016-11-16 21:38:25 UTC

the difference between parser and muxer is this:

https://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/matroska/matroska-demux.c#n4832

applying the same code as the muxer to the parser here:

https://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/matroska/matroska-parse.c#n3014

seems to solve the issue

Comment 2 Nicola 2016-11-16 21:42:52 UTC

Created attachment 340086 [details] [review]
fix parsing error at the end of the file

please review

Comment 3 Sebastian Dröge (slomo) 2016-11-17 08:25:23 UTC

commit 28995f3527c286d7414074e26180ef4445e1419a
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Thu Nov 17 10:24:28 2016 +0200

    matroskaparse: Add remaining relevant parts from a3a55305 to the parser
    
    https://bugzilla.gnome.org/show_bug.cgi?id=774566

commit 7627171566aa6388166a8f02c410a33462b158a4
Author: Nicola Murino <nicola.murino@gmail.com>
Date:   Wed Nov 16 22:39:01 2016 +0100

    matroskaparse: ignore parsing errors at the end of the file
    
    This is the same change as a3a55305 for the parser.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=774566

Comment 4 Sebastian Dröge (slomo) 2016-11-17 08:33:52 UTC

While this part is correct, there is a remaining problem. It does not work yet on your file (also not with matroskademux in push mode, just add a queue after filesrc for example).

The problem here is that after the last valid EBML item, there is an invalid one. For pull mode we can handle that by skipping to the next cluster that would follow, and try to recover from there (in this case actually EOS because the next cluster is not there).

For push mode we can't do that currently. Some logic for that has to be added to the demuxer and parser for that.

Comment 5 Sebastian Dröge (slomo) 2016-11-17 08:34:32 UTC

What your patch without my additions did btw, was just to ignore *all* errors and convert them to EOS, if the segment had "infinite" (aka 0) size.

Comment 6 Nicola 2016-11-17 11:13:10 UTC

Created attachment 340128 [details] [review]
proof of concept patch

thanks for the review and explaination,

in my use case the worst thing that can happen is that a recording process that use matroskamux is stopped while writing, for example for a server reboot and/or unclean shutdown, so the invalid data should be always at the end of the file.

This is a proof of concept patch to handle invalid data in the middle of the file,

with this patch the pipeline

filesrc ! queue ! matroskademux

works as expected with my test file, I think this patch should be improved, do you have suggestions and/or do you want to provide a proper solution?

Comment 7 Sebastian Dröge (slomo) 2016-11-17 11:23:14 UTC

I don't have time to work on a proper solution here, no.

But to ignore errors at the EOS (not just in the middle), more work is also needed. Your pipeline still errors out with the changes that are merged.

Comment 8 Sebastian Dröge (slomo) 2016-11-17 11:23:35 UTC

That is, just having a fix that makes them be ignored at the end would be a good start already.

Comment 9 Nicola 2016-11-17 11:37:05 UTC

In my proof of concept patch after a parse error I just search for next cluster in the data available on the adapter and if found (does not happen with my test file) I update the demuxer offset and flush the adapter until the next cluster pos, is this an acceptable solution? I can apply similar code for the parser too. In my use case I have only streamable files

Comment 10 Sebastian Dröge (slomo) 2016-11-17 11:44:23 UTC

It goes into the right direction, yes. You would have to search not only in the adapter but also in the data that follows in the next buffers... until a specific threshold. I think there's code for "resyncing" on a cluster already.

Comment 11 Nicola 2016-11-17 11:47:24 UTC

(In reply to Sebastian Dröge (slomo) from comment #10)
> It goes into the right direction, yes. You would have to search not only in
> the adapter but also in the data that follows in the next buffers...

yes this is already done

 until a
> specific threshold. 

an upper threshold is missing in my patch, I can take a look on this in my free time

Comment 12 Sebastian Dröge (slomo) 2016-11-17 11:52:50 UTC

I meant the cluster scanning that is already there, see the GST_MATROSKA_READ_STATE_SCANNING case (same change is needed in the demuxer btw). It should be possible to fall back to that case here on parsing errors without having to implement any new scanning.

Comment 13 Nicola 2016-11-17 18:02:22 UTC

Created attachment 340161 [details] [review]
tentative patch

please take a look at this attempt, probably there is still something to improve,

please note that gst_matroska_parse_parse_id and gst_matroska_demux_parse_id does not works in the same way:

https://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/matroska/matroska-demux.c#n4324

read is defined as guint64, but while skipping is passed to gst_matroska_demux_flush 

https://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/matroska/matroska-demux.c#n4563

but gst_matroska_demux_flush want a guint, in the error case the value overflows

Comment 14 Nicola 2016-11-17 18:03:41 UTC

once the patch for the parser is ok I can send a similar one for the demuxer too

Comment 15 Nicola 2016-11-17 18:05:57 UTC

oops ... just noted that start_resync_offset is never resetted to -1, I'll fix this if the patch is acceptable, please advise

Comment 16 Sebastian Dröge (slomo) 2016-11-18 16:09:50 UTC

Review of attachment 340161 [details] [review]:

Goes in the right direction I think

::: gst/matroska/matroska-parse.c
@@ +3021,2 @@
       ret = GST_FLOW_EOS;
+      return ret;

Don't you always want to return EOS here in that case?

@@ +3031,3 @@
+      gint64 bytes_scanned;
+      gint read_state;
+      const guint max_threshold = 64 * 1024;    // same value used in gst_matroska_demux_search_cluster

No C99 comments

@@ +3034,3 @@
+      length = 0;
+      if (parse->common.start_resync_offset == -1) {
+        parse->common.start_resync_offset = parse->common.offset;

Needs the reset, yes

@@ +3046,3 @@
+        parse->common.state = GST_MATROSKA_READ_STATE_SCANNING;
+        ret = gst_matroska_parse_parse_id (parse, id, length, needed);
+        parse->common.state = read_state;

I don't understand this :) Why don't you just set state=SCANNING, return from everywhere here and let the outer code that calls gst_matroska_parse_parse_id() take care of calling all these things again?

Also why the custom scanning at all here?

Comment 17 Nicola 2016-11-18 17:13:22 UTC

Created attachment 340257 [details] [review]
tentative patch v2

please check this one

Comment 18 Nicola 2016-11-18 17:15:06 UTC

and please also note that GST_MATROSKA_READ_STATE_SCANNING was never setted before this patch so the change in case GST_MATROSKA_READ_STATE_SCANNING should be safe

Comment 19 Nicola 2016-11-18 18:20:35 UTC

gboolean is_resyncing =
          (parse->common.state == GST_MATROSKA_READ_STATE_SCANNING
          && parse->common.start_resync_offset > 0);

must be changed to

gboolean is_resyncing =
          (parse->common.state == GST_MATROSKA_READ_STATE_SCANNING
          && parse->common.start_resync_offset != -1);

please let me known if other modifications are required, works fine with my test file,

for now is untested an mkv file with invalid data in the middle (between two clusters)

Comment 20 Sebastian Dröge (slomo) 2016-11-19 09:53:44 UTC

It also was not set for the demuxer it seems. Checking the middle-case should be relatively easy, just get a valid mkv file and destroy some EBML elements (like a cluster maybe?) in the middle and see if it recovers from that.

Comment 21 Nicola 2016-11-19 10:19:44 UTC

Created attachment 340295 [details] [review]
updated patch

new patch, works fine with the original file and tested with a file with broken data in the middle generated this way:

dd if=/tmp/broken.mkv of=/tmp/1.mkv bs=208748 count=1
dd if=/dev/zero of=/tmp/random bs=19 count=1
dd if=/tmp/broken.mkv of=/tmp/2.mkv skip=208748 bs=1
cat /tmp/random >> /tmp/1.mkv
cat /tmp/2.mkv >> /tmp/1.mkv

the original file has a cluster at offset 208749

matroskaparse matroska-parse.c:3059:gst_matroska_parse_chain:<matroskaparse0> Offset 208749, Element id 0x1f43b675, size 18446744073709551615, needed 12, available 147

if I change bs here:

dd if=/dev/zero of=/tmp/random bs=19 count=1

to a value > 65536 we exceed the threshold and we get an unrecoverable error,

please review

Comment 22 Sebastian Dröge (slomo) 2016-11-19 10:41:14 UTC

Comment on attachment 340295 [details] [review]
updated patch

Thanks, this looks great and is what I had in mind :) Can you also do the same thing for the demuxer?

I'll do a more thorough review later, but the way how it's done now is good. Just have to check all the little conditions in more detail in case something is missing there.

Comment 23 Nicola 2016-11-19 11:24:14 UTC

Thanks,

I'll do the same for the demuxer during the week end

Comment 24 Nicola 2016-11-20 11:46:06 UTC

Sebastian, regarding the demux the things are a bit more complicated, STATE_SCANNING is already setted while seeking, so if invalid data is found while seeking the state should not be setted to READ_DATA but should remain to SCANNING, 

a possible solution could be to save the state in a new private class member when invalid data is found and restore this state when a new cluster is found, probably this should be done for the parser too, what do you think about? Do you have other suggestions? The state should be restored only when start_resync_offset != -1 (scanning after invalid data is found)

Comment 25 Sebastian Dröge (slomo) 2016-11-21 07:45:48 UTC

Sounds like a possible approach, yes. Not sure how this should be handled otherwise :)

Comment 26 Sebastian Dröge (slomo) 2016-11-21 07:47:32 UTC

Review of attachment 340295 [details] [review]:

Looks all good except for:

::: gst/matroska/matroska-parse.c
@@ +3035,3 @@
+       */
+      gint64 bytes_scanned;
+      const guint max_threshold = 64 * 1024;

Why 64kbyte? Looks a bit too small. I would've said something like ~2MB maybe

Comment 27 Nicola 2016-11-21 10:50:03 UTC

Created attachment 340406 [details] [review]
updated patch

Comment 28 Nicola 2016-11-21 10:50:31 UTC

Created attachment 340407 [details] [review]
demuxer patch

please review

Comment 29 Sebastian Dröge (slomo) 2016-11-21 11:47:52 UTC

commit 0ef3a71b89c7448da1ad0a19c60f99e0e8be74bd
Author: Nicola Murino <nicola.murino@gmail.com>
Date:   Mon Nov 21 11:48:58 2016 +0100

    matroskademux: add support for skipping invalid data in push mode
    
    https://bugzilla.gnome.org/show_bug.cgi?id=774566

commit 20ed9e823731cc93f8583440d2718385f40258b0
Author: Nicola Murino <nicola.murino@gmail.com>
Date:   Mon Nov 21 11:48:29 2016 +0100

    matroskaparse: add support for skipping invalid data
    
    https://bugzilla.gnome.org/show_bug.cgi?id=774566