GNOME Bugzilla – Bug 750847
dashdemux: variables containing time information should be guint64 not gint64
Last modified: 2015-10-30 16:33:10 UTC
variables containing time information (eg start and duration in _GstPeriodNode, mediaPresentationDuration in mediaPresentationDuration and many others) should be guint64, not gint64. Comparisons and arithmetic operations are done against variables of type GstClockTime which is guint64, meaning signed and unsigned operands are used in an operation.
Do you want to provide a patch? And check that all the values in question are really never signed and can't be negative.
Also needs to be changed in the unit tests, don't forget that
After looking at the spec and code for the signed values in the header: GstPeriodNode::start, duration: No particular indication of signedness in the spec. GstMetricsRangeNode::starttime, duration: No particular indication of signedness in the spec. starttime is said to be relative to the period stat time. I think everything is supposed to be clipped to that, so negative would not make sense here, but it's unclear again. GstMPDNode::mediaPresentationDuration, minimumUpdatePeriod, minBufferTime, timeShiftBufferDepth, maxSegmentDuration, maxSubsegmentDuration No particular indication of signedness in the spec. GstMPDNode: suggestedPresentationDelay The only thing that could related to signedness is a mention this can be used to sync playback to wall clock, this this hints that it could be negative. GstMediaSegment::repeat Can be negative (see https://bugzilla.gnome.org/show_bug.cgi?id=752480) GstMediaSegment::range_start, range_end, index_range_start, index_range_end: Seems not to come directly from attributes, end is used in the STL sense, so -1 is valid there. Thus a -1 start makes sense too (even though it may be impossible in practice) GstActiveStream::representation_idx Internal, negative doesn't make sense in theory, but I guess it might be possible to be in a state where there is no current representation. This is not the case with current code I thinkl. GstActiveStream::segment_index Internal, -1 is used
Turns out -1 is used implicitely as a "default" value for all "unsigned" durations, passed via the default value on the parsing function, to be used when the property is not present. Nothing seems to actually test for this after the fact though. 0 would seem to be another possible default value, though 0 makes sense as a valid duration while -1 does not. It's not quite clear what's the best to do here. The spec says at least some of these durations are optional (eg, 5.3.2.1). I can't see code that checks whether a period has a duration (which distinguishes between "regular" and "early available" periods), so I'm guessing this is also missing. We end up with a choice between keeping signed and -1 marker, or switching to unsigned and having an extra flag (or, I guess switching to unsigned and using G_MAXINT64 as a "not here" value).
(In reply to Vincent Penquerc'h from comment #4) > Turns out -1 is used implicitely as a "default" value for all "unsigned" > durations, passed via the default value on the parsing function, to be used > when the property is not present. Nothing seems to actually test for this > after the fact though. 0 would seem to be another possible default value, > though 0 makes sense as a valid duration while -1 does not. > > It's not quite clear what's the best to do here. The spec says at least some > of these durations are optional (eg, 5.3.2.1). I can't see code that checks > whether a period has a duration (which distinguishes between "regular" and > "early available" periods), so I'm guessing this is also missing. We end up > with a choice between keeping signed and -1 marker, or switching to unsigned > and having an extra flag (or, I guess switching to unsigned and using > G_MAXINT64 as a "not here" value). For time values, the choice is simple: use GST_CLOCK_TIME_NONE as default (instead of -1). This is a -1 stored into a guint64. So basically we can change only the data types from signed to unsigned and still use -1 as default
I'd use a different define, since those aren't actually GstClockTime, but either in milliseconds, or "units of timescale", so using GST_CLOCK_TIME_NONE itself would be confusing. But yes, that choice (unsigned and marker value) seems best, I'll go with that.
Created attachment 311271 [details] [review] unsigned durations
I removed the code that was explicitely allowing a "-" sign when parsing the durations. The original source imported into git had this already, so there's no commit message alluding to why this was here.
can you provide the full list of patches to be applied on master? This patch fails to apply. The gst_mpdparser_get_xml_prop_duration_inner function is not available yet on master. I believe you introduced that as a patch on another bug.
Sorry about that. I pushed my current tree to https://git.collabora.com/cgit/user/vincent/gst-plugins-bad/log/?h=dash So the list of commits between current master and the commit in this bug are: e0de87ce0590573cfc1caf38580c83ca7d1a8557 35dbfd7c1b81e4fdd1427d40cb298d63499b91d1 a8446ad4bce3d2ddf677120ec9a62ab2fce740b3 fb6f702fff49bf9bd10c900875cd7f84bcf738c7 2a4e1df9a7c535d955b6d5df10cb3a3cb15e9b32 b513397d6a659fd302a5c654058b230a08afdc9f
Review of attachment 311271 [details] [review]: ::: ext/dash/gstmpdparser.c @@ +847,1 @@ { "sign" variable is no longer used in gst_mpdparser_get_xml_prop_duration function, remove it
Created attachment 311463 [details] [review] unsigned durations With sign removed.
Review of attachment 311463 [details] [review]: ::: ext/dash/gstmpdparser.c @@ +960,3 @@ exists = TRUE; *property_value = + (((((gint64) years * 365 + months * 30 + days) * 24 + years should now be converted to guint64. Also years, months, etc should be guint and they should be validated that they are not negative. Or search for "-" sign and return error if found.
Created attachment 311467 [details] [review] unsigned durations With requested changes. Turns out sscanf("%u") does accept negative numbers, though the manpage does not hint at it.
Review of attachment 311467 [details] [review]: ::: ext/dash/gstmpdparser.c @@ +847,3 @@ + char *end = NULL; + + if (!g_ascii_isdigit (*s)) I believe this solution will not skip spaces at the beginning of s and will return -1 Why not use normal sscanf(str, &read) and test that read >= 0 ? This was proposed in https://bugzilla.gnome.org/show_bug.cgi?id=752429
(In reply to Florin Apostol from comment #15) > Review of attachment 311467 [details] [review] [review]: > > ::: ext/dash/gstmpdparser.c > @@ +847,3 @@ > + char *end = NULL; > + > + if (!g_ascii_isdigit (*s)) > > I believe this solution will not skip spaces at the beginning of s and will > return -1 Indeed. Raw strtoul would allow those. > Why not use normal sscanf(str, &read) and test that read >= 0 ? This was > proposed in https://bugzilla.gnome.org/show_bug.cgi?id=752429 Could do as well I guess. Testing here, sscanf will allow leading whitespace, even if the format string does not contain whitespace.
Did you mean you actually *want* leading whitespace to be allowed ? I've just seem comment 2 from https://bugzilla.gnome.org/show_bug.cgi?id=752428, which hints at yes. Also, sscanf doesn't care much about range, it happily parses and discards upper bits: $ ./a.out "3333333333" ret 1, value 3333333333 $ ./a.out "33333333333" ret 1, value 3268562261 $ ./a.out "333333333333" ret 1, value 2620851541 $ ./a.out "3333333333333" ret 1, value 438711637 $ ./a.out "33333333333333" ret 1, value 92149077 #include <stdio.h> int main(int argc,char **argv) { unsigned int u; int ret=sscanf(argv[1], "%u", &u); printf("ret %d, value %u\n", ret, u); return 0; }
(In reply to Vincent Penquerc'h from comment #17) > Did you mean you actually *want* leading whitespace to be allowed ? I've > just seem comment 2 from > https://bugzilla.gnome.org/show_bug.cgi?id=752428, which hints at yes. > I would like the parser to follow the standard exactly and complain about every detected deviation, so that humans can spot the problems with the mpd files. Unfortunately, the user of the parser is the client application and it cannot change the xml published by a service provider, even if it knows the mpd is not conform to the standard. The general consensus here is to make the parser as permissive as possible (we can stil warn about deviations, but try to work around them), so that gstreamer doesn't refuse to play some files that other parsers (which are less restrictive) might play. So, if there are some spaces at the beginning of a string and we can skip them and still play the file, why not do it? > Also, sscanf doesn't care much about range, it happily parses and discards > upper bits: > > $ ./a.out "3333333333" > ret 1, value 3333333333 > $ ./a.out "33333333333" > ret 1, value 3268562261 > $ ./a.out "333333333333" > ret 1, value 2620851541 > $ ./a.out "3333333333333" > ret 1, value 438711637 > $ ./a.out "33333333333333" > ret 1, value 92149077 > > #include <stdio.h> > > int main(int argc,char **argv) > { > unsigned int u; > int ret=sscanf(argv[1], "%u", &u); > printf("ret %d, value %u\n", ret, u); > return 0; > } I know. Also, if you pass 33xx it will return 33 and not complain about the extra data. We could (and I think we should) make the parser check also the remaining of data from a string (parse the property completely) and issue a warning to say something is wrong in the mpd file if extra data is detected. Maybe discard the data and use the default values in such a case?
OK, I will make this accept leading whitespace them. As for 33xx, we specifically want that to allow data after it (ie, Y2M3). But in the 333333333333333 case, the extra digits are eaten by the %u parser, causing overflow, rather than the parsing stopping (which would be weird too I guess).
(In reply to Vincent Penquerc'h from comment #19) > OK, I will make this accept leading whitespace them. > > As for 33xx, we specifically want that to allow data after it (ie, Y2M3). > But in the 333333333333333 case, the extra digits are eaten by the %u > parser, causing overflow, rather than the parsing stopping (which would be > weird too I guess). We want to allow data after a number only in special cases (e.g. P0Y1M2DT12H10M20.5S) and in such cases the parsing is more elaborate, verifying the presence of expected markers. When I mentioned 33xx I was referring to fields that contain numbers only (eg presentationTimeOffset) for which no extra validation is done if the field starts with a number. Bottom line, I believe we should check all the bytes present in a field and issue a warning if anything extra (not mentioned in the standard) is found. If the extra data is spaces, we accept the value. If the extra data is something else, we should use the default value for the field.
Created attachment 311537 [details] [review] unsigned durations Back to sscanf then, as it was originally, but with an extra check for negative values. Whitespace will be accepted iff it was accepted before. The rest will be the same too.
*** Bug 752333 has been marked as a duplicate of this bug. ***
Review of attachment 311537 [details] [review]: ::: ext/dash/gstmpdparser.c @@ +843,3 @@ +static int +sscanfu (const char *s, unsigned int *u) instead of introducing a new function that works only for 32bit integers why not update the code that requires reading unsigned data to check for the presence of a "-" sign in the string? After that check we can safely use sscanf(s, "%u",...) or sscanf (str, "%" G_GUINT64_FORMAT, ...) For example we add: if (strchr (str, '-') != NULL) { GST_WARNING("'-' sign found while parsing unsigned data"); goto error; }
I'm changing this now to your latest comments, but... do you really want to read 64 bit unsigned for these ? It means changing all the hours, months, etc, to 64 bit too (or large values will be clipped), but none of these values make sense to be larger than 1<<32-1 really.
Created attachment 314382 [details] [review] unsigned durations Here's a patch with the sscanf/strstr requested change, but without the move to 64 bit.
Review of attachment 314382 [details] [review]: ::: ext/dash/gstmpdparser.c @@ +979,3 @@ + GST_WARNING ("'-' sign found while parsing unsigned duration"); + goto error; + } if negative durations are accepted, this is not where the '-' sign should be. It should be before P. If we do not need to support parsing negative durations, I suggest doing a single search at the beginning. If we do want to support parsing negative durations, the original code did it ok: search for the first occurrence, test that it is at the beginning, search again for a second occurrence that should not exist.
Created attachment 314504 [details] [review] unsigned durations I think we want to keep to unsigned durations at this point. Signed check moved.
Review of attachment 314504 [details] [review]: ::: ext/dash/gstmpdparser.c @@ +1021,3 @@ + GST_WARNING ("'-' sign found while parsing unsigned duration"); + goto error; + } there was another one here You can merge it after your remove this one
commit e48e68416cd2698ed821748ac89e198b61c95391 Author: Vincent Penquerc'h <vincent.penquerch@collabora.co.uk> Date: Thu Oct 29 11:38:35 2015 +0000 mpdparser: make durations unsigned where appropriate The standard does not seem to make any particular explicit not implicit reference to the signedness of durations, and the code does not rely on such, nor on the negativity of the -1 value that's used as a placeholder when a duration property is not present in the XML. https://bugzilla.gnome.org/show_bug.cgi?id=750847
Created attachment 314505 [details] [review] unsigned durations The one I pushed, with the second sign check removed.