GNOME Bugzilla – Bug 633268
Schema validation of time/dateTime does not meet spec XML Schema Part 2
Last modified: 2021-07-05 13:21:35 UTC
According to XML Schema Part 2: Datatypes Second Edition, 3.2.7 dateTime:

  "Local" or untimezoned times are presumed to be the time in the timezone of some unspecified locality as prescribed by the appropriate legal authority; currently there are no legally prescribed timezones which are durations whose magnitude is greater than 14 hours.

And in 3.2.7.3 Timezones:

  Timezones are durations with (integer-valued) hour and minute properties (with the hour magnitude limited to at most 14, and the minute magnitude limited to at most 59, except that if the hour magnitude is 14, the minute value must be 0); they may be both positive or both negative.

However, the file xmlschemastypes.c contains the line:

  #define VALID_TZO(tzo) ((tzo > -840) && (tzo < 840))

which excludes -14:00 and +14:00 from the valid timezone offsets (840 minutes is 14:00).

Then in 3.2.7.1:

  hh is a two-digit numeral that represents the hour; '24' is permitted if the minutes and seconds represented are zero, and the dateTime value so represented is the first instant of the following day (the hour property of a dateTime object in the ·value space· cannot have a value greater than 23);

Section 3.2.8 time refers to Appendix D, ISO 8601 Date and Time Formats (§D). That appendix reads:

  h -- represents a digit used in the time element "hour". The two digits in a hh format can have values from 0 to 24. If the value of the hour element is 24 then the values of the minutes element and the seconds element must be 00 and 00.

The file xmlschemastypes.c does not take this exception into account, but has:

  #define VALID_HOUR(hr) ((hr >= 0) && (hr <= 23))

The same appendix explains:

  s -- represents a digit used in the time element "second". The two digits in a ss format can have values from 0 to 60. In the formats described in this specification the whole number of seconds ·may· be followed by decimal seconds to an arbitrary level of precision. This is represented in the picture by "ss.sss". A value of 60 or more is allowed only in the case of leap seconds.

However, the same libxml2 file xmlschemastypes.c contains:

  #define VALID_SEC(sec) ((sec >= 0) && (sec < 60))

There is further text in this appendix which might be useful to the developer. It may be found at: http://www.w3.org/TR/xmlschema-2/#isoformats

As a result, libxml2 rejects the legal "dateTime" and "time" values 2010-12-31T23:59:60, 24:00:00 and 12:00:00+14:00.
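The mismatch can be reproduced in isolation. The sketch below copies the three macros exactly as quoted in this report and applies them to single field values. The old_accepts_* wrappers are hypothetical names made up for this illustration and are not part of libxml2; the real parser in xmlschemastypes.c does more than these range checks.

```c
#include <stdbool.h>

/* Range macros as quoted in this report from xmlschemastypes.c. */
#define VALID_HOUR(hr) ((hr >= 0) && (hr <= 23))
#define VALID_SEC(sec) ((sec >= 0) && (sec < 60))
#define VALID_TZO(tzo) ((tzo > -840) && (tzo < 840))

/* Hypothetical wrappers (not libxml2 API) so that each check can be
 * called like a function. */

/* Hour field of a time or dateTime value. */
static bool old_accepts_hour(int hr)
{
    return VALID_HOUR(hr);
}

/* Seconds field; a double because seconds may carry a fraction. */
static bool old_accepts_sec(double sec)
{
    return VALID_SEC(sec);
}

/* Timezone offset in minutes; 840 corresponds to 14:00. */
static bool old_accepts_tzo(int tzo)
{
    return VALID_TZO(tzo);
}
```

With these definitions, old_accepts_hour(24), old_accepts_sec(60.0) and old_accepts_tzo(840) all return false, which matches the rejection of 24:00:00, 2010-12-31T23:59:60 and 12:00:00+14:00.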
Okay, but the problem is more complex than just changing those 3 macros. We also need to check that the boundaries are enforced properly: for example, 2010-12-31T23:58:60 must still fail, as must 24:00:01 and 12:00:00+14:30. So I prefer to delay this until I have time for a full fix.
Daniel
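To make the cross-field dependency concrete, here is a minimal sketch of what such a combined check might look like. It is not libxml2 code: time_fields_ok and tz_fields_ok are hypothetical helpers that work on already-parsed integer fields, fractional seconds are ignored, and the rule that a seconds value of 60 is only accepted at 23:59 is an assumption taken from the examples in this thread rather than from the spec text.

```c
#include <stdbool.h>

/* Hypothetical helper, not libxml2 code: validates the hour, minute and
 * second fields together instead of one macro per field. */
static bool time_fields_ok(int hr, int min, int sec)
{
    if (min < 0 || min > 59 || sec < 0)
        return false;
    if (hr == 24)                 /* 24 is only legal as 24:00:00 */
        return min == 0 && sec == 0;
    if (hr < 0 || hr > 23)
        return false;
    if (sec == 60)                /* assumption: leap second only at 23:59:60 */
        return hr == 23 && min == 59;
    return sec <= 59;
}

/* Hypothetical helper: validates the timezone hour and minute magnitudes
 * (the sign is handled by the caller). */
static bool tz_fields_ok(int tz_hr, int tz_min)
{
    if (tz_hr < 0 || tz_hr > 14 || tz_min < 0 || tz_min > 59)
        return false;
    return tz_hr < 14 || tz_min == 0;   /* hour 14 requires minute 00 */
}
```

Under these assumptions 23:59:60 and 24:00:00 pass, while 23:58:60, 24:00:01 and a timezone of 14 hours with non-zero minutes are rejected. Widening the three macros alone would not give that result, because each macro sees only one field.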
How about just fixing the time zone check to include +/-14:00? That seems pretty safe: #define VALID_TZO(tzo) ((tzo >= -840) && (tzo =< 840))
Correction, the operator should be <=, not =<: #define VALID_TZO(tzo) ((tzo >= -840) && (tzo <= 840))
The VALID_TZO macro is fixed now: https://github.com/nwellnhof/libxml2/commit/8efc5b283cdc808bee17a5567c0a0f29dcdd9b9b
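For reference, a small sketch of how the inclusive bound behaves at the edges. The macro is the one proposed earlier in this thread. tz_to_minutes and tzo_accepted are hypothetical helpers, not libxml2 functions, and they assume the offset is stored as signed total minutes, as the "840 is 14:00" remark in the report implies.

```c
#include <stdbool.h>

/* Inclusive bound as proposed in this thread. */
#define VALID_TZO(tzo) ((tzo >= -840) && (tzo <= 840))

/* Hypothetical helper: convert a sign (+1 or -1), an hour magnitude and
 * a minute magnitude into signed total minutes. */
static int tz_to_minutes(int sign, int hours, int minutes)
{
    return sign * (hours * 60 + minutes);
}

/* Hypothetical helper: apply the macro to an offset given as fields. */
static bool tzo_accepted(int sign, int hours, int minutes)
{
    int tzo = tz_to_minutes(sign, hours, minutes);
    return VALID_TZO(tzo);
}
```

Because the offset is compared in total minutes, +14:00 and -14:00 map to 840 and -840 and are accepted, while 14:01 maps to 841 and is rejected. The spec rule that the minute value must be 0 when the hour magnitude is 14 therefore follows from the same bound, provided the minute field itself is limited to 59 elsewhere.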
*** Bug 731964 has been marked as a duplicate of this bug. ***
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxml2/-/issues/ Thank you for your understanding and your help.