GNOME Bugzilla – Bug 348006
read functions do not indicate parse failures
Last modified: 2009-01-09 09:43:56 UTC
libxml++ maps the int return code for textreader read and next functions to a boolean. A true presumably indicates successful parse and not end of document, while a false presumably indicates end of document. The underlying libxml functions have three states: 1 = successful parse, and not end of document 0 = end of document -1 = parse error The current code calls check_for_exceptions() in the -1 case, which is a no-op and a const function. It then returns the integer value as a bool. C++ converts all non-zero values assigned to a bool to a value of 1. This means that successful parse and not end of document is indistinguishable through the interface from a parse error. Perhaps an exception could be thrown here. I would favour both read and next functions being changed to return the underlying libxml type rather than converting to bool. Benjamin.
Exactly what method do you mean, and what are the underlying libxml functions that you mean?
Murray, From the 2.14.0 source code: bool TextReader::read() { return propertyreader->Bool( xmlTextReaderRead(impl_)); } and bool TextReader::next() { return propertyreader->Bool( xmlTextReaderNext(impl_)); } propertyreader->Bool is: bool TextReader::PropertyReader::Bool( int value) { if(value == -1) owner_.check_for_exceptions(); return value; } check_for_exceptions() is a no-op, so this function effectively returns the int value as a bool. This converts 1 -> 1, 0 -> 0, and -1 (error) -> 1 (ok). xmlTextReaderRead and xmlTextReaderNext both return the -1,0,1 int return code. See <http://xmlsoft.org/html/libxml-xmlreader.html#xmlTextReaderRead> and <http://xmlsoft.org/html/libxml-xmlreader.html#xmlTextReaderNext>. Benjamin.
We cannot change the return type of the function, and the proper way to report errors is an exception, as it is done this way in the other parsers. The solution would be to have check_for_exceptions doing something.
Yes, it seems that I have seen a possible problem before: void TextReader::check_for_exceptions() const { //TODO: Shouldn't we do something here? murrayc. } Maybe we should provide a callback via xmlTextReaderErrorFunc(): http://xmlsoft.org/html/libxml-xmlreader.html#xmlTextReaderErrorFunc But I don't know what those parameters should be. If we don't figure this out then we need to just throw a generic exception when the result is -1.
Created attachment 69999 [details] [review] libxmlpp_textreader_errors.patch Here is a patch that shows partly how me might do this. Please take a look and think about the TODO comments. This would also break ABI by adding a class member. But maybe this parser is useless anyway if it can't report errors, so maybe nobody is using it.
Does anyone have opinions or feedback about this?
Created attachment 86675 [details] [review] Another similar patch Looks good to me. The only change I would make is to make _TextReader_on_libxml_error a private static function (fibbing the enum, like you suggested). Oh, and like your comments say, no warnings. I actually just cooked up a similar patch while fixing a user bug.
Created attachment 86676 [details] Test case which demonstrates problem/fix
This does add three class member variables: int severity_; (I'd rename that to error_severity.) Glib::ustring error_; So I'll start a conversation on the mailing list about whether anyone is using this class as it is.
Committed with this ChangeLog entry: 2009-01-09 Stef Walter <stef-list@memberwebs.com> * libxml++/parsers/textreader.[h|cc]: Add setup_exceptions(), setting the on_libxml_error() callback, and call it from the constructors. check_for_exceptions(): Actually check some member variables and throw an exception if necessary. This should fix bug #348006. It breaks ABI because it adds member variables, but we decided that is OK because nobody could actually be using this class seriously before now because it had no error checking.