GNOME Bugzilla – Bug 387550
xmllint displays incorrect line numbers for files over 65,535 lines in length
Last modified: 2007-11-25 02:30:12 UTC
Please describe the problem:

I'm working with some very long XML files; some are over 400,000 lines long. When I run xmllint, it reports all the errors as occurring on line 65,535 (the maximum value of an unsigned 16-bit integer).

Steps to reproduce:

Generate an XML file along these lines:

<?xml version="1.0" encoding="UTF-8"?>
<yadayada>
<foo><bar/></foo>
<foo><bar/></foo>
<foo><bar/></foo>
<foo><bar/></foo>
<!-- and so on -->
<!-- then somewhere after 65,535 lines -->
<foo>
<this-causes-an-error/>
</foo>
</yadayada>

This script may be used to generate this sort of file:

#!/usr/bin/python
# Generate an XML file with one invalid element past line 65,535.
line = "<foo><bar/></foo>\n"

with open('testLineNumbers.xml', 'w') as outFile:
    outFile.write("""<?xml version="1.0" encoding="UTF-8"?>
<yadayada>
""")
    # write a bunch of OK lines
    for i in range(65540):
        outFile.write(line)
    # write a bad line
    outFile.write('<foo><this-causes-an-error/></foo>\n')
    # write 10 more good lines
    for i in range(10):
        outFile.write(line)
    outFile.write('</yadayada>')

Validate the file against this schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="yadayada">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="foo" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="foo">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="bar"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Actual results:

You get the following error:

testLineNumbers.xml:65535: element foo: Schemas validity error : Element 'foo' [CT local]: The element content is not valid.
testLineNumbers.xml fails to validate

Looking at line 65535 (vi +65535 testLineNumbers.xml) shows that the reported line number is not the actual location of the error.

Expected results:

The line number should be correct. I'm guessing that the type of the variable that tracks the line number must be changed from short to int, or from int to long. The format specifier in the printf that displays this variable may also need to be changed.

Does this happen every time?

Yes.

Other information:
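The reported number is what a saturating 16-bit field would produce. The following is a minimal sketch of that behaviour, not the libxml2 code itself: it assumes the stored line number is clamped at the field's maximum rather than wrapping around, and that the generated file's header occupies two lines. `stored_line` is a hypothetical helper introduced only for illustration.

```python
UINT16_MAX = 2**16 - 1  # 65535, the largest value an unsigned 16-bit field can hold


def stored_line(actual_line):
    """Hypothetical model: a line number kept in a saturating 16-bit field."""
    return min(actual_line, UINT16_MAX)


# Assumed layout of the generated file: 2 header lines, 65540 good lines,
# then the bad line.
bad_line = 2 + 65540 + 1

print("actual line:", bad_line)
print("line as stored in 16 bits:", stored_line(bad_line))
```

Under these assumptions, every element past line 65,535 would be reported at the same position, which is consistent with the output shown in the report.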
XSD validates a precompiled tree. The tree only has room for 16 bits for the line number, and I will not make an API/ABI-incompatible change just for this. Try the streaming version, which works on an event flow for which 32 bits may be available:

xmllint --stream --schemas ...

Sorry, WONTFIX otherwise; I will not change the tree structure for this.

Daniel
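To illustrate why widening a field in a public structure is an ABI-incompatible change, here is a hedged sketch using Python's ctypes. The two structures are hypothetical stand-ins, not the real libxml2 node layout; they only show that widening one field moves the fields after it and changes the structure's size, so code compiled against the old layout would read the wrong bytes.

```python
import ctypes


class NodeWith16BitLine(ctypes.Structure):
    # Hypothetical layout: a 16-bit line number followed by another field.
    _fields_ = [
        ("type", ctypes.c_int),
        ("line", ctypes.c_ushort),
        ("extra", ctypes.c_ushort),
    ]


class NodeWith32BitLine(ctypes.Structure):
    # Same hypothetical layout with the line number widened to unsigned int.
    _fields_ = [
        ("type", ctypes.c_int),
        ("line", ctypes.c_uint),
        ("extra", ctypes.c_ushort),
    ]


for cls in (NodeWith16BitLine, NodeWith32BitLine):
    print(cls.__name__,
          "size:", ctypes.sizeof(cls),
          "offset of 'extra':", cls.extra.offset)
```

The exact numbers depend on the platform, but the offset of `extra` and the total size differ between the two variants, which is the kind of incompatibility the comment above refers to.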
If I'm reading the man page correctly, the --stream option only works when you are validating with either a DTD or a RelaxNG schema. Is there a workaround for use with a W3C schema?
Make 100% sure you're using the latest version. --stream does work with --schemas; at worst it's a documentation problem.

Daniel
Using --stream is not a valid workaround because it doesn't display line numbers at all. Is there another way to fix this?
See my answer on the list: http://mail.gnome.org/archives/xml/2007-October/msg00006.html

Daniel
John,

A simple patch to resolve the line number problem can be found in bug 325533.

Regards
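For readers who only need to find where a known offending element really sits in a large file, one possible approach is to scan the file with a streaming parser that tracks line numbers itself. The sketch below uses Python's standard `xml.parsers.expat` module. It performs no schema validation, and it is neither the answer given on the mailing list nor the patch from bug 325533; `element_lines` is a hypothetical helper name.

```python
import xml.parsers.expat


def element_lines(xml_bytes, name):
    """Return the 1-based line number of every start tag called `name`."""
    lines = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(tag, attrs):
        # CurrentLineNumber is the parser's own position in the input,
        # so it is not limited by any 16-bit field in a document tree.
        if tag == name:
            lines.append(parser.CurrentLineNumber)

    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)
    return lines


# Build the same kind of document as the generator script, in memory.
good = b"<foo><bar/></foo>\n"
doc = (b'<?xml version="1.0" encoding="UTF-8"?>\n<yadayada>\n'
       + good * 65540
       + b"<foo><this-causes-an-error/></foo>\n"
       + good * 10
       + b"</yadayada>")

print(element_lines(doc, "this-causes-an-error"))
```

This only helps when the name of the offending element is already known from the validator's message; it does not replace correct line numbers in the validator output.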