Bug 700840 - incorrect parsing of html document, that begins with 2 closing tags
Status: RESOLVED OBSOLETE
Product: libxml2
Classification: Platform
Component: htmlparser
Version: git master
Hardware: Other All
Importance: Normal minor
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Reported: 2013-05-22 14:30 UTC by Alexander Pastukhov
Modified: 2021-07-05 13:25 UTC

Description Alexander Pastukhov 2013-05-22 14:30:46 UTC
Admittedly, such a document is itself invalid.

I think the problem is in the function htmlParseTryOrFinish
(file: HTMLparser.c, line 5859).

After the parser finds the first closing tag, its instate is XML_PARSER_END_TAG.
While processing this state, it cannot find the name of a matching open tag, so nameNr is 0 and it sets instate to XML_PARSER_EPILOG:
/*...*/
htmlParseEndTag(ctxt);
if (ctxt->nameNr == 0) {
    ctxt->instate = XML_PARSER_EPILOG;
} else {
    ctxt->instate = XML_PARSER_CONTENT;
}
/*...*/
After this, the parser ignores the rest of the HTML buffer.

In my program I used the following code instead:
/*...*/
htmlParseEndTag(ctxt);
if (ctxt->nameNr == 0) {
	ctxt->instate = XML_PARSER_MISC;
} else {
	ctxt->instate = XML_PARSER_CONTENT;
}
/*...*/
With this change the parser handles such malformed files correctly, and it seems to work well on other files too.
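
The difference between the two snippets can be illustrated with a toy model. This is a minimal sketch, not libxml2 code: the ToyState enum, toy_count_start_tags, and the state_when_empty parameter are hypothetical names, and the rule that the epilog state discards all further markup is a simplification of the behaviour described above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy states named after the libxml2 states mentioned in this report.
 * This is NOT libxml2 code, only a model of the described behaviour. */
typedef enum {
    TOY_MISC,     /* before the root element: a start tag is still accepted */
    TOY_CONTENT,  /* inside at least one open element */
    TOY_EPILOG    /* after the document: further markup is discarded */
} ToyState;

/* Count the start tags the toy parser accepts.
 * state_when_empty is the state entered after an end tag is handled while
 * no element remains open (the nameNr == 0 branch quoted in the report). */
static int toy_count_start_tags(const char *html, ToyState state_when_empty)
{
    ToyState state = TOY_MISC;
    int depth = 0;      /* stands in for ctxt->nameNr */
    int accepted = 0;

    assert(html != NULL);
    for (const char *p = html; *p != '\0'; p++) {
        if (*p != '<')
            continue;               /* character data is skipped in this model */
        if (state == TOY_EPILOG)
            break;                  /* the rest of the buffer is ignored */
        if (p[1] == '/') {
            if (depth > 0)
                depth--;            /* close the innermost open element */
            state = (depth == 0) ? state_when_empty : TOY_CONTENT;
        } else {
            depth++;                /* a start tag opens a new element */
            accepted++;
            state = TOY_CONTENT;
        }
    }
    return accepted;
}
```

Under this model, toy_count_start_tags("</a></b><p>x</p>", TOY_EPILOG) accepts no start tags, while passing TOY_MISC accepts the <p>. A well-formed input such as "<html><body></body></html>" gives the same count either way.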

Thank you for your work, and sorry for my English :)
Comment 1 GNOME Infrastructure Team 2021-07-05 13:25:29 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxml2/-/issues/

Thank you for your understanding and your help.