GNOME Bugzilla – Bug 784517
php DOMDocument loadHTML parses html tags inside cdata
Last modified: 2017-07-05 06:49:25 UTC
Test script:
<?php
$test_content='
<script>
//<![CDATA[
a=\'123\';
b=\'</script>\';
c=\'456\';
//]]>
</script>
';
$d=new DOMDocument();
$d->loadHTML($test_content);
echo $d->saveHTML();
---end of the test script---

Its output:
PHP Warning: DOMDocument::loadHTML(): Unexpected end tag : script in Entity, line: 8 in C:\xampp\htdocs\test\dom_cdata.php on line 13
Warning: DOMDocument::loadHTML(): Unexpected end tag : script in Entity, line: 8 in C:\xampp\htdocs\test\dom_cdata.php on line 13
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><head><script>
//<![CDATA[
a='123';
b='</script></head><body><p>';
c='456';
//]]>
</p></body></html>
---end of the output---

So the CDATA section, which was used to escape HTML tags inside strings inside the script element, failed to do that when the markup was fed to PHP's DOMDocument::loadHTML, which is built on libxml. As the output shows, the parser closes the script element at the "</script>" inside the string, so the rest of the script ("c='456'; //]]>") ends up in a paragraph in the body and is going to be displayed to the user.
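The behaviour can also be checked without reading the serialized output, by asking the parsed document what ended up inside the script element. The following is only a minimal sketch: the helper name `firstScriptText` is hypothetical and not part of the original test script, and the second input, which writes the closing tag as `<\/script>` inside the JavaScript string, is an assumption added here for contrast, not something taken from this report.

```php
<?php
// Hypothetical helper (not from the report): parse $html with DOMDocument
// and return the text content of the first <script> element.
function firstScriptText(string $html): string
{
    // Collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $script = $doc->getElementsByTagName('script')->item(0);
    return $script === null ? '' : $script->textContent;
}

// Same markup as the test script: a literal "</script>" inside the string.
$literal = "<script>\n//<![CDATA[\na='123';\nb='</script>';\nc='456';\n//]]>\n</script>";

// Assumed variant: the slash is escaped on the JavaScript side.
$escaped = "<script>\n//<![CDATA[\na='123';\nb='<\\/script>';\nc='456';\n//]]>\n</script>";

// Does the parsed script element still contain the last statement?
var_dump(strpos(firstScriptText($literal), "c='456'") !== false);
var_dump(strpos(firstScriptText($escaped), "c='456'") !== false);
```

If the parser ends the script element at the first "</script>", as the output in this report indicates, the first check should come out negative, because the last statement is no longer part of the script. The second check is there to see whether escaping the slash keeps the whole script in one element.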
> failed to perform that task when it was fed to PHP's DOMDocument's loadHTML, which is made with libxml
You have reported this to the libxml++ project (C++ bindings for libxml). Please report this to the PHP project.
I have found that it is already reported: https://bugs.php.net/bug.php?id=71452
I see now that that report is not about CDATA, so I will file a separate bug.
https://bugs.php.net/bug.php?id=74858