Bug 620195 - xpointer doesn't use the string values of elements consistently
Status: RESOLVED OBSOLETE
Product: libxml2
Classification: Platform
Component: xpointer
Version: 2.7.6
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2010-05-31 22:51 UTC by Piotr Banski
Modified: 2021-07-05 13:21 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
test case (1.18 KB, application/x-compressed)
2010-05-31 22:51 UTC, Piotr Banski

Description Piotr Banski 2010-05-31 22:51:19 UTC
Created attachment 162412
test case

As per the W3C XPointer draft,

"Element boundaries, as well as entire embedded nodes such as processing 
  instructions and comments, are ignored as specified by the definition of 
  string-value in [XPath]."

http://www.w3.org/TR/xptr-xpointer/#stringrange

In practice, however, embedded nodes do disrupt the string-range calculations in libxml2.

Below are the source and the output of

$ xmllint --xinclude xpointer-nested_element.xml > output.xml

[some of the strange effects of running this command are described separately in bug #562541 and bug #620190]

SOURCE:
<div xmlns="http://example.org/">
  <p>XXAAXXAAXX <b>YYBBYYBBYY</b> ZZCCZZCCZZ WWDDWWDDWW.</p>
  <p>XXAAXXAAXX YYBBYYBBYY ZZCCZZCCZZ WWDDWWDDWW.</p>
  <p>XXAAXXAAXX YYBBYYBBYY <!-- gizmo -->ZZCCZZCCZZ WWDDWWDDWW.</p>
  <p>Thomas <em>Pyn</em>chon</p>
</div>
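
For reference, the XPath string-values of the four <p> elements in the source above can be checked with a short Python sketch. It uses only the standard library; the helper name string_value is made up for this illustration, and the snippet demonstrates the XPath definition of string-value, not what libxml2 does internally.

```python
import xml.etree.ElementTree as ET

# The SOURCE document from this report, pasted verbatim.
SOURCE = """<div xmlns="http://example.org/">
  <p>XXAAXXAAXX <b>YYBBYYBBYY</b> ZZCCZZCCZZ WWDDWWDDWW.</p>
  <p>XXAAXXAAXX YYBBYYBBYY ZZCCZZCCZZ WWDDWWDDWW.</p>
  <p>XXAAXXAAXX YYBBYYBBYY <!-- gizmo -->ZZCCZZCCZZ WWDDWWDDWW.</p>
  <p>Thomas <em>Pyn</em>chon</p>
</div>"""

NS = {"ex": "http://example.org/"}


def string_value(element):
    """Hypothetical helper: concatenate all descendant text in document order.

    This mirrors the XPath string-value of an element: element boundaries
    are ignored, and comments contribute nothing.
    """
    return "".join(element.itertext())


root = ET.fromstring(SOURCE)
values = [string_value(p) for p in root.findall("ex:p", NS)]

for value in values:
    print(repr(value))
```

The first three paragraphs print the same string, so any string-range() offsets applied to them should select the same characters.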

OUTPUT:
<body xmlns="http://example.org/">
  
<!--  
  Although the entire content of the <p> element should be 
  seen and handled as a single string, ignoring element boundaries and 
  non-text nodes (this is what "string value of <p>" means), the first three 
  <div>s show that an embedded node disrupts the calculations. The second 
  <div> is the reference (if you're surprised about the indexes, see 
  bug #620190; this seems to be an independent issue). Notice that in the 
  third <div> the comment is not included, but the calculations change 
  nevertheless.

  The last example is modified from the W3C draft; it should match the entire 
  name. 
  It actually does, but it returns more than the requested 10 characters.
-->  
  <div>
    <seg>XXAAXXAAXX</seg>
    <seg><b xmlns="http://example.org/">YBBYYBBYY</b> Z</seg>
    <seg>CCZZCCZZ W</seg>
    <seg>DDWWD</seg>
  </div>
  <div>
    <seg>XXAAXXAAXX</seg>
    <seg>YYBBYYBBYY</seg>
    <seg>ZZCCZZCCZZ</seg>
    <seg>WWDDW</seg>
  </div>
  <div>
    <seg>XXAAXXAAXX</seg>
    <seg>YYBBYYBBYY</seg>
    <seg>ZCCZZCCZZ </seg>
    <seg>WDDWW</seg>
  </div>
  <seg>Thomas <em xmlns="http://example.org/">Pyn</em>cho</seg>
</body>
Comment 1 Piotr Banski 2010-06-01 09:09:57 UTC
Ah, I have now realised the obvious concerning the "Thomas Pynchon" example: string-range() does return exactly 10 characters, namely "Thomas " (7) + "cho" (3). It completely skips the embedded element, but it should not be able to distinguish

<p>Thomas <em>Pyn</em>chon</p>

from

<p>Thomas Pynchon</p>


So just to make matters clearer: the desired output of the last match is 

"Thomas Pyn" = 10 characters.
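
A minimal sketch of the expected behaviour, assuming string-range() works purely on the string-value: find each match, then take `length` characters starting at the 1-based `offset` from the start of the match. The function name string_range and its signature are invented for this illustration; it returns plain text rather than range locations, it assumes offset >= 1, and it is not libxml2's implementation.

```python
def string_range(string_value, needle, offset=1, length=None):
    """Hypothetical model: return the text selected for every match of needle.

    offset is 1-based and relative to the start of each match (assumed >= 1);
    length counts characters of the string-value and defaults to len(needle).
    An empty needle matches before every character and after the last one.
    """
    if length is None:
        length = len(needle)
    results = []
    start = 0
    while True:
        pos = string_value.find(needle, start)
        if pos == -1:
            break
        begin = pos + offset - 1
        results.append(string_value[begin:begin + length])
        start = pos + 1  # move on so overlapping and empty matches terminate
    return results


# String-value of <p>Thomas <em>Pyn</em>chon</p>
print(string_range("Thomas Pynchon", "Thomas Pynchon", 1, 10))

# String-value shared by the first three <p> elements of the source
reference = "XXAAXXAAXX YYBBYYBBYY ZZCCZZCCZZ WWDDWWDDWW."
for offset, length in [(12, 10), (23, 10), (34, 5)]:
    # [0] picks the first match, like the [1] predicate in the XPointer
    print(string_range(reference, "", offset, length)[0])
```

Under this model the last match yields "Thomas Pyn", and offsets 12, 23 and 34 yield the segments shown in the second (reference) <div> of the output, regardless of any embedded <b> element or comment.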
Comment 2 Piotr Banski 2010-06-02 11:19:54 UTC
And one more remark (sigh): it doesn't "completely skip" the embedded element: its text value is apparently (properly!) used for the purpose of *matching* the string, but the retrieval (improperly) works on the full element content instead of merely the string value.
Comment 3 Piotr Banski 2010-08-01 16:19:38 UTC
How extremely unfriendly of me not to have pasted at least part of the attachment that contains the XInclude directives. Here's the Thomas P. line, for starters:

 <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(//ex:p,'Thomas Pynchon',1,10))"/></seg>

What it says is:

- search the p elements for a match with 'Thomas Pynchon' -- this is *successful*
- return 10 characters starting with the first character of the match(es) -- this *fails*

------------
The crucial part of the attached file with XIncludes follows, for convenience.

<body xmlns="http://example.org/">
  <div>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[1],'',1,9)[1])"/></seg>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[1],'',12,10)[1])"/></seg>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[1],'',23,10)[1])"/></seg>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[1],'',34,5)[1])"/></seg>
  </div>
  <div>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[2],'',1,9)[1])"/></seg>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[2],'',12,10)[1])"/></seg>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[2],'',23,10)[1])"/></seg>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[2],'',34,5)[1])"/></seg>
  </div>
  <div>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[3],'',1,9)[1])"/></seg>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[3],'',12,10)[1])"/></seg>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[3],'',23,10)[1])"/></seg>
    <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(/ex:div/ex:p[3],'',34,5)[1])"/></seg>
  </div>
  <seg><include xmlns="http://www.w3.org/2001/XInclude" href="source-nested_element.xml" xpointer="xmlns(ex=http://example.org/) xpointer(string-range(//ex:p,'Thomas Pynchon',1,10))"/></seg>
</body>
Comment 4 GNOME Infrastructure Team 2021-07-05 13:21:20 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxml2/-/issues/

Thank you for your understanding and your help.