Bug 643148 – race in xmlNewInputStream

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 643148 - race in xmlNewInputStream


Summary:	race in xmlNewInputStream


Status:	RESOLVED FIXED

Product:	libxml2
Classification:	Platform
Component:	general
Version:	git master
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Daniel Veillard
QA Contact:	libxml QA maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2011-02-24 00:10 UTC by llib
Modified:	2012-05-15 03:50 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Lock the input id increment in libxml using a mutex (2.75 KB, patch) 2011-03-03 00:18 UTC, llib	none	Details \| Review
Lock the input id increment in libxml using a mutex (attempt #2) (2.95 KB, patch) 2011-03-08 18:53 UTC, llib	none	Details \| Review

Description llib 2011-02-24 00:10:02 UTC

This chunk of code is not thread safe:

xmlParserInputPtr
xmlNewInputStream(xmlParserCtxtPtr ctxt) {
    xmlParserInputPtr input;
    static int id = 0;
...

    /*
     * we don't care about thread reentrancy unicity for a single
     * parser context (and hence thread) is sufficient.
     */
    input->id = id++;
...

Assuming you want a particular serialized code path to always see an incrementing (unique) id, this is wrong.  For example, the above code can result in the following output:

assume the statement:
  id++
compiles into:
  load id
  increment register
  store id

then with the following input:

id = 0

thread A calls xmlNewInputStream once.
thread B calls xmlNewInputStream 3 times.

this is a reasonable timeline:

A: load id in register: 0; id = 0
B: load id in register: 0; id = 0
B: increment register: 1; id = 0
B: store id: id = 1
B: load id in register: 1; id = 1
B: increment register: 2; id = 1
B: store id: id = 2
A: increment register: 1; id = 2
A: store id: id = 1
B: load id in register: 1; id = 1
B: increment register: 2; id = 2
B: store id: id = 2

So thread A sees one id value: 0.
thread B sees the following series: 0, 1, 1.  Which violates the desired contract.
It's easy to construct a timeline where id goes backwards.

Comment 1 llib 2011-03-03 00:18:14 UTC

Created attachment 182317 [details] [review]
Lock the input id increment in libxml using a mutex

Proposed patch (v2.7.7, since that's what I have at work) is attached.

Patch description:
	Lock the input id increment in libxml.  This needs to be a monotonically increasing counter which cannot be achieved with a simple increment (it's easy to construct interleavings in which this count goes backwards, or repeats).
	libxml doesn't provide any atomic increment operations, so I've gone for the mutex approach, which is slower, but probably fine.

Comment 2 llib 2011-03-08 18:53:33 UTC

Created attachment 182850 [details] [review]
Lock the input id increment in libxml using a mutex (attempt #2)

    Lock the input id increment in libxml.  This needs to be a monotonically
increasing counter which cannot be achieved with a simple increment (it's easy
to construct interleavings in which this count goes backwards, or repeats).
    libxml doesn't provide any atomic increment operations, so I've gone for
the mutex approach, which is slower, but probably fine.
    Now works if you aren't using libxml multithreaded (i.e., don't call xmlInitParser).

Comment 3 Daniel Veillard 2012-05-10 10:35:45 UTC

The key point is about "single parser context" i.e. we compare id on inputs
all related to the same parser context. Libxml2 doesn't allow multiple threads
to call API for the same parser context 
  http://xmlsoft.org/threads.html

  "Note that the thread safety cannot be ensured for multiple threads sharing   
   the same document, the locking must be done at the application level"

 same parser == same document

 So as explained in the comment your locking patch is not needed,

  thanks, but I'm not taking that one :-)

Daniel

Comment 4 llib 2012-05-12 17:00:19 UTC

Please re-read my original report.
Even if used correctly (one context being accessed only in one thread) this code is unsafe.  It is possible for a single thread to get the same value for id multiple times (if some other thread interferes by calling this function with some other context).

Comment 5 Daniel Veillard 2012-05-13 02:30:12 UTC

Sorry your explanation is just plain wrong.
2 possible case: hardware threads i.e. you have two different
set of physical registers and the scenario you gave can't happen
because the registers 0/1/2 of thread A and B are distints.
Or is soft threads and the very basic thing one lear to do when
implementing threads is to save/restore register set when switching
threads.
With the kind of arguments you put forward, I can prove that you can
never run a multithreaded program safely, because the concurrent access
on the register set, not envisioned by the compiler means that you
cannot guarantee the semantic of the high level operations when
running them in such a situation.
You basically assume a completely broken thread implementation,
recheck you basic or please chase someone else, you are just wrong here.
Or come with a reproducer !

Daniel

Comment 6 Jeffrey Yasskin 2012-05-13 09:47:03 UTC

It's not about saving and restoring register sets. Static function-local variables have a single memory location for the whole process. You could, in theory, write code to look up a different 'id' for each parser context, but if the code llib posted is all that's there, you haven't done it.

Once we've established that there's a single 'id' variable, we can look at how two threads might interleave in incrementing it. llib has actually already done this, but not precisely enough to show you how two threads can race:

Processor/thread A uses r1, while processor/thread B uses r2. r1 and r2 might be on different hardware cores, or they might be simulated when the OS saves and restores registers:

A: load id in r1: r1 == 0; id == 0
B: load id in r2: r2 == 0; id == 0
B: increment r2: r2 == 1; id == 0
B: store id from r2: id == 1
B: load id in r2: r2 == 1; id == 1
B: increment r2: r2 == 2; id == 1
B: store id from r2: id == 2
A: increment r1: r1 == 1; id == 2
A: store id from r1: id == 1
B: load id in r2: r2 == 1; id == 1
B: increment r2: r2 == 2; id == 2
B: store id from r2: id == 2

B's second increment was lost because A overwrote it. They didn't need to re-use each other's registers to do this, just share the memory location for 'id'.

Comment 7 Daniel Veillard 2012-05-13 13:01:20 UTC

Heh, I didn't expect this to be so popular !

Okay, I was wrong ! It is theorically possible for one of the threads
to not see the increase due to the lack of synchronization between the
threads. Sorry llib I really looked at that from the wrong angle !

One more reason to state that global variables are evil.
Now in practice I doubt this happen but okay I'm convinced and I'm reopening
this, I will look at the patch.

I didn't want to increase the parser context with yet another counter
and went with the global variable thinking it would be safe, since we revisit
this I may just extend the context instead to avoid putting a lock
completely artificial from a processing POV into the path of creating
input buffers. 

  thanks for taking the time to get me right, I learned something !

Daniel

Comment 8 Daniel Veillard 2012-05-13 13:57:28 UTC

BTW was that spotted just with manual code review or did valgrind/helgrind
or other tools spotted the error ? I'm curious ...

Daniel

Comment 9 llib 2012-05-13 16:45:22 UTC

Thanks Jeffrey for explaining what I could not.  Thread-safety is subtle.
Daniel: this was found using tsan: <code.google.com/p/data-race-test/wiki/ThreadSanitizer>.

Comment 10 Daniel Veillard 2012-05-14 01:28:19 UTC

Hum, quite interesting, so this was developped from the helgrind framework
and chromium (and hence libxml2/libxslt) is being checked against it.
Okay I will keep an eye, this may be something we could use too for
other projects ...

  thanks !

Daniel

Comment 11 Daniel Veillard 2012-05-15 03:30:24 UTC

Okay, I commited a fix, moving the id counter to the parser context instead
this avoid the race and also avoid putting any extraneous lock contention
on the parser.

 http://git.gnome.org/browse/libxml2/commit/?id=0d51cfebc9eecf311e50eabdf1c2412211220e6d

  thanks again for getting me right :-)

Daniel

P.S.: I pointed tsan to other people, it could be interesting for qemu-kvm
   and libvirt checking as they are doing more and more threading and being
   able to do this kind of checks on a regular basis would be extremely
   useful.
P.S.2: you should advertise tsan more, people don't know about it :-)

Comment 12 llib 2012-05-15 03:50:09 UTC

Thanks for fixing.  Your solution looks good to me.