GNOME Bugzilla – Bug 643148
race in xmlNewInputStream
Last modified: 2012-05-15 03:50:09 UTC
This chunk of code is not thread safe: xmlParserInputPtr xmlNewInputStream(xmlParserCtxtPtr ctxt) { xmlParserInputPtr input; static int id = 0; ... /* * we don't care about thread reentrancy unicity for a single * parser context (and hence thread) is sufficient. */ input->id = id++; ... Assuming you want a particular serialized code path to always see an incrementing (unique) id, this is wrong. For example, the above code can result in the following output: assume the statement: id++ compiles into: load id increment register store id then with the following input: id = 0 thread A calls xmlNewInputStream once. thread B calls xmlNewInputStream 3 times. this is a reasonable timeline: A: load id in register: 0; id = 0 B: load id in register: 0; id = 0 B: increment register: 1; id = 0 B: store id: id = 1 B: load id in register: 1; id = 1 B: increment register: 2; id = 1 B: store id: id = 2 A: increment register: 1; id = 2 A: store id: id = 1 B: load id in register: 1; id = 1 B: increment register: 2; id = 2 B: store id: id = 2 So thread A sees one id value: 0. thread B sees the following series: 0, 1, 1. Which violates the desired contract. It's easy to construct a timeline where id goes backwards.
Created attachment 182317 [details] [review] Lock the input id increment in libxml using a mutex Proposed patch (v2.7.7, since that's what I have at work) is attached. Patch description: Lock the input id increment in libxml. This needs to be a monotonically increasing counter which cannot be achieved with a simple increment (it's easy to construct interleavings in which this count goes backwards, or repeats). libxml doesn't provide any atomic increment operations, so I've gone for the mutex approach, which is slower, but probably fine.
Created attachment 182850 [details] [review] Lock the input id increment in libxml using a mutex (attempt #2) Lock the input id increment in libxml. This needs to be a monotonically increasing counter which cannot be achieved with a simple increment (it's easy to construct interleavings in which this count goes backwards, or repeats). libxml doesn't provide any atomic increment operations, so I've gone for the mutex approach, which is slower, but probably fine. Now works if you aren't using libxml multithreaded (i.e., don't call xmlInitParser).
The key point is about "single parser context" i.e. we compare id on inputs all related to the same parser context. Libxml2 doesn't allow multiple threads to call API for the same parser context http://xmlsoft.org/threads.html "Note that the thread safety cannot be ensured for multiple threads sharing the same document, the locking must be done at the application level" same parser == same document So as explained in the comment your locking patch is not needed, thanks, but I'm not taking that one :-) Daniel
Please re-read my original report. Even if used correctly (one context being accessed only in one thread) this code is unsafe. It is possible for a single thread to get the same value for id multiple times (if some other thread interferes by calling this function with some other context).
Sorry your explanation is just plain wrong. 2 possible case: hardware threads i.e. you have two different set of physical registers and the scenario you gave can't happen because the registers 0/1/2 of thread A and B are distints. Or is soft threads and the very basic thing one lear to do when implementing threads is to save/restore register set when switching threads. With the kind of arguments you put forward, I can prove that you can never run a multithreaded program safely, because the concurrent access on the register set, not envisioned by the compiler means that you cannot guarantee the semantic of the high level operations when running them in such a situation. You basically assume a completely broken thread implementation, recheck you basic or please chase someone else, you are just wrong here. Or come with a reproducer ! Daniel
It's not about saving and restoring register sets. Static function-local variables have a single memory location for the whole process. You could, in theory, write code to look up a different 'id' for each parser context, but if the code llib posted is all that's there, you haven't done it. Once we've established that there's a single 'id' variable, we can look at how two threads might interleave in incrementing it. llib has actually already done this, but not precisely enough to show you how two threads can race: Processor/thread A uses r1, while processor/thread B uses r2. r1 and r2 might be on different hardware cores, or they might be simulated when the OS saves and restores registers: A: load id in r1: r1 == 0; id == 0 B: load id in r2: r2 == 0; id == 0 B: increment r2: r2 == 1; id == 0 B: store id from r2: id == 1 B: load id in r2: r2 == 1; id == 1 B: increment r2: r2 == 2; id == 1 B: store id from r2: id == 2 A: increment r1: r1 == 1; id == 2 A: store id from r1: id == 1 B: load id in r2: r2 == 1; id == 1 B: increment r2: r2 == 2; id == 2 B: store id from r2: id == 2 B's second increment was lost because A overwrote it. They didn't need to re-use each other's registers to do this, just share the memory location for 'id'.
Heh, I didn't expect this to be so popular ! Okay, I was wrong ! It is theorically possible for one of the threads to not see the increase due to the lack of synchronization between the threads. Sorry llib I really looked at that from the wrong angle ! One more reason to state that global variables are evil. Now in practice I doubt this happen but okay I'm convinced and I'm reopening this, I will look at the patch. I didn't want to increase the parser context with yet another counter and went with the global variable thinking it would be safe, since we revisit this I may just extend the context instead to avoid putting a lock completely artificial from a processing POV into the path of creating input buffers. thanks for taking the time to get me right, I learned something ! Daniel
BTW was that spotted just with manual code review or did valgrind/helgrind or other tools spotted the error ? I'm curious ... Daniel
Thanks Jeffrey for explaining what I could not. Thread-safety is subtle. Daniel: this was found using tsan: <code.google.com/p/data-race-test/wiki/ThreadSanitizer>.
Hum, quite interesting, so this was developped from the helgrind framework and chromium (and hence libxml2/libxslt) is being checked against it. Okay I will keep an eye, this may be something we could use too for other projects ... thanks ! Daniel
Okay, I commited a fix, moving the id counter to the parser context instead this avoid the race and also avoid putting any extraneous lock contention on the parser. http://git.gnome.org/browse/libxml2/commit/?id=0d51cfebc9eecf311e50eabdf1c2412211220e6d thanks again for getting me right :-) Daniel P.S.: I pointed tsan to other people, it could be interesting for qemu-kvm and libvirt checking as they are doing more and more threading and being able to do this kind of checks on a regular basis would be extremely useful. P.S.2: you should advertise tsan more, people don't know about it :-)
Thanks for fixing. Your solution looks good to me.