GNOME Bugzilla – Bug 704903
dict.c randomization is ineffective and interferes with debugging
Last modified: 2021-07-05 13:22:23 UTC
I recently had to track down a memory corruption bug in a program using libxml2, and setting watchpoints to catch the corruption turned out to be futile, since the address of the object being corrupted kept changing on each run, despite gdb turning off ASLR. The culprit turned out to be libxml2's randomization in dict.c; disabling that (by interposing the "xkcd implementation" of rand_r) solved the problem.

I understand that the purpose of this randomization is to secure against DoS via hash collisions. However, the current implementation is ineffective against such attacks (the seed is time(), which is completely predictable, rather than something secure like /dev/urandom or clock_gettime's tv_nsec) and only serves to make debugging difficult.

As a fix, I would propose starting off with a fixed seed, and only re-seeding, using clock_gettime with tv_nsec, if a pathologically high collision rate is detected, indicating high likelihood of a DoS attack. This would make programs using libxml2 deterministic under normal inputs and would provide real security against attacks.
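To make the proposal concrete, here is a minimal sketch of the "fixed seed, re-seed only on pathological collisions" idea. It is not libxml2 code: the `demo_*` names, the fixed bucket count, the chain-length threshold of 8 and the seeded FNV-1a hash are all assumptions made for illustration. The re-seed uses clock_gettime's tv_nsec as the report suggests; a hardened version would probably prefer a kernel entropy source such as /dev/urandom.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define DEMO_BUCKETS    64           /* hypothetical fixed table size */
#define DEMO_MAX_CHAIN  8            /* hypothetical "pathological" chain length */
#define DEMO_FIXED_SEED 0x12345678u  /* hypothetical deterministic start seed */

typedef struct demo_entry {
    struct demo_entry *next;
    char *key;
} demo_entry;

typedef struct {
    demo_entry *buckets[DEMO_BUCKETS];
    uint32_t seed;
    int reseeded;                    /* set once the fixed seed is abandoned */
} demo_dict;

/* FNV-1a style hash with the seed used as the starting value. */
static uint32_t demo_hash(uint32_t seed, const char *key) {
    uint32_t h = seed;
    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= 16777619u;
    }
    return h;
}

static void demo_dict_init(demo_dict *d) {
    memset(d, 0, sizeof *d);
    d->seed = DEMO_FIXED_SEED;       /* same layout on every run for normal input */
}

/* Pick a new seed from the clock and move every entry to its new bucket. */
static void demo_reseed(demo_dict *d) {
    struct timespec ts;
    demo_entry *all = NULL, *e, *next;
    size_t i;

    clock_gettime(CLOCK_REALTIME, &ts);
    d->seed ^= (uint32_t)ts.tv_nsec ^ (uint32_t)ts.tv_sec;
    d->reseeded = 1;

    for (i = 0; i < DEMO_BUCKETS; i++) {     /* unlink all entries */
        for (e = d->buckets[i]; e; e = next) {
            next = e->next;
            e->next = all;
            all = e;
        }
        d->buckets[i] = NULL;
    }
    for (e = all; e; e = next) {             /* reinsert under the new seed */
        next = e->next;
        i = demo_hash(d->seed, e->key) % DEMO_BUCKETS;
        e->next = d->buckets[i];
        d->buckets[i] = e;
    }
}

static int demo_dict_contains(const demo_dict *d, const char *key) {
    const demo_entry *e = d->buckets[demo_hash(d->seed, key) % DEMO_BUCKETS];
    for (; e; e = e->next)
        if (strcmp(e->key, key) == 0)
            return 1;
    return 0;
}

/* Returns 1 if added, 0 if already present, -1 on allocation failure. */
static int demo_dict_add(demo_dict *d, const char *key) {
    size_t i = demo_hash(d->seed, key) % DEMO_BUCKETS;
    size_t chain = 0;
    demo_entry *e;

    for (e = d->buckets[i]; e; e = e->next) {
        if (strcmp(e->key, key) == 0)
            return 0;
        chain++;
    }
    e = malloc(sizeof *e);
    if (!e)
        return -1;
    e->key = malloc(strlen(key) + 1);
    if (!e->key) {
        free(e);
        return -1;
    }
    strcpy(e->key, key);
    e->next = d->buckets[i];
    d->buckets[i] = e;

    /* Re-seed at most once, and only when one chain grows suspiciously long. */
    if (chain + 1 > DEMO_MAX_CHAIN && !d->reseeded)
        demo_reseed(d);
    return 1;
}

static void demo_dict_free(demo_dict *d) {
    size_t i;
    for (i = 0; i < DEMO_BUCKETS; i++) {
        demo_entry *e = d->buckets[i], *next;
        for (; e; e = next) {
            next = e->next;
            free(e->key);
            free(e);
        }
        d->buckets[i] = NULL;
    }
}
```

With ordinary input the seed never changes, so bucket placement and allocation order are identical from run to run, which is what makes watchpoints usable. The sketch re-seeds only once and never grows the table; a real dictionary would need to distinguish "table is simply full" from "keys were chosen to collide" before re-seeding.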
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxml2/-/issues/ Thank you for your understanding and your help.