GNOME Bugzilla – Bug 672665
Race condition in erroneous dlsym error detection scheme
Last modified: 2013-02-03 06:26:08 UTC
_g_module_symbol in gmodule-dl.c calls dlerror before and after dlsym, and treats any error returned by dlerror as a failure of the dlsym call. Presumably this hackery is an attempt to deal with the fact that a symbol's address can be 0 without the lookup having failed, but it is dangerously incorrect.

The dlerror interface (per POSIX) is not thread-safe and is required to return the most recent error that occurred in dlopen or dlsym, regardless of which thread the error occurred in. It can therefore give a false positive if another thread calls dlopen or dlsym and receives an error between the two dlerror calls in _g_module_symbol. Conversely, it can give a false negative if another thread calls dlerror in that window and steals (clears) the error status. This bug does not manifest with glibc because glibc's dlerror function is non-conformant: it returns a thread-local error status rather than the POSIX-required global one.

A safe compromise (still not ideal) would be to do the dlerror check only if dlsym returned a null pointer. That is, a non-null return value from dlsym should always be considered success (POSIX specifies this anyway). Only a null return is ambiguous (a symbol with a null address, or an error), and that ambiguity can be resolved with dlerror, though still with the ugly race condition.

The approach I would prefer is to remove the dlerror hackery entirely and treat null symbols as not-found/error, and non-null symbols of course as success.
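
For illustration only, the two options could be sketched roughly as follows. The helper names lookup_symbol_compromise and lookup_symbol_strict are hypothetical and are not taken from gmodule-dl.c; this is a minimal sketch of the idea under those assumptions, not a proposed patch.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the real gmodule-dl.c code): the compromise.
 * A non-null result from dlsym is always treated as success; dlerror is
 * consulted only when dlsym returns NULL. On return, *err_out is NULL on
 * success (including a symbol whose address really is null) or points to
 * the dlerror message on failure.
 */
static void *lookup_symbol_compromise(void *handle, const char *name,
                                      const char **err_out)
{
    void *sym;

    assert(name != NULL && err_out != NULL);
    *err_out = NULL;

    /* Clear any stale error so a later dlerror() reflects this lookup.
     * On a dlerror with global state this can still steal another
     * thread's error, so the race is narrowed, not removed. */
    (void) dlerror();

    sym = dlsym(handle, name);
    if (sym != NULL)
        return sym;          /* unambiguous success: never ask dlerror */

    /* NULL is ambiguous: null-valued symbol or failed lookup. */
    *err_out = dlerror();
    return NULL;
}

/*
 * Hypothetical helper: the preferred approach. No dlerror calls at all;
 * a NULL result is simply reported as not-found/error.
 */
static void *lookup_symbol_strict(void *handle, const char *name)
{
    assert(name != NULL);
    return dlsym(handle, name);
}
```

The strict variant gives up the ability to look up a symbol whose address is legitimately null, which is the trade-off accepted by the preferred approach above.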
*** This bug has been marked as a duplicate of bug 646342 ***