GNOME Bugzilla – Bug 756139
musl: ctors called in the wrong order
Last modified: 2016-03-13 08:27:27 UTC
glib-2.46 expects the ctor of libgobject to run after the ctor of libglib. musl does not guarantee that they run in any particular order. As a result, the ctor of libgobject skips initialization and leaves some static fields uninitialized, which crashes various glib-based applications. Here is an example from the build of gobject-introspection-1.46.0, which suffers from exactly this issue:
----
(process:4826): GLib-CRITICAL **: g_hash_table_lookup: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_insert_internal: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_lookup: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_insert_internal: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_lookup: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_insert_internal: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_lookup: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_insert_internal: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_lookup: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_insert_internal: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_lookup: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_insert_internal: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_lookup: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_insert_internal: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_lookup: assertion 'hash_table != NULL' failed
(process:4826): GLib-CRITICAL **: g_hash_table_insert_internal: assertion 'hash_table != NULL' failed
**
GLib-GObject:ERROR:gtype.c:2747:g_type_register_static: assertion failed: (static_quark_type_flags)
/bin/sh: line 1: 4826 Aborted env PATH=".libs:/builddir/.xbps-gobject-introspection/wrappers:/usr/lib/ccache/bin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin" ./g-ir-compiler --includedir=. --includedir=./gir --includedir=. --includedir=. --includedir=./gir --includedir=. gir/freetype2-2.0.gir -o gir/freetype2-2.0.typelib
----
To fix this issue, we added the following patch: https://github.com/voidlinux/void-packages/blob/master/srcpkgs/glib/patches/quark_init_on_demand.patch
Are there any chances of getting musl to call the constructors in order?
I think you should consider fixing musl's linker to do the right thing. Nondeterministic order makes constructors much less useful.
We discussed this extensively in #musl and, while we're not opposed to improvements in ctor ordering in the dynamic linker, no possible changes we could make to the dynamic linker would fix the static-linking case, which glib also broke with this dependency between ctors. If ctor B depends on ctor A already having been run, then ctor B should call ctor A, and ctor A should be safe to invoke multiple times (this probably means using something like pthread_once/call_once if there are runtime-loading scenarios where concurrent calls would be possible).
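The pattern described in the comment above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from glib or from the attached patches: the names base_init, base_ctor and dependent_ctor are hypothetical stand-ins for "ctor A" and "ctor B", and the sketch relies on the GCC/Clang constructor attribute together with POSIX pthread_once.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for state owned by the "base" library. */
static pthread_once_t base_once = PTHREAD_ONCE_INIT;
static int base_table_ready;  /* plays the role of the table the base ctor sets up */
static int base_init_runs;    /* counts how often the real init body has run */
static int dependent_ready;   /* set once the dependent ctor has finished */

/* The actual initialization work; pthread_once runs it at most once. */
static void base_init_body(void)
{
    base_init_runs++;
    base_table_ready = 1;
}

/* "ctor A" as a callable function: safe to invoke repeatedly and
 * from several threads, because pthread_once serializes the first call. */
void base_init(void)
{
    pthread_once(&base_once, base_init_body);
}

/* The base library still initializes itself at load time. */
__attribute__((constructor))
static void base_ctor(void)
{
    base_init();
}

/* "ctor B": makes no assumption about ctor order and calls A itself. */
__attribute__((constructor))
static void dependent_ctor(void)
{
    base_init();
    assert(base_table_ready);  /* holds whichever ctor happened to run first */
    dependent_ready = 1;
}
```

Whichever of the two constructors the loader (or the static link order) happens to run first, the initialization body executes exactly once and the dependent constructor always sees initialized state. On toolchains where pthread_once lives in a separate library, the sketch would need to be built with -pthread.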
Created attachment 312778: patch1
Created attachment 312779: patch2
Created attachment 312780: patch3
Can't say I'm a great fan of this, but it should get the job done. Alternatively, we could just export glib_init() publicly, and document it as "you don't ever have to call this".
That looks like it should work. As long as glib_init is called as a ctor I don't think there's any need for pthread_once type magic, since glib having been loaded will not be visible until after dlopen returns, in which case all ctors have run. My concern there would only apply if it were a public API that code using glib had to call before using glib.
Review of attachment 312778: OK.
Review of attachment 312779: OK.
Review of attachment 312780: I'd say it's worth adding a comment above the added line here which links to the bug, to ensure the history/rationale is easily found.
Just for future reference, there was a regression fixed by https://git.gnome.org/browse/glib/commit/?id=99ff9bb5e0ef261e39cb3c67a2d212f6bbeb99e4
*** Bug 757083 has been marked as a duplicate of this bug. ***
Even though this is resolved, as further justification: I am working on getting constructors working on AIX. glib-2.38 worked but 2.46 didn't, because the constructor in gobject was run before the constructor in glib. I do not know if there is a way to specify the ordering; I am still looking into it. Anyway, this patch makes it moot.
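On the question of specifying the ordering: GCC and Clang accept an optional priority on the constructor attribute, sketched below. This is an assumption-laden illustration rather than a fix for the case in this bug. As far as I know the priority only orders constructors within a single linked image (one executable or one shared object), so it would not order libglib against libgobject, and whether it behaves the same way on AIX is not something this sketch establishes. The function names are hypothetical.

```c
#include <assert.h>

static int order[2];  /* records which constructor ran in which slot */
static int count;

/* Deliberately defined first in the source, but given the later priority. */
__attribute__((constructor(102)))
static void second_ctor(void)
{
    order[count++] = 2;
}

/* Lower priority numbers run earlier; values up to 100 are reserved
 * for the implementation, so user code starts at 101. */
__attribute__((constructor(101)))
static void first_ctor(void)
{
    order[count++] = 1;
}
```

Because the priorities say nothing about other libraries, having the dependent constructor call the base initializer directly, as the patches here do, remains the portable approach.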