GNOME Bugzilla – Bug 768215
gparted crashes due to g_quark_from_static_string used in global initialization
Last modified: 2018-05-24 18:58:34 UTC
gparted build with musl libc crashes with this error: GLib:ERROR:gquark.c:60:g_quark_init: assertion failed: (quark_seq_id == 0) backtrace: Core was generated by `/usr/sbin/gpartedbin'. Program terminated with signal SIGABRT, Aborted.
+ Trace 236399
So what happens is that there is a global initialization using g_quark_from_static_string in glibmm/gio/src/application.ccg (https://git.gnome.org/browse/glibmm/tree/gio/src/application.ccg#n46) This call to g_quark_from_static_string will increase the quark_seq_id (via quark_new) and it happens before the glib_init ctor has called g_quark_init, so when glib_init tries to initialize quark, it will trigger the assert. This can be solved in various ways: - revert commit 2fe992 (Move quark initialization to a constructor) - call g_quark_init from quark_new (and have a static boolean to check that its only initialized once) - call g_quark_init from g_quark_from_static_string (with static boolean check) - say that its not allowed to use g_quark_from_static_string to initalize globals and tell glibmm to fix it other way.
Untested, possible fix: diff --git a/glib/gquark.c b/glib/gquark.c index 9e51a92..978f9ef 100644 --- a/glib/gquark.c +++ b/glib/gquark.c @@ -57,6 +57,11 @@ static gint quark_block_offset = 0; void g_quark_init (void) { + static gboolean quark_initialized = FALSE; + + if (quark_initialized) + return; + g_assert (quark_seq_id == 0); quark_ht = g_hash_table_new (g_str_hash, g_str_equal); quarks = g_new (gchar*, QUARK_BLOCK_SIZE); @@ -280,6 +285,7 @@ quark_new (gchar *string) GQuark quark; gchar **quarks_new; + g_quark_init (); if (quark_seq_id % QUARK_BLOCK_SIZE == 0) { quarks_new = g_new (gchar*, quark_seq_id + QUARK_BLOCK_SIZE);
there should of cource be an quark_initialized = TRUE; in there too
> This call to g_quark_from_static_string will increase the quark_seq_id (via > quark_new) and it happens before the glib_init ctor has called g_quark_init, so > when glib_init tries to initialize quark, it will trigger the assert. I don't see how this scenario can happen. Calling g_quark_from_static string before g_quark_init will run into an assertion when quark_from_string tries to look up the string quark_ht, which hasn't been created yet.
Where is the assert in quark_from_string? In g_hash_table_lookup? gpointer g_hash_table_lookup (GHashTable *hash_table, gconstpointer key) { ... g_return_val_if_fail (hash_table != NULL, NULL); node_index = g_hash_table_lookup_node (hash_table, key, &node_hash); ... } g_hash_table_lookup_node seems to have an assert but unless i totally misunderstand the g_return_val_if_fail, it will return NULL if its uninitialized instead of assert?
Created attachment 330639 [details] [review] gquark: make sure quark is initialized before create new quark I think that this might be a possible solution if you insist on calling g_quark_init from glib ctor. Tested on my Alpine Linux desktop (using musl libc) and it works.
g_return_val_if_fail is basically an assertion. All bets are off once you trigger one of those. Having that there does *not* mean passing NULL for the hash table is ok in any way.
Review of attachment 330639 [details] [review]: This defeats the entire purpose of moving quark initialization to a constructor. We do that precisely so we *don't* have to have this extra branch in every quark lookup
(In reply to Matthias Clasen from comment #7) > Review of attachment 330639 [details] [review] [review]: > > This defeats the entire purpose of moving quark initialization to a > constructor. We do that precisely so we *don't* have to have this extra > branch in every quark lookup Understand. Do you have any benchmarking test program so I can try different alternatives? Do you have any numbers that show how big the difference is in practice? I assume the difference is big enough to justify the C++ breakage. So the alternatives that are left are: - forbid using quark functions for global/static vars in C++. For example: https://git.gnome.org/browse/glibmm/tree/glib/glibmm/class.cc?id=f2cc4ebb113c48ad0aba46c1f67d4e8f65a69330#n152 This means it just needs to be documented so next time something breaks we can just point the to the doc saying that its a feature, not a bug. - add a function for C++ constructors: g_quark_new_from_static_string or similar GQuark g_quark_new_from_static_string (const gchar *string) { if (!quark_inited) { g_quark_init(); quark_inited = TRUE; } return q_guark_new_from_static_string(string); }
Created attachment 330942 [details] [review] 0001-gquark-simplify-locking-code.patch This patch only simplifies the codebase without fixing the issue.
Created attachment 330943 [details] [review] 0002-gquark-move-string-check-to-quark_from_string.patch A second codebase simplification without any change in behavior. I decided to split it to make it easier to see what is going on.
Created attachment 330944 [details] [review] 0003-gquark-fix-initialization-with-c-constructors.patch This third patch fixes the issue at hand, but together with the 2 previous patches the final result will make codebase simpler. glib/gquark.c | 57 ++++++++++++-------------------------------------------- 1 file changed, 12 insertions(+), 45 deletions(-)
(In reply to ncopa from comment #9) > Created attachment 330942 [details] [review] [review] > 0001-gquark-simplify-locking-code.patch > > This patch only simplifies the codebase without fixing the issue. Sorry this is broken. We need protect the quarks[quark] with the global lock.
Review of attachment 330944 [details] [review]: Same comment as the previous patch: this defeats the purpose of moving to a constructor. It brings the branch back.
What is the main motive to remove the branch? performance?
I'm not sufficiently familiar with the code to write a patch at the moment, but on a high level the following approach should work with no runtime costs except a tiny constant number of branches during startup. It's not a hack but a canonical solution to the problem which is portable and clean. Add a call to g_quark_init to any ctor that depends on g_quark already being initialized. Only ctors need this special treatment; code called after all ctors have run can assume that g_quark's own ctor handled the issue. In g_quark_init, move all code out of the actual body to a new function g_quark_init_body and instead g_quark_init simply call pthread_once(&g_quark_init_once, g_quark_init_body). On the musl libc side we are not going to make a change whose sole purpose is to make code that's missing explicit initialization ordering (and thus broken with static linking) so that it works but only in a dynamic-linking environment. All that does is encourage the treatment of static linking as having second-class status.
Supporting static linking is not a goal for GLib.
And supporting intentionally broken software with hostile maintainers not a goal for musl.
Alright; I think you can patch things on your side as you see fit, then.
I'm not the one doing the patching; that's a burden you're putting on our mutual users (mainly distributions like Alpine since they're the ones who'll end up doing and maintaining the patches). I've already spent a great deal of time trying to address this bug constructively, and without impacting performance since I got the impression that the bug was introduced in the interest of avoiding branches at runtime. If there's really no interest in an upstream fix, I would recommend a minimally-invasive fix, just putting a pthread_once call to the init function in every externally visible function that's part of the g_quark API that depends on init having completed. This is trivial to maintain and does not involve introducing any new architectural complexity like cross-library init-function calls. And I strongly suspect it will not make any measurable performance difference either.
> If there's really no interest in an upstream fix Sure there is interest in code improvements. Just not in trading insults. But maybe we can get beyond that. I would suggest that an acceptable fix would either need to preserve the quality of keeping initialization out of the runtime calls or be an ifdef MUSL or ifdef NO_CONSTRUCTORS thing.
I'd say: - say that its not allowed to use g_quark_from_static_string to initalize globals and tell glibmm to fix it other way.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/1177.