After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 768215 - gparted crashes due to g_quark_from_static_string used in global initialization
gparted crashes due to g_quark_from_static_string used in global initialization
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: gobject
2.48.x
Other Linux
: Normal major
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2016-06-29 20:16 UTC by ncopa
Modified: 2018-05-24 18:58 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gquark: make sure quark is initialized before create new quark (1.37 KB, patch)
2016-06-30 07:59 UTC, ncopa
none Details | Review
0001-gquark-simplify-locking-code.patch (2.53 KB, patch)
2016-07-06 11:15 UTC, ncopa
none Details | Review
0002-gquark-move-string-check-to-quark_from_string.patch (1.88 KB, patch)
2016-07-06 11:17 UTC, ncopa
none Details | Review
0003-gquark-fix-initialization-with-c-constructors.patch (1.46 KB, patch)
2016-07-06 11:20 UTC, ncopa
rejected Details | Review

Description ncopa 2016-06-29 20:16:11 UTC
gparted build with musl libc crashes with this error:

GLib:ERROR:gquark.c:60:g_quark_init: assertion failed: (quark_seq_id == 0)

backtrace:

Core was generated by `/usr/sbin/gpartedbin'.
Program terminated with signal SIGABRT, Aborted.
  • #0 __syscall4
    at ./arch/x86_64/syscall_arch.h line 38
  • #0 __syscall4
    at ./arch/x86_64/syscall_arch.h line 38
  • #1 __restore_sigs
    at src/signal/block.c line 43
  • #2 raise
    at src/signal/raise.c line 13
  • #3 abort
    at src/exit/abort.c line 7
  • #4 g_assertion_message
  • #5 g_assertion_message_expr
    at gtestutils.c line 2452
  • #6 g_quark_init
    at gquark.c line 60
  • #7 glib_init
    at glib-init.c line 243
  • #8 do_init_fini
    at ldso/dynlink.c line 1233
  • #9 __libc_start_main
  • #10 _start_c
    at crt/crt1.c line 17
  • #11 _start
  • #0 quark_new
    at gquark.c line 299
  • #1 quark_from_string
    at gquark.c line 186
  • #2 g_quark_from_static_string
    at gquark.c line 246
  • #3 ??
    from /usr/lib/libgiomm-2.4.so.1
  • #4 do_init_fini
    at ldso/dynlink.c line 1233
  • #5 __libc_start_main
  • #6 _start_c
    at crt/crt1.c line 17
  • #7 _start


So what happens is that there is a global initialization using g_quark_from_static_string in glibmm/gio/src/application.ccg
(https://git.gnome.org/browse/glibmm/tree/gio/src/application.ccg#n46)

This call to g_quark_from_static_string will increase the quark_seq_id (via quark_new) and it happens before the glib_init ctor has called g_quark_init, so when glib_init tries to initialize quark, it will trigger the assert.

This can be solved in various ways:
- revert commit 2fe992 (Move quark initialization to a constructor)
- call g_quark_init from quark_new (and have a static boolean to check that its only initialized once)
- call g_quark_init from g_quark_from_static_string (with static boolean check)
- say that its not allowed to use g_quark_from_static_string to initalize globals and tell glibmm to fix it other way.
Comment 1 ncopa 2016-06-29 20:19:50 UTC
Untested, possible fix:

diff --git a/glib/gquark.c b/glib/gquark.c
index 9e51a92..978f9ef 100644
--- a/glib/gquark.c
+++ b/glib/gquark.c
@@ -57,6 +57,11 @@ static gint           quark_block_offset = 0;
 void
 g_quark_init (void)
 {
+  static gboolean quark_initialized = FALSE;
+
+  if (quark_initialized)
+    return;
+
   g_assert (quark_seq_id == 0);
   quark_ht = g_hash_table_new (g_str_hash, g_str_equal);
   quarks = g_new (gchar*, QUARK_BLOCK_SIZE);
@@ -280,6 +285,7 @@ quark_new (gchar *string)
   GQuark quark;
   gchar **quarks_new;
 
+  g_quark_init ();
   if (quark_seq_id % QUARK_BLOCK_SIZE == 0)
     {
       quarks_new = g_new (gchar*, quark_seq_id + QUARK_BLOCK_SIZE);
Comment 2 ncopa 2016-06-29 20:22:53 UTC
there should of cource be an

  quark_initialized = TRUE;

in there too
Comment 3 Matthias Clasen 2016-06-30 03:56:51 UTC
> This call to g_quark_from_static_string will increase the quark_seq_id (via
> quark_new) and it happens before the glib_init ctor has called g_quark_init, so
> when glib_init tries to initialize quark, it will trigger the assert.

I don't see how this scenario can happen. Calling g_quark_from_static string before g_quark_init will run into an assertion when quark_from_string tries to look up the string quark_ht, which hasn't been created yet.
Comment 4 ncopa 2016-06-30 06:14:34 UTC
Where is the assert in quark_from_string?

In g_hash_table_lookup?

gpointer
g_hash_table_lookup (GHashTable    *hash_table,
                     gconstpointer  key)
{
  ...
  g_return_val_if_fail (hash_table != NULL, NULL);

  node_index = g_hash_table_lookup_node (hash_table, key, &node_hash);
  ...
}



g_hash_table_lookup_node seems to have an assert but unless i totally misunderstand the g_return_val_if_fail, it will return NULL if its uninitialized instead of assert?
Comment 5 ncopa 2016-06-30 07:59:30 UTC
Created attachment 330639 [details] [review]
gquark: make sure quark is initialized before create new quark

I think that this might be a possible solution if you insist on calling g_quark_init from glib ctor.

Tested on my Alpine Linux desktop (using musl libc) and it works.
Comment 6 Matthias Clasen 2016-06-30 13:50:05 UTC
g_return_val_if_fail is basically an assertion. All bets are off once you trigger one of those. Having that there does *not* mean passing NULL for the hash table is ok in any way.
Comment 7 Matthias Clasen 2016-06-30 13:51:29 UTC
Review of attachment 330639 [details] [review]:

This defeats the entire purpose of moving quark initialization to a constructor. We do that precisely so we *don't* have to have this extra branch in every quark lookup
Comment 8 ncopa 2016-07-06 08:35:41 UTC
(In reply to Matthias Clasen from comment #7)
> Review of attachment 330639 [details] [review] [review]:
> 
> This defeats the entire purpose of moving quark initialization to a
> constructor. We do that precisely so we *don't* have to have this extra
> branch in every quark lookup


Understand. Do you have any benchmarking test program so I can try different alternatives? Do you have any numbers that show how big the difference is in practice? I assume the difference is big enough to justify the C++ breakage.

So the alternatives that are left are:

- forbid using quark functions for global/static vars in C++. For example:

https://git.gnome.org/browse/glibmm/tree/glib/glibmm/class.cc?id=f2cc4ebb113c48ad0aba46c1f67d4e8f65a69330#n152

This means it just needs to be documented so next time something breaks we can just point the to the doc saying that its a feature, not a bug.

- add a function for C++ constructors: g_quark_new_from_static_string or similar

GQuark
g_quark_new_from_static_string (const gchar *string)
{
  if (!quark_inited) {
    g_quark_init();
    quark_inited = TRUE;
  }
  return q_guark_new_from_static_string(string);
}
Comment 9 ncopa 2016-07-06 11:15:10 UTC
Created attachment 330942 [details] [review]
0001-gquark-simplify-locking-code.patch

This patch only simplifies the codebase without fixing the issue.
Comment 10 ncopa 2016-07-06 11:17:41 UTC
Created attachment 330943 [details] [review]
0002-gquark-move-string-check-to-quark_from_string.patch

A second codebase simplification without any change in behavior. I decided to split it to make it easier to see what is going on.
Comment 11 ncopa 2016-07-06 11:20:56 UTC
Created attachment 330944 [details] [review]
0003-gquark-fix-initialization-with-c-constructors.patch

This third patch fixes the issue at hand, but together with the 2 previous patches the final result will make codebase simpler.

glib/gquark.c | 57 ++++++++++++--------------------------------------------
1 file changed, 12 insertions(+), 45 deletions(-)
Comment 12 ncopa 2016-07-06 11:44:00 UTC
(In reply to ncopa from comment #9)
> Created attachment 330942 [details] [review] [review]
> 0001-gquark-simplify-locking-code.patch
> 
> This patch only simplifies the codebase without fixing the issue.

Sorry this is broken. We need protect the quarks[quark] with the global lock.
Comment 13 Matthias Clasen 2016-07-06 12:58:43 UTC
Review of attachment 330944 [details] [review]:

Same comment as the previous patch: this defeats the purpose of moving to a constructor. It brings the branch back.
Comment 14 ncopa 2016-07-06 16:25:04 UTC
What is the main motive to remove the branch? performance?
Comment 15 bugdal 2016-07-06 17:42:17 UTC
I'm not sufficiently familiar with the code to write a patch at the moment, but on a high level the following approach should work with no runtime costs except a tiny constant number of branches during startup. It's not a hack but a canonical solution to the problem which is portable and clean.

Add a call to g_quark_init to any ctor that depends on g_quark already being initialized. Only ctors need this special treatment; code called after all ctors have run can assume that g_quark's own ctor handled the issue.

In g_quark_init, move all code out of the actual body to a new function g_quark_init_body and instead g_quark_init simply call pthread_once(&g_quark_init_once, g_quark_init_body).

On the musl libc side we are not going to make a change whose sole purpose is to make code that's missing explicit initialization ordering (and thus broken with static linking) so that it works but only in a dynamic-linking environment. All that does is encourage the treatment of static linking as having second-class status.
Comment 16 Matthias Clasen 2016-07-06 17:46:27 UTC
Supporting static linking is not a goal for GLib.
Comment 17 bugdal 2016-07-06 18:16:17 UTC
And supporting intentionally broken software with hostile maintainers not a goal for musl.
Comment 18 Matthias Clasen 2016-07-06 19:20:02 UTC
Alright; I think you can patch things on your side as you see fit, then.
Comment 19 bugdal 2016-07-06 21:51:21 UTC
I'm not the one doing the patching; that's a burden you're putting on our mutual users (mainly distributions like Alpine since they're the ones who'll end up doing and maintaining the patches). I've already spent a great deal of time trying to address this bug constructively, and without impacting performance since I got the impression that the bug was introduced in the interest of avoiding branches at runtime.

If there's really no interest in an upstream fix, I would recommend a minimally-invasive fix, just putting a pthread_once call to the init function in every externally visible function that's part of the g_quark API that depends on init having completed. This is trivial to maintain and does not involve introducing any new architectural complexity like cross-library init-function calls. And I strongly suspect it will not make any measurable performance difference either.
Comment 20 Matthias Clasen 2016-07-07 14:45:39 UTC
> If there's really no interest in an upstream fix

Sure there is interest in code improvements. Just not in trading insults. But maybe we can get beyond that.

I would suggest that an acceptable fix would either need to preserve the quality of keeping initialization out of the runtime calls or be an ifdef MUSL or ifdef NO_CONSTRUCTORS thing.
Comment 21 Colin Walters 2016-11-22 18:19:53 UTC
I'd say:

- say that its not allowed to use g_quark_from_static_string to initalize globals and tell glibmm to fix it other way.
Comment 22 GNOME Infrastructure Team 2018-05-24 18:58:34 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/1177.