GNOME Bugzilla – Bug 613943
Native regex literal compilation should be guarded by g_once_init_enter/leave
Last modified: 2010-03-31 19:41:19 UTC
Hi, currently the native regex literals are compiled into a static variable the first time they're used but unfortunately without making sure to do this in a threadsafe way. The g_regex_new() call should be wrapped inside a g_once_init_enter() / g_once_init_leave() block to fix this. Using the same compiled regex from different threads at the same time seems to work fine fortunately (according to the GLib docs at least) so we don't need any mutexes at least :)
Note that using g_once_init_enter/leave for pointers is no problem although it takes gsize as parameter. gsize should be of the same size as a pointer.
So the following initialization would be thread safe: static GRegex* _safe_regex_init (GRegex* re, gchar* pattern, GRegexMatchFlags match_options) { if (g_once_init_enter ((volatile gsize*) &re)) { GRegex* val = g_regex_new (pattern, match_options, 0, NULL); g_once_init_leave ((volatile gsize *) &re, (gsize) val); } return re; } And then in the code where regex is used: _tmp0_regex0 = _safe_regex_init (_tmp_regex_0, "myregex", 0) I'll try to create a patch to generate it this way.
No, that won't work this way. This is better: static inline GRegex* _safe_regex_init (GRegex** re, const gchar* pattern, GRegexMatchFlags match_options) { if (g_once_init_enter ((volatile gsize*) *re)) { GRegex* val = g_regex_new (pattern, match_options, 0, NULL); g_once_init_leave ((volatile gsize *) *re, (gsize) val); } return *re; } _tmp_regex_0 = _safe_regex_init (&_tmp_regex_0_storage, "myregex", 0); Note that you should never assign a new value to the storage of the GRegex after the initialization because setting a pointer could be non-atomic on some architectures!
This seems to work. The upper one seg.faults. static inline GRegex* _thread_safe_regex_init (GRegex** re, const gchar* pattern, GRegexMatchFlags match_options) { if (g_once_init_enter ((volatile gsize*) re)) { GRegex* val = g_regex_new (pattern, match_options, 0, NULL); g_once_init_leave ((volatile gsize *) re, (gsize) val); } return *re; }
Err, yes. Your last version is correct :)
Created attachment 157113 [details] [review] Thread safe regex literal initialization (partial) The attached patch should fix the problem once I found out (or someone gives me a hint) how to append the _thread_safe_regex_init function to the C code.
Created attachment 157146 [details] [review] Regular expression thread safety fix This fixes the issue. I created a new class for raw C code nodes.
I think it's preferred to add the function not as a single blob but by using the CCode stuff for all parts of the function. This way you automatically get correct indention and everything.
Classes to look at are CCodeFunction (for the function declaration including return type and parameters and flags... and the body), CCodeConditionalExpression, CCodeFunctionCall, CCodeAssignment. Look at codegen/valatyperegisterfunction.vala for an example of a g_once_init_enter/leave block.
Created attachment 157159 [details] [review] Thread safe regular expression literals How about this?
Yes, the generated code looks good and your patch looks good to me too :) IIRC there's something for CCodeIdentifier ("GRegex* val"), CCodeVariableDeclaration or something and something else for casts too and pointer dereferences but I don't think it's that important.
Oh another thing, you might want to store the compiled GRegex in a static variable inside the function scope instead of making it global. That might make it easier to prevent identifier collisions.
commit 0ea081ce7c654fe2fc88709884ec612f3d815687 Author: Jukka-Pekka Iivonen <jp0409@jippii.fi> Date: Wed Mar 31 21:38:01 2010 +0200 Use thread-safe initialization for regular expression literals Fixes bug 613943.