GNOME Bugzilla – Bug 687600
gfileutils.c performs invalid cast of (varargs) open to non-vararg type
Last modified: 2012-11-13 04:29:17 UTC
g_mkstemp_full, get_tmp_file related funcs pass around a function pointer of type GTmpFileCallback which points to the libc open func, and subsequently call it. #define g_open open typedef gint (*GTmpFileCallback) (gchar *, gint, gint); but POSIX says: int open(const char *path, int oflag, ... ); and this is also how musl libc implements it. glibc possibly (at least according to man 2) defines it to int open(const char *pathname, int flags, mode_t mode); so it happens to work. this causes heisenbugs (segfaults, aborts, ...) when calling get_file_name on musl-based (and probably any non-glibc) x86_64 platforms heres a single-step session: http://sprunge.us/HAdc , see how the code jumps from open() to the unrelated fcntl() (this is not in the source code: http://git.musl-libc.org/cgit/musl/tree/src/fcntl/open.c ) this is the code generated by gcc 4.5.4 for open: http://sprunge.us/YdSg it generates prologue code for the varargs, and then jumps to rax. depending on which bits were set in al, it usually jumps to the wrong place. the best solution for this bug would be a wrapper funtion that takes gchar*, gint, gint, and then calls open(a, b, c) directly. gint g_open_wrapper(gchar* a, gint b, gint c) { return open(a, b, c); } #define g_open open would become #define g_open g_open_wrapper and there's one spot, g_mkstemp(), that doesnt use g_open, but open directly, that one needed to be fixed as well to use g_open.
i forgot to mention: it is UB (undefined behaviour) to call a variadic function as non-variadic
That's actually the most/only important part of the bug report: calling a function through a pointer to the wrong function type is undefined behavior. Calling a variadic function through a non-variadic function pointer is a particularly bad example of this, and one where the UB is likely to break on real implementations. As for why it breaks, on x86_64, non-variadic function-call ABI does not use RAX at all. Variadic functions, however, get called with AL (the low byte of RAX) containing the number of xmm-register-passed arguments, to allow the compiler to optimize out saving all 8 registers in the va_list structure (a dubious optimization anyway since the branch probably costs nearly as much as the saving would). When a variadic function is called with junk in AL (which will happen if the caller doesn't know it's variadic), the prologue code GCC generates in the variadic function uses AL as a jump offset and ends up jumping somewhere in the 1k of code preceding the variadic function's prologue. This can lead to Very Bad Things.
as far as I can see glibc declares open as variadic as well
The declaration in glibc is obviously variadic; this is required and would break many programs if wrong. However, what's unclear is whether the implementation inside glibc is variadic. If it were implemented in assembly, it certainly would not have this horrible variadic prologue code; it would just use the argument registers (actually, gcc should be doing the same for variadic C functions that don't pass the va_list to other functions; the code it's generating now is highly suboptimal). It's also possible (this would not be *portable*, but glibc doesn't have to be portable, as long as it works on the systems they support) that they implement it with a non-variadic function (which would be more efficient) and use the fact that the calling conventions allow calling a non-variadic function as if it were variadic.
Your proposed fixes sound fine, btw. A patch would be welcome
Created attachment 228073 [details] [review] proposed fix if the gchar* casts are to noisy, the alternative is to make the typedef use const char* instead. however this would mean changing the prototype of the windows version as well, and this could lead to other warnings.
Review of attachment 228073 [details] [review]: We can't change the public API of g_open() in this way. Your many changes to GLib are only the tip of the iceberg in terms of how many users would be impacted by this. Why not just use the wrapper from the one user of the API internally and not modify the public g_open() ?
the public API of g_open has not changed. i did not modify *g_open()*, but the MACRO g_open *iff* the macro was used, the compiler would have used the open prototype from the libc, which uses const char*. in that case, no warnings were emitted. if the macro was not used (like on windows), the g_open() prototype was used which is not const, thus you would have exactly these warnings that i now muted. feel free to remove the casts. if i follow your advice, only the single direct open call is fixed, all other instances of g_open usage will stay broken when they are used in conjunction with the GTMpFileClallback thing.
note that when the macro #define g_open open was used, the prototype would mismatch with the non-macro-ified version in 2 ways. int open (const char*, int, ...); vs gint g_open(gchar*, gint, gint);
Created attachment 228160 [details] [review] Don't call varargs open() through non-varargs type open() is probably defined varargs. Casting a varargs function to an equivalent non-varargs type and then calling it is undefined, but gfileutils.c was doing exactly that. Add some non-varargs wrappers to avoid the problem. Problem reported by John Spencer.
I think this attachment will fix the problem you describe without making any public changes. I'm not concerned about other uses of the GTmpFileCallback thing that you mention because that's a private typedef only used with internal functions. Does this address the issue for you?
Review of attachment 228160 [details] [review]: that'll work as long as the #define g_open open is not used as a function pointer with g_open() (function) semantics in other places. since the g_open *function* used under some circumstances (#ifdef'd code, probably active on windows) has different semantics than system open, this macro seems like a Bad Idea. code using the g_open macro should be changed everywhere to call the wrapper. the wrapper itself could use const char* to not cause the warnings. note that code using the g_open function will still have these const warnings, unless you change the public prototype to take const gchar*. so now, the cleanest solution appears to be: - the macro has to disappear. - instead implement g_open(gchar*,gint,gint) on every platform. in the first #ifdef code, it's implementation is a simple open wrapper, in the second branch it should do whatever it did already.
Created attachment 228173 [details] [review] as proposed in the latest comment
(In reply to comment #9) > note that when the macro > > #define g_open open > > was used, the prototype would mismatch with the non-macro-ified version in 2 > ways. > > int open (const char*, int, ...); > > vs > > gint g_open(gchar*, gint, gint); g_open actually uses const gchar*, only the function pointer does not. so the function pointer typedef should add const. this should be added to my latest patch for completeness.
Created attachment 228176 [details] [review] latest patch, including the typedef (const) change
I'm not yet convinced that this is a problem in the general case. glib has quite a lot of functions that are sometimes macros...
The practice of using macros in general is not the problem. The problem is calling a function through a pointer to the wrong function type. This results in undefined behavior, and in the specific case of x86_64, the undefined behavior could very well lead to invalid or arbitrary code execution due to the way variadic functions work. It needs to be fixed.
(In reply to comment #16) > I'm not yet convinced that this is a problem in the general case. glib has > quite a lot of functions that are sometimes macros... yes, and the problem here is that the macro points to a function with a different prototype as the one used by g_open (the function, which gets compiled anyway, and is exactly the open wrapper we want and need on any non-windows platform). since the function exists already, i see no reason not to use it, instead of doing questionable macro tricks. my patch fixes the issue once and for all. otoh your patch still has the UB on all functions using "(GTmpFileCallback) g_open" (2 occurences in gfileutils.c). additionally the bug will appear again if someone decides to introduce a similar callback typedef in another translation unit.
Attachment 228160 [details] pushed as b26fb3a - Don't call varargs open() through non-varargs type After chatting about this on IRC, John ACKed that my patch fixes the issue as reported. On the topic of the theoretical problems caused by having g_open() that is sometimes a macro and sometimes a function, there are arguments on both sides: - g_open as a possibly-varargs macro can be dangerous for future problems similar to the one fixed here - g_open as a wrapper function can be dangerous for platforms that define oddly-sized/typed flags or mode arguments One thing that is clear though is that the proposal to always wrap g_open() with a non-varargs function would break the case where people call: g_open (filename, O_RDONLY); which is an unacceptable API break. We should probably reevaluate the meaning of gstdio.h in a greater sense at some future date that we're feeling like doing API/ABI breaks.
Created attachment 228307 [details] [review] alternative fix based on owen's idea as discussed on IRC this fixes 2 issues: - the UB resulting from the function pointer mismatch - the mismatch on windows, where 2-args g_open calls didnt work, while on UNIX they did
ping. ryan, you said on IRC that you would apply a patch based on owen's idea, if i would come up with one. that was already after you applied your earlier patch. getting it done right including testing took me nearly an hour, so please apply as agreed.