GNOME Bugzilla – Bug 674446
g_malloc terminates the program on failure
Last modified: 2014-09-12 20:55:20 UTC
Since all of GLib uses g_malloc for allocation, and since every library built on GLib (including GTK+) must directly or indirectly call g_malloc, it is impossible to write a safe, robust application using GLib or any library that depends on it. I know this behavior was an intentional design decision, but that does not make it any less of a critical bug.

Obviously g_malloc itself cannot be changed, since existing programs may rely on the fact that it never returns failure. Instead, g_malloc should be deprecated and a replacement that does not abort on failure added in its place; all internal code in GLib and GLib-dependent libraries should then be converted to the replacement and made to handle failure safely, backing out any partially completed operations and returning an error to the caller with the program left in a consistent state.
You can use g_try_malloc if you want to handle out-of-memory situations. GLib and GTK+ do not want to handle those.
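For reference, a minimal sketch of that pattern (g_try_malloc() and g_free() are real GLib API; the surrounding program is illustrative):

    /* Minimal sketch: g_try_malloc() returns NULL on failure instead of
     * aborting, so the caller can back out and report an error. */
    #include <stdio.h>
    #include <glib.h>

    int main(void)
    {
        gsize n = 1024;
        guchar *buf = g_try_malloc(n);

        if (buf == NULL) {
            /* Allocation failed: recover instead of crashing. */
            fprintf(stderr, "out of memory allocating %" G_GSIZE_FORMAT " bytes\n", n);
            return 1;
        }
        /* ... use buf ... */
        g_free(buf);
        return 0;
    }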
Then that's a major bug: it means it's impossible to write a safe program using GLib/GTK+. At any moment the library could crash your program and lose all of its data.

And since the SIGABRT is raised from inside a GLib function (the function that called g_malloc), any GLib data structures that function was manipulating may be left in an inconsistent state. So even though the signal is a synchronous signal rather than an asynchronous one, it may be unsafe to access any GLib data structures from the handler to emergency-save the data stored in them, and it is almost certainly unsafe to longjmp out of the signal handler, because the object being modified would be left in a partially modified state.

Perhaps there's one safe approach applications could take: start a new thread every time you need to call GLib, hold a robust mutex on the data structures that will be passed to GLib functions, and install a SIGABRT handler that simply calls pthread_exit(). This at least prevents access to data left in a corrupt state (attempts to lock it will return EOWNERDEAD), but it hardly seems practical... See the sketch below.
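A rough sketch of that workaround, to make it concrete. The names worker(), on_abort() and shared_lock are illustrative, not any real API, and note the caveat that pthread_exit() is not formally async-signal-safe:

    /* Rough sketch of the workaround described above: run each GLib call
     * in a worker thread that holds a robust mutex, and turn SIGABRT into
     * pthread_exit() so only the worker dies.  The main thread then sees
     * EOWNERDEAD on the abandoned lock instead of touching corrupt data.
     * Caveat: pthread_exit() is not formally async-signal-safe. */
    #include <errno.h>
    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>
    #include <glib.h>

    static pthread_mutex_t shared_lock;     /* guards data handed to GLib */

    static void on_abort(int sig)
    {
        (void)sig;
        pthread_exit(NULL);                 /* kill this thread, not the process */
    }

    static void *worker(void *arg)
    {
        (void)arg;
        signal(SIGABRT, on_abort);          /* disposition is process-wide, but
                                               abort() delivers to this thread */
        pthread_mutex_lock(&shared_lock);

        /* Any GLib call here may abort on allocation failure. */
        gchar *s = g_strdup("hello");
        g_free(s);

        pthread_mutex_unlock(&shared_lock);
        return NULL;
    }

    int main(void)
    {
        pthread_mutexattr_t attr;
        pthread_t t;
        int rc;

        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        pthread_mutex_init(&shared_lock, &attr);

        pthread_create(&t, NULL, worker, NULL);
        pthread_join(t, NULL);

        rc = pthread_mutex_lock(&shared_lock);
        if (rc == EOWNERDEAD) {
            /* The worker died holding the lock: treat the data as suspect. */
            fprintf(stderr, "GLib aborted; shared state may be inconsistent\n");
            pthread_mutex_consistent(&shared_lock);   /* or discard the data */
        }
        pthread_mutex_unlock(&shared_lock);
        return 0;
    }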
Handling NULL returns from malloc() will in no way protect you from out-of-memory situations. Because the kernel overcommits memory allocations, you are far more likely to see the allocation succeed without backing store and only receive a SIGKILL later, when you actually go to write something into that memory. No amount of checking will save you from that...
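To illustrate the mechanism, a sketch of what that failure mode looks like, assuming overcommit is enabled (the chunk sizes and totals are arbitrary; don't run this on a machine you care about):

    /* Sketch of the failure mode described above.  With Linux overcommit
     * enabled, the allocations below can succeed with no backing store;
     * the process is only killed later -- by the OOM killer, with
     * SIGKILL -- once the pages are actually written.
     * WARNING: actually running this can make a machine thrash. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define CHUNK   (256u * 1024 * 1024)   /* 256 MiB per allocation */
    #define NCHUNKS 256                    /* 64 GiB total: adjust to exceed RAM+swap */

    int main(void)
    {
        char *chunks[NCHUNKS];
        int i;

        for (i = 0; i < NCHUNKS; i++) {
            chunks[i] = malloc(CHUNK);     /* likely succeeds: no pages faulted yet */
            if (chunks[i] == NULL) {
                printf("malloc failed honestly at chunk %d\n", i);
                return 1;
            }
        }
        puts("all allocations 'succeeded'; now writing to them...");
        for (i = 0; i < NCHUNKS; i++)
            memset(chunks[i], 1, CHUNK);   /* SIGKILL may arrive here, unannounced */
        puts("survived (overcommit is probably disabled)");
        return 0;
    }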
Please spare us the oft-repeated fallacy. A properly configured system does not overcommit, and "broken systems will break anyway" is not an excuse to write software that breaks on non-broken systems. In case you're unaware, overcommit on Linux is disabled via:

    echo "2" > /proc/sys/vm/overcommit_memory

Moreover, even with overcommit enabled, on 32-bit machines it's much more likely that you'll exhaust your ADDRESS SPACE (less than 3 GB) before you exhaust physical memory plus swap (likely at least 4-8 GB, and perhaps hundreds of GB). And in the case of address space exhaustion, malloc must return NULL. A sketch of that point follows.
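A small sketch of the address-space argument, assuming a 32-bit process (the chunk size is arbitrary):

    /* Reserve address space until malloc() itself fails.  In a 32-bit
     * process this returns NULL well before physical memory matters,
     * regardless of the overcommit setting, because the pages are never
     * written and so never faulted in. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t chunk = 64 * 1024 * 1024;   /* 64 MiB per request */
        size_t total = 0;

        while (malloc(chunk) != NULL)      /* leaked on purpose: only reserving */
            total += chunk;
        printf("malloc returned NULL after reserving %zu MiB\n",
               total / (1024 * 1024));
        return 0;
    }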