Bug 674446 - g_malloc terminates the program on failure
Status: RESOLVED NOTABUG
Product: glib
Classification: Platform
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Normal critical
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Reported: 2012-04-20 02:38 UTC by bugdal
Modified: 2014-09-12 20:55 UTC
Description bugdal 2012-04-20 02:38:26 UTC
Since all of glib uses g_malloc for allocation, and since all libraries that use glib (including gtk) must directly or indirectly call g_malloc, it is completely impossible to write a safe, robust application using glib or any such library dependent on glib.

I know this was an intentional design, but that does not make it any less of a critical bug.

Obviously g_malloc cannot be changed, since programs may rely on the fact that it never returns failure. Instead, g_malloc should be deprecated and a replacement function that does not abort on failure should be added in its place. All internal code in glib and glib-dependent libraries should then be fixed to use the replacement function and to handle failure safely: backing out all partially completed operations and returning an error to the caller, with the program left in a consistent state.
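
For illustration, the "back out and return an error" pattern requested above could look roughly like the following plain C sketch. It does not use GLib: pair_t, try_alloc, try_strdup, pair_new, pair_free and fail_countdown are all hypothetical names invented here, and the countdown hook exists only so that an allocation failure can be simulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, used only for this illustration. */
typedef struct {
    char *name;
    char *value;
} pair_t;

/* Hypothetical failure-injection hook: with a value of N >= 0, the next N
 * allocations succeed and every later one fails; a negative value disables
 * the simulation. */
int fail_countdown = -1;

/* Allocator that reports failure to the caller instead of aborting. */
void *try_alloc(size_t n)
{
    if (fail_countdown == 0)
        return NULL;            /* simulated out-of-memory */
    if (fail_countdown > 0)
        fail_countdown--;
    return malloc(n);
}

/* Copies a string with try_alloc; returns NULL on allocation failure. */
char *try_strdup(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = try_alloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Builds a pair from two strings. On any allocation failure it frees what
 * was already allocated, leaves *out untouched, and returns false. */
bool pair_new(pair_t **out, const char *name, const char *value)
{
    assert(out != NULL && name != NULL && value != NULL);

    pair_t *p = try_alloc(sizeof *p);
    if (p == NULL)
        return false;

    p->name = try_strdup(name);
    if (p->name == NULL) {
        free(p);                /* back out step 1 */
        return false;
    }

    p->value = try_strdup(value);
    if (p->value == NULL) {
        free(p->name);          /* back out step 2 */
        free(p);                /* back out step 1 */
        return false;
    }

    *out = p;                   /* publish only once fully constructed */
    return true;
}

/* Releases a pair created by pair_new; accepts NULL. */
void pair_free(pair_t *p)
{
    if (p != NULL) {
        free(p->name);
        free(p->value);
        free(p);
    }
}
```

The key design choice in this sketch is that the caller's pointer is written only after every allocation has succeeded, so a failed call leaves the caller's state exactly as it was.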
Comment 1 Matthias Clasen 2012-04-20 02:40:46 UTC
You can use g_try_malloc if you want to handle out-of-memory situations. 
GLib and GTK+ do not want to handle those.
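
To make the two contracts being contrasted here concrete, below is a minimal plain C sketch. It deliberately does not call GLib: alloc_or_abort and alloc_or_null are hypothetical stand-ins for the "abort on failure" and "return NULL on failure" behaviours described in this thread, and make_buffer is an invented caller that handles the NULL case.

```c
#include <assert.h>
#include <stdint.h>   /* SIZE_MAX, referenced in the usage note */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the abort-on-failure contract: for n > 0 this
 * either returns a usable pointer or terminates the process. */
void *alloc_or_abort(size_t n)
{
    void *p = malloc(n);
    if (p == NULL && n != 0) {
        fprintf(stderr, "failed to allocate %zu bytes\n", n);
        abort();                /* raises SIGABRT; the caller never sees NULL */
    }
    return p;
}

/* Hypothetical stand-in for the try contract: failure is reported as NULL
 * and the caller decides what to do. */
void *alloc_or_null(size_t n)
{
    return malloc(n);
}

/* Returns a zero-filled buffer of n bytes, or NULL if it could not be
 * allocated. The caller owns the buffer and must free() it. */
unsigned char *make_buffer(size_t n)
{
    assert(n > 0);
    unsigned char *buf = alloc_or_null(n);
    if (buf == NULL)
        return NULL;            /* propagate the failure instead of aborting */
    memset(buf, 0, n);
    return buf;
}
```

As a usage note, a call such as make_buffer(SIZE_MAX) would be expected to return NULL with common C library allocators, since no address space can satisfy the request, whereas alloc_or_abort(SIZE_MAX) would terminate the process.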
Comment 2 bugdal 2012-04-20 02:52:27 UTC
Then that's a major bug. It means it's impossible to write a safe program using glib/gtk. At any moment the library could crash your program and cause it to lose all its data. And since the SIGABRT is raised from inside a glib function (the function that called g_malloc), any glib data structures that function was manipulating may be in an inconsistent state. Thus, even though the signal is synchronous rather than asynchronous, it may be unsafe to access any glib data structures to emergency-save data stored in them. It's almost certainly unsafe to longjmp out of the signal handler, because the object being modified will be left in a partially modified state.

Perhaps there's one safe approach applications could take: start a new thread every time you need to call glib, hold a robust mutex on the data structures that will be passed to glib functions, and install a SIGABRT handler that simply calls pthread_exit(). This will at least prevent access to data left in a corrupt state (and attempts to access the data will return EOWNERDEAD), but it hardly seems practical...
Comment 3 Allison Karlitskaya (desrt) 2012-04-20 18:38:18 UTC
Handling NULL returns from malloc() will in no way protect you from out-of-memory situations.  Due to the practice of overcommitting memory allocations in the kernel, you are far more likely to see the allocation succeed without backing store and only receive a SIGKILL later, when you actually go to write something into that memory.  No amount of checking will save you from that...
Comment 4 bugdal 2012-04-21 05:16:27 UTC
Please spare us the oft-repeated fallacy. Any properly configured system does not overcommit. "Broken systems will break anyway" is not an excuse to write software that breaks on non-broken systems.

In case you're unaware, overcommit on Linux is disabled via:

echo "2" > /proc/sys/vm/overcommit_memory

Moreover, even with overcommit enabled, it's much more likely (on 32-bit machines) that you'll exhaust your ADDRESS SPACE (less than 3GB) before you exhaust physical memory + swap (likely at least 4-8GB and perhaps hundreds of GB). And in the case of address space exhaustion, malloc must return NULL.
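
As a rough sketch of how a program might check which overcommit policy is in effect, the C code below reads the same procfs file that the echo command above writes to. The function names are hypothetical. The value meanings assumed here are the ones documented for the Linux kernel: 0 is heuristic overcommit, 1 is always overcommit, and 2 is strict accounting (overcommit disabled).

```c
#include <assert.h>
#include <stdio.h>

/* Parses the contents of overcommit_memory. Returns 0, 1 or 2, or -1 if the
 * text does not start with one of those values. */
int parse_overcommit_mode(const char *text)
{
    assert(text != NULL);
    int mode;
    if (sscanf(text, "%d", &mode) != 1 || mode < 0 || mode > 2)
        return -1;
    return mode;
}

/* Reads the current policy from procfs. Returns -1 if the file cannot be
 * read, for example on a non-Linux system. */
int read_overcommit_mode(void)
{
    char buf[16];
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (f == NULL)
        return -1;
    char *line = fgets(buf, sizeof buf, f);
    fclose(f);
    return line != NULL ? parse_overcommit_mode(line) : -1;
}

/* Maps a mode value to a short human-readable description. */
const char *overcommit_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "heuristic overcommit";
    case 1:  return "always overcommit";
    case 2:  return "strict accounting (no overcommit)";
    default: return "unknown";
    }
}
```

A program that relies on malloc returning NULL under memory pressure could call read_overcommit_mode() at startup and warn when the result is not 2.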