After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 698716 - Use of g_mem_set_vtable() breaks after gobject automatic init ctor
Use of g_mem_set_vtable() breaks after gobject automatic init ctor
Status: RESOLVED NOTGNOME
Product: glib
Classification: Platform
Component: build
2.36.x
Other Linux
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2013-04-24 09:23 UTC by Andrey Vihrov
Modified: 2013-05-22 00:31 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
0001-configure-Assume-C90-compatible-malloc-prototype.patch (4.11 KB, patch)
2013-04-26 14:35 UTC, Colin Walters
reviewed Details | Review
0001-configure-Assume-C90-compatible-malloc-prototype.patch (6.14 KB, patch)
2013-04-26 19:32 UTC, Colin Walters
accepted-commit_now Details | Review

Description Andrey Vihrov 2013-04-24 09:23:16 UTC
Steps to reproduce:

Download a Glib 2.36.1 tarball and ./configure it with GCC 4.8.0 and CFLAGS="-Wall". Then config.log contains

configure:21628: checking if malloc() and friends prototypes are gmem.h compatible
configure:21654: gcc -c -Wall -Werror  conftest.c >&5
conftest.c: In function 'main':
conftest.c:66:13: error: variable 'my_realloc_p' set but not used [-Werror=unused-but-set-variable]
     void* (*my_realloc_p) (void*, size_t)  = realloc;
             ^
conftest.c:65:13: error: variable 'my_free_p' set but not used [-Werror=unused-but-set-variable]
     void  (*my_free_p)    (void*)          = free;
             ^
conftest.c:64:13: error: variable 'my_malloc_p' set but not used [-Werror=unused-but-set-variable]
     void* (*my_malloc_p)  (size_t)         = malloc;
             ^
conftest.c:63:13: error: variable 'my_calloc_p' set but not used [-Werror=unused-but-set-variable]
     void* (*my_calloc_p)  (size_t, size_t) = calloc;
             ^
cc1: all warnings being treated as errors
configure:21654: $? = 1
configure: failed program was:
...

instead of the correct (output of ./configure with default CFLAGS)

configure:21628: checking if malloc() and friends prototypes are gmem.h compatible
configure:21654: gcc -c -g -O2 -Werror  conftest.c >&5
configure:21654: $? = 0
configure:21663: result: yes

As a result, the macro SANE_MALLOC_PROTOS gets incorrectly undefined in config.h.
Comment 1 Colin Walters 2013-04-24 21:44:17 UTC
Yeah, this check sucks.   It dates back over 12 years to https://git.gnome.org/browse/glib/commit/?id=782a8e2e7c69c3d98bd69bcfdbb65ded520576f4

GLib uses -Wall by default, so you can work around this by simply not passing it.  But maybe we should try detecting known platforms (glibc e.g.) and assuming it's OK.
Comment 2 Dan Winship 2013-04-25 13:44:24 UTC
This is just checking that malloc(), realloc(), calloc(), and free() have the prototypes specified by C90... I think we can probably just ditch the check.
Comment 3 Andrey Vihrov 2013-04-25 13:53:57 UTC
Note that I just wish to point out that enabling or disabling compiler warnings should not change the behavior of the software (thus, -Werror isn't good). There's little benefit in enabling -Wall when just building the software for production use, of course. But this bug report comes from an anecdotal situation, where Glib 2.36.1 was crashing in Arch Linux and someone observed that simply rebuilding it with "-Wall" fixes the crash. The real difference was SANE_MALLOC_PROTOS being defined or not :-)

Reference: https://bugs.archlinux.org/task/34630
Comment 4 Colin Walters 2013-04-26 14:35:16 UTC
Created attachment 242572 [details] [review]
0001-configure-Assume-C90-compatible-malloc-prototype.patch
Comment 5 Dan Winship 2013-04-26 14:51:12 UTC
Comment on attachment 242572 [details] [review]
0001-configure-Assume-C90-compatible-malloc-prototype.patch

>  * REALLOC_0_WORKS is defined if g_realloc (NULL, x) works.

oh, I guess we don't need this any more either. C90 says "If ptr is a null pointer, the realloc function behaves like the malloc function for the specified size."

>+#define standard_malloc	malloc

so we can get rid of all the standard_* defines too
Comment 6 Colin Walters 2013-04-26 19:32:36 UTC
Created attachment 242615 [details] [review]
0001-configure-Assume-C90-compatible-malloc-prototype.patch

Now with even more code deleted! \o/
Comment 7 Dan Winship 2013-04-26 20:17:57 UTC
Comment on attachment 242615 [details] [review]
0001-configure-Assume-C90-compatible-malloc-prototype.patch

great
Comment 9 Bernhard Beschow 2013-04-27 15:35:06 UTC
This commit reintroduces the bug from Archlinux Andrey mentioned above. How should this bug be fixed?
Comment 10 Colin Walters 2013-04-27 15:44:55 UTC
The relevant comment appears to be this one: https://bugs.archlinux.org/task/34630#comment108896

I'll have to think about this.
Comment 11 Dan Winship 2013-04-27 16:18:05 UTC
what flags does arch compile glib with?
Comment 12 John Lindgren 2013-04-27 18:27:11 UTC
@Colin: It seems like using g_malloc() from __attribute__((constructor)) is a bad idea in the first place because of g_mem_set_vtable().  The documentation for g_mem_set_vtable() says, "This function must be called before using any other GLib functions."  But an __attribute__((constructor)) function gets called even before main(), so in order to use g_mem_set_vtable() safely, an application would have refrain from linking to libgobject.so and instead dlopen() it after calling g_mem_set_vtable().
Comment 13 John Lindgren 2013-04-27 18:35:50 UTC
@Dan: None, until they added -Wall as a workaround for this bug.
https://projects.archlinux.org/svntogit/community.git/plain/trunk/PKGBUILD?h=packages/lib32-glib2
Comment 14 Andrey Vihrov 2013-04-27 19:49:51 UTC
Jonh: Arch packages are compiled using the default values from /etc/makepkg.conf. As of now, for x86-64 these are

CPPFLAGS="-D_FORTIFY_SOURCE=2"
CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector --param=ssp-buffer-size=4"
CXXFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector --param=ssp-buffer-size=4"
LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro"
Comment 15 Colin Walters 2013-04-29 12:56:39 UTC
Comment #12 sounds right; so https://git.gnome.org/browse/glib/commit/?id=7c42ab23b55c43ab96d0ac2124b550bf1f49c1ec from 2.36 broke users of g_mem_set_vtable that were also linking to libgobject.

What to do about it?  I'm not sure we could just ignore calls to g_mem_set_vtable because the application could assume its own malloc functions were in place, and thus might freely intermix g_free()/myapp_free().

If we can't solve this glib side, we should at least strongly deprecate g_mem_set_vtable(); while theoretically there could be apps that only link to libglib that use it, in practice I doubt their existence.

Ryan, any opinions here?
Comment 16 Allison Karlitskaya (desrt) 2013-04-29 15:36:17 UTC
The only way that I think we can fix this is by advertising the name of a symbol (like "glib_get_mem_config" that you can have in your executable.  If we dlsym() and find it, we let you change the memory config using it.

Otherwise, we probably do want to deprecate it.


I really want to avoid the idea that it's valid to call g_malloc() without calling g_free() (or vice versa).  In the future, in context of our valgrind-friendly bug, I want to be able to switch on malloc/free counting so we can run our testcases outside of valgrind and still catch leaks.
Comment 17 Colin Walters 2013-04-29 16:46:14 UTC
Ok, per:
https://bugs.archlinux.org/task/34630#comment108896
it sounds like Skype is not using g_mem_set_vtable(); it's likely using LD_PRELOAD to override malloc(), or including a definition in its main DSO.

"As it happens, this bug only affects the pointer to malloc(), not the wrapper function; that's why the crash goes away if SANE_MALLOC_PROTOS is not defined."

That indeed sounds very bizarre; I suspect it's something like the dynamic linker runs over data sections before text and processing constructors.  So when the malloc pointer happens to live in a static structure, it gets processed first.

Regardless though, this seems like a generic incompatibility between GLib's use of a constructor and applications that want to override malloc() - both via symbol overrides and g_mem_set_vtable().
Comment 18 John Lindgren 2013-04-30 05:10:13 UTC
> it sounds like Skype is not using g_mem_set_vtable(); it's likely using
> LD_PRELOAD to override malloc(), or including a definition in its main DSO.

More specifically, the Skype binary contains an invalid PLT, which is initialized by some startup code in Skype after shared library initialization.  (This behavior seems to be unique to Skype; a test application that I linked with GCC 4.8.0 produced a valid PLT and did not modify it during startup.)  ld-linux.so apparently assumes that an application's PLT is always valid and that it's okay to use it to resolve symbols in shared libraries also.

Anyway, the reason it works without SANE_MALLOC_PROTOS is simply that GCC can optimize out the PLT relocation for a direct function call but not for a stored function pointer.
Comment 19 Colin Walters 2013-04-30 14:00:04 UTC
(In reply to comment #18)

> More specifically, the Skype binary contains an invalid PLT, which is
> initialized by some startup code in Skype after shared library initialization. 
> (This behavior seems to be unique to Skype; 

Thanks for debugging and explaning this, John!  So there really are two separate bugs here.  First is what we should do about g_mem_set_vtable() - I jumped to the conclusion Skype was using it, but apparently not.

Second is whether we should carry a workaround for the Skype PLT breakage GLib side.  I'd lean towards no.  Unless there were multiple applications affected (say a known broken toolchain).
Comment 20 Dan Winship 2013-04-30 14:29:19 UTC
(In reply to comment #19)
> Thanks for debugging and explaning this, John!  So there really are two
> separate bugs here.  First is what we should do about g_mem_set_vtable()

This is more-or-less the same as bug 687763.
Comment 21 Iain Lane 2013-05-17 08:47:03 UTC
Interesting bug. Has someone reported the issue to Skype? Do you have a recommendation for distros to work around it?
Comment 22 Iain Lane 2013-05-17 11:44:28 UTC
A Skype developer friend reported it internally with issue ID LINUX-1675. I don't think there's a way for Joe Public to view these though.
Comment 23 Dan Winship 2013-05-17 12:07:57 UTC
(In reply to comment #22)
> A Skype developer friend reported it internally with issue ID LINUX-1675. I
> don't think there's a way for Joe Public to view these though.

Knowing exactly what it is that's going wrong, and if/when they expect to fix it would help us figure out how/if we should try to work around it on the glib side.
Comment 24 Iain Lane 2013-05-17 12:14:50 UTC
(In reply to comment #23)
> Knowing exactly what it is that's going wrong, and if/when they expect to fix
> it would help us figure out how/if we should try to work around it on the glib
> side.

Sure. They have a link to this bug so might reply directly. If I hear anything then I'll reply too.
Comment 25 John Lindgren 2013-05-17 22:50:04 UTC
(In reply to comment #21)
> Interesting bug. Has someone reported the issue to Skype? Do you have a
> recommendation for distros to work around it?

In the 2.36 branch, the easiest workaround I know of is to patch gmem.c changing change "#ifdef SANE_MALLOC_PROTOS" to "#if 0".  I don't know of any workaround for glib master since https://git.gnome.org/browse/glib/commit/glib/gmem.c?id=518e3104bf6cdb5d8e6b43d3b721805db5951139.
Comment 26 Gunnar Hjalmarsson 2013-05-21 23:16:51 UTC
Skype for Linux 4.2 has been released:
http://blogs.skype.com/2013/05/20/skype-for-linux-4-2/

The new Skype version seems to work fine with the current glib 2.37.
Comment 27 John Lindgren 2013-05-22 00:28:45 UTC
Skype 4.2 works for me also.
Comment 28 Colin Walters 2013-05-22 00:31:31 UTC
Ok, closing here as I can't think of a sane workaround GLib side, and it's hopefully easy enough for people to update Skype.  If someone comes up with a not-too-gross workaround for GLib, feel free to reopen.