After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 733806 - [review] [th/bgo733806_clang] compile NM with clang, several fixes for bugs and compiler warnings
[review] [th/bgo733806_clang] compile NM with clang, several fixes for bugs a...
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: general
git master
Other Linux
: Normal normal
: ---
Assigned To: NetworkManager maintainer(s)
NetworkManager maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2014-07-27 03:24 UTC by Thomas Haller
Modified: 2014-08-01 17:14 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Thomas Haller 2014-07-27 03:24:33 UTC
I wanted to see if I can compile NM with clang, and it was pretty simple (on Fedora 20, clang 3.4-6.fc20). Some compiler warnings turned out to be real bugs.


How to reproduce:

=================================================

unset LD
export CC=clang
export CXX=clang++
export CFLAGS="-g -O0 -Wstrict-overflow"
export CXXFLAGS="-g -O0 -Wstrict-overflow"

git clean -fdx

./autogen.sh \
    --prefix=/opt/test \
    --sysconfdir=/etc \
    --enable-gtk-doc \
    --enable-more-warnings=error \
    --with-valgrind=yes \
    --enable-ifcfg-rh \
    --enable-ifcfg-suse \
    --enable-ifupdown \
    --enable-ifnet \
    --enable-vala=yes \
    --with-nmtui=yes \
    --with-modem-manager-1 \
    --with-suspend-resume=systemd \
    --enable-teamdctl=yes \
    --enable-tests=yes \
    --with-consolekit=yes \
    --with-systemd-logind=yes \
    --with-session-tracking=systemd

make

=================================================

note, that from glib, I needed commit 012011538f4513569f92e322c4377efa2cb64378 (which I had to work around on Fedora 20).



Strangely, building with gcc(!) seems significantly faster, but I didn't look into that.



After compiling with clang, test-remote-settings-client also crashed. Turned out to be a bug too:

> libnm-glib/test: fix crash in test-remote-settings-client
> libnm-glib/test: add asserts to test-remote-settings-client
Comment 1 Dan Winship 2014-07-28 14:31:04 UTC
> build/clang: fix detection of valid warning flags

> Fix the script to detect compiler warnings by passing
> -Werror, which lets clang fail too.

configure tends to generate really poor code for its tests, so compiling them with -Werror may fail because the code actually hits the warning you're testing for. Is there some other flag you could pass to clang that would make it error out on unknown arguments?



> all: add own macros to ignore deprecation warnings
> 
> Define NM_BEGIN_IGNORE_DEPRECATIONS and NM_END_IGNORE_DEPRECATIONS.
> The glib provided macros do not work for clang.

Really? Is there even a compatibility argument you can pass that will make it? It seems weird that no one would have reported this against GLib before...

Anyway, if this is needed, we should submit a patch to glib and then add wrappers to nm-glib-compat.h



> core: fix warning about comparing unsigned enum values being positive

why the (gsize) casts?



> platform: fix warning warning about unused variable

I feel like it would be clearer to explicitly unref the object in this case, but OK.

Also, "object" looks like it's cruft left over from some earlier change, and it could just be removed and everything could use "obj" instead.



> platform: fix warning initializing static array freq_policy

the static asserts seem indented too far



> core: fix warning calling non-returning g_error()

Blah. Should be reported as a bug against gcc so they can (eventually) improve the definition of g_error().

Also, g_assert_not_reached() would be better, since that's what we really mean.



> platform/test: fix warnings in test

likewise with g_assert_not_reached()



> cli: remove unused static struct nmc_mobile_settings

This seems like it's a bug in nmcli; it's probably supposed to be showing those settings.



> cli: fix warning in parse_output_fields() about strict-overflow

>+			if (FALSE) {
>+found:

That's confusing... isn't there some other way of doing that?
Comment 2 Thomas Haller 2014-07-28 19:03:59 UTC
(In reply to comment #1)
> > build/clang: fix detection of valid warning flags
> 
> > Fix the script to detect compiler warnings by passing
> > -Werror, which lets clang fail too.
> 
> configure tends to generate really poor code for its tests, so compiling them
> with -Werror may fail because the code actually hits the warning you're testing
> for. Is there some other flag you could pass to clang that would make it error
> out on unknown arguments?

I added a fixup. I think that is the way to go and it seems to work. I guess the fixup! is better because without it, the detection will fail if the user defines:

  CFLAGS="... -Wno-unknown-warning-option"

Autotools basically is a bunch of hacked scripts, ... so, add another hack that seems to work :)


> > all: add own macros to ignore deprecation warnings
> > 
> > Define NM_BEGIN_IGNORE_DEPRECATIONS and NM_END_IGNORE_DEPRECATIONS.
> > The glib provided macros do not work for clang.
> 
> Really? Is there even a compatibility argument you can pass that will make it?
> It seems weird that no one would have reported this against GLib before...
> 
> Anyway, if this is needed, we should submit a patch to glib and then add
> wrappers to nm-glib-compat.h

I rewrote the patch to (just) add a compatibility wrapper instead.
See new commit:
 "all: add compatibility macros to ignore deprecation warnings for clang"

AFAIS, there is no other way. 


> > core: fix warning about comparing unsigned enum values being positive
> 
> why the (gsize) casts?

in the case that the enum is actually a signed type (either because we add a negative value, or because another compiler makes it signed). In that case, the
cast would do the right thing, wouldn't it?

> > platform: fix warning warning about unused variable
> 
> I feel like it would be clearer to explicitly unref the object in this case,
> but OK.

I don't mind. How is the fixup commit?


> Also, "object" looks like it's cruft left over from some earlier change, and it
> could just be removed and everything could use "obj" instead.

addressed by fixup!



> > platform: fix warning initializing static array freq_policy
> 
> the static asserts seem indented too far

fixed

> > core: fix warning calling non-returning g_error()
> 
> Blah. Should be reported as a bug against gcc so they can (eventually) improve
> the definition of g_error().
> 
> Also, g_assert_not_reached() would be better, since that's what we really mean.

fixed

> > platform/test: fix warnings in test
> 
> likewise with g_assert_not_reached()

fixed


> > cli: remove unused static struct nmc_mobile_settings
> 
> This seems like it's a bug in nmcli; it's probably supposed to be showing those
> settings.

I dunno. There is no NMSettingMobile. What should that be?


> > cli: fix warning in parse_output_fields() about strict-overflow
> 
> >+			if (FALSE) {
> >+found:
> 
> That's confusing... isn't there some other way of doing that?

Do you prefer the fixup commit instead? (I do not, but I don't mind)
It uses guchar instead of gboolean.
Comment 3 Dan Winship 2014-07-28 20:27:24 UTC
(In reply to comment #2)
> I added a fixup.

Well... that makes it so that we don't use -Werror with gcc I guess, but it doesn't help things in the clang case. But if there's no other way to do it, then, *shrug*.

> > > all: add own macros to ignore deprecation warnings

> I rewrote the patch to (just) add a compatibility wrapper instead.

good, except you still refer to the NM_* names in the commit message.

> > > core: fix warning about comparing unsigned enum values being positive
> > 
> > why the (gsize) casts?
> 
> in the case that the enum is actually a signed type (either because we add a
> negative value, or because another compiler makes it signed). In that case, the
> cast would do the right thing, wouldn't it?

If it's signed but has no negative values then the cast is unnecessary, since the implicit cast to unsigned would do the right thing.

If we add a negative value to the enum... then we'd need the "state >= 0" check again anyway... Doh. (Note that gcc has -Wsign-conversion, which would also warn in this case, but it's disabled by default, probably because of stuff like this.)

> > > platform: fix warning warning about unused variable
> > 
> > I feel like it would be clearer to explicitly unref the object in this case,
> > but OK.
> 
> I don't mind. How is the fixup commit?

yup

> > > cli: remove unused static struct nmc_mobile_settings
> > 
> > This seems like it's a bug in nmcli; it's probably supposed to be showing those
> > settings.
> 
> I dunno. There is no NMSettingMobile. What should that be?

ah, indeed there are already separate gsm and cdma arrays. OK then.

> > > cli: fix warning in parse_output_fields() about strict-overflow
> > 
> > >+			if (FALSE) {
> > >+found:
> > 
> > That's confusing... isn't there some other way of doing that?
> 
> Do you prefer the fixup commit instead? (I do not, but I don't mind)
> It uses guchar instead of gboolean.

Does that actually fix the warning? There's no possibility for "signed overflow" of the "found" variable...

I assume it's talking about "i" or "j" overflowing, though I don't know why. (Any bugs that could be introduced by signed i or j overflowing could also occur if unsigned i or j overflowed...) I also don't understand the ifcfg-rh/reader.c changes in the previous patch...

The whole warning seems kind of flaky. In the ifcfg-rh/reader.c case, you fix it by using "break", but in parse_output_fields() it's not happy with that style either...

(Again, gcc apparently has the same warning, but disables it (or runs it at a lower level) by default.)
Comment 4 Thomas Haller 2014-07-28 21:21:00 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > I added a fixup.
> 
> Well... that makes it so that we don't use -Werror with gcc I guess, but it
> doesn't help things in the clang case. But if there's no other way to do it,
> then, *shrug*.

I think that autotools are a bunch of hacked up, ugly scripts. So, here there is another hack, that seems to work (unless it doesn't). Yes: *shrug* :)



> > > > all: add own macros to ignore deprecation warnings
> 
> > I rewrote the patch to (just) add a compatibility wrapper instead.
> 
> good, except you still refer to the NM_* names in the commit message.

done


> > > > core: fix warning about comparing unsigned enum values being positive
> > > 
> > > why the (gsize) casts?
> > 
> > in the case that the enum is actually a signed type (either because we add a
> > negative value, or because another compiler makes it signed). In that case, the
> > cast would do the right thing, wouldn't it?
> 
> If it's signed but has no negative values then the cast is unnecessary, since
> the implicit cast to unsigned would do the right thing.

This whole "if" is there to catch integer values that are not valid enum values. If we would be less defensive, we could drop the entire "if"

If we want to catch invalid enum values (a "bug"), we should also try catching negative values.

If the enum is actually unsigned, the cast has no effect.

If the enum happens to be signed (on another compiler), the cast should catch negative values.

Arguably, the second cast is not needed. I pushed a fixup, it is now:

-    if (state >= 0 && state < G_N_ELEMENTS (state_table))
+    if ((gsize) state < G_N_ELEMENTS (state_table))
          return state_table[state];

or how about:

+    G_STATIC_ASSERT ( ((NMDeviceStateReason) -1) > ((NMDeviceStateReason) 0) );
-    if (reason >= 0 && reason < G_N_ELEMENTS (reason_table))
+    if (reason < G_N_ELEMENTS (reason_table))
          return reason_table[reason];
     return reason_table[NM_DEVICE_STATE_REASON_UNKNOWN];


> If we add a negative value to the enum... then we'd need the "state >= 0" check
> again anyway... Doh. (Note that gcc has -Wsign-conversion, which would also
> warn in this case, but it's disabled by default, probably because of stuff like
> this.)

Other alternatively:

+++ w/m4/compiler_warnings.m4
@@ -32,3 +32,4 @@ if test "$GCC" = "yes" -a "$set_more_warnings" != "no"; then
                -Wpointer-arith -Winit-self \
-               -Wmissing-include-dirs -Waggregate-return; do
+               -Wmissing-include-dirs -Waggregate-return \
+               -Wno-tautological-compare; do
          SAVE_CFLAGS="$CFLAGS"

I dislike all options. But enums in C really suck (for example, if you have them in an Header file and you add a value, it can be an ABI break -- and whether it is an ABI break, is implementation defined...).



> > > > cli: fix warning in parse_output_fields() about strict-overflow
> > > 
> > > >+			if (FALSE) {
> > > >+found:
> > > 
> > > That's confusing... isn't there some other way of doing that?
> > 
> > Do you prefer the fixup commit instead? (I do not, but I don't mind)
> > It uses guchar instead of gboolean.
> 
> Does that actually fix the warning? There's no possibility for "signed
> overflow" of the "found" variable...

well. Don't tell me :) . Yes changing gboolean to unsigned avoids the warning.


> I assume it's talking about "i" or "j" overflowing, though I don't know why.
> (Any bugs that could be introduced by signed i or j overflowing could also
> occur if unsigned i or j overflowed...) I also don't understand the
> ifcfg-rh/reader.c changes in the previous patch...

signed integer overflow is undefined behaviour. So, having such an overflow is already the bug.


> The whole warning seems kind of flaky. In the ifcfg-rh/reader.c case, you fix
> it by using "break", but in parse_output_fields() it's not happy with that
> style either...

I think -O3 heavily unwraps these loops, and it does so differently in those cases.


> (Again, gcc apparently has the same warning, but disables it (or runs it at a
> lower level) by default.)

The last warnings do not happen with clang. They happen with gcc 4.8.3, compiled with CFLAGS="-O3" (-Wall implies -Wstrict-overflow).
Comment 5 Dan Winship 2014-07-29 16:54:37 UTC
(In reply to comment #1)
> > build/clang: fix detection of valid warning flags
> 
> > Fix the script to detect compiler warnings by passing
> > -Werror, which lets clang fail too.
> 
> configure tends to generate really poor code for its tests, so compiling them
> with -Werror may fail because the code actually hits the warning you're testing
> for. Is there some other flag you could pass to clang that would make it error
> out on unknown arguments?

Oh... You should be able to do "-Werror=unknown-warning-option" rather than "-Werror -Wunknown-warning-option".


(In reply to comment #2)
> > > all: add own macros to ignore deprecation warnings

> I rewrote the patch to (just) add a compatibility wrapper instead.

This leaves "core: add project-wide header file nm-utils-internal.h" unused again.


(In reply to comment #4)
> > > > > core: fix warning about comparing unsigned enum values being positive

> Other alternatively:
> 
> +++ w/m4/compiler_warnings.m4
> @@ -32,3 +32,4 @@ if test "$GCC" = "yes" -a "$set_more_warnings" != "no"; then
>                 -Wpointer-arith -Winit-self \
> -               -Wmissing-include-dirs -Waggregate-return; do
> +               -Wmissing-include-dirs -Waggregate-return \
> +               -Wno-tautological-compare; do

It appears that -Wtautological-compare did not catch anything other than this one place, where I feel like the warningful code is more correct than the warningless code, so I think this is my favorite option.


> > > > > cli: fix warning in parse_output_fields() about strict-overflow

> > Does that actually fix the warning? There's no possibility for "signed
> > overflow" of the "found" variable...
> 
> well. Don't tell me :) . Yes changing gboolean to unsigned avoids the warning.

As with the g_malloc0-vs-coverity thing, I'm opposed to randomly tweaking the code to get around a warning that seems to be demonstrably incorrect.

(Does changing "i" and/or "j" to guint but leaving "found" as gboolean make the warning go away?)
Comment 6 Thomas Haller 2014-07-29 19:08:57 UTC
(In reply to comment #5)
> (In reply to comment #1)
> > > build/clang: fix detection of valid warning flags
> > 
> > > Fix the script to detect compiler warnings by passing
> > > -Werror, which lets clang fail too.
> > 
> > configure tends to generate really poor code for its tests, so compiling them
> > with -Werror may fail because the code actually hits the warning you're testing
> > for. Is there some other flag you could pass to clang that would make it error
> > out on unknown arguments?
> 
> Oh... You should be able to do "-Werror=unknown-warning-option" rather than
> "-Werror -Wunknown-warning-option".

fixed

> (In reply to comment #2)
> > > > all: add own macros to ignore deprecation warnings
> 
> > I rewrote the patch to (just) add a compatibility wrapper instead.
> 
> This leaves "core: add project-wide header file nm-utils-internal.h" unused
> again.

commit removed


> (In reply to comment #4)
> > > > > > core: fix warning about comparing unsigned enum values being positive
> 
> > Other alternatively:
> > 
> > +++ w/m4/compiler_warnings.m4
> > @@ -32,3 +32,4 @@ if test "$GCC" = "yes" -a "$set_more_warnings" != "no"; then
> >                 -Wpointer-arith -Winit-self \
> > -               -Wmissing-include-dirs -Waggregate-return; do
> > +               -Wmissing-include-dirs -Waggregate-return \
> > +               -Wno-tautological-compare; do
> 
> It appears that -Wtautological-compare did not catch anything other than this
> one place, where I feel like the warningful code is more correct than the
> warningless code, so I think this is my favorite option.

hm, thinking about it again,... i dislike explicitly disable this warning, because (AFAIS) it is even enabled by default.

An alternative would be to add a dummy enum value -1, to force the enum always to be signed.

 
> > > > > > cli: fix warning in parse_output_fields() about strict-overflow
> 
> > > Does that actually fix the warning? There's no possibility for "signed
> > > overflow" of the "found" variable...
> > 
> > well. Don't tell me :) . Yes changing gboolean to unsigned avoids the warning.
> 
> As with the g_malloc0-vs-coverity thing, I'm opposed to randomly tweaking the
> code to get around a warning that seems to be demonstrably incorrect.
> 
> (Does changing "i" and/or "j" to guint but leaving "found" as gboolean make the
> warning go away?)

no, that doesn't avoid the warning.

I can't remember why, buy I always compile with -Wstrict-overflow.
With CFLAGS="-O3 -Wstrict-overflow", the warning occures.

The gcc man page seems to suggest that -Wall implies -Wstrict-overflow, so I was thinking that this is important. But there is no warning with CFLAGS="-O3 -Wall" alone.

So, that reduces the priority of this quite a lot.

I still would fix the code to avoid the warning, because there are only few place and I already did it. It is common that compiler warnings point out non-issues too, and then it is correct to tweak the code just to make the compiler happy.

It would also be great if we could just get going with the coverity fixes, because coverity is actually useful, but now it prints so many false positives that it is a pain to review them again.
Comment 7 Dan Winship 2014-07-31 14:03:17 UTC
(In reply to comment #6)
> I can't remember why, buy I always compile with -Wstrict-overflow.
> With CFLAGS="-O3 -Wstrict-overflow", the warning occures.
> 
> The gcc man page seems to suggest that -Wall implies -Wstrict-overflow, so I
> was thinking that this is important. But there is no warning with CFLAGS="-O3
> -Wall" alone.

There are multiple levels of -Wstrict-overflow; it must use a higher level if you specify it explicitly than if you get it implicitly via -Wall.

> I still would fix the code to avoid the warning, because there are only few
> place and I already did it. It is common that compiler warnings point out
> non-issues too, and then it is correct to tweak the code just to make the
> compiler happy.

I'm fine with tweaking code to make the compiler happy in cases where we understand why the compiler is unhappy, and the tweaked code is as correct/readable as the untweaked code. But in the parse_output_fields() case, we don't understand what the warning is about, and both patches result in very non-idiomatic code (which would be likely to get rewritten back to the original form by the next person to look at it anyway).
Comment 8 Thomas Haller 2014-07-31 22:27:36 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > I can't remember why, buy I always compile with -Wstrict-overflow.
> > With CFLAGS="-O3 -Wstrict-overflow", the warning occures.
> > 
> > The gcc man page seems to suggest that -Wall implies -Wstrict-overflow, so I
> > was thinking that this is important. But there is no warning with CFLAGS="-O3
> > -Wall" alone.
> 
> There are multiple levels of -Wstrict-overflow; it must use a higher level if
> you specify it explicitly than if you get it implicitly via -Wall.
> 
> > I still would fix the code to avoid the warning, because there are only few
> > place and I already did it. It is common that compiler warnings point out
> > non-issues too, and then it is correct to tweak the code just to make the
> > compiler happy.
> 
> I'm fine with tweaking code to make the compiler happy in cases where we
> understand why the compiler is unhappy, and the tweaked code is as
> correct/readable as the untweaked code. But in the parse_output_fields() case,
> we don't understand what the warning is about, and both patches result in very
> non-idiomatic code (which would be likely to get rewritten back to the original
> form by the next person to look at it anyway).

I tweaked the code in another way that looks still nice.


Repushed
Comment 9 Dan Williams 2014-08-01 02:42:55 UTC
Branch looks good to me.

At least a couple should get pushed to nm-0-9-10 too, like:

core: fix wrong device state change on failure in NMDeviceEthernet
vpn: fix warning in vpn-manager about implicit conversion of enum types
cli: fix warning about uninitialized value
Comment 10 Thomas Haller 2014-08-01 14:43:03 UTC
Merged.


to master:
http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=b456900531376d724411059b224139424c997159

(merged branch had a bug, so you need http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=d005c77213fac8dc2635a5b4e30e2999c0126a45 too).



Backported to nm-0-9-10 branch:
http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=5fe7d375a6198ed6c0bc98e1bc2a0410c1b48896 (which includes d005c772)




btw. the reason why gcc was fast then clang, was because I used ccache with gcc without noticing.

ccache also works with clang, then the build time is comparable.

export CC='ccache clang'
export CCACHE_CPP2=yes
Comment 11 Thomas Haller 2014-08-01 16:04:03 UTC
I opened two bugs against glib, related to workarounds that were necessary for this branch:

bug 734126  add G_GNUC_BEGIN_IGNORE_DEPRECATIONS macro for clang
bug 734128  clang fails to know that g_error() never returns