Bug 734146 – [review] [th/bgo734146_polkit] rework policykit (polkit) support to not require polkit library but use directly DBUS

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 734146 - [review] [th/bgo734146_polkit] rework policykit (polkit) support to not require polkit library but use directly DBUS


Summary:	[review] [th/bgo734146_polkit] rework policykit (polkit) support to not requi...


Status:	RESOLVED FIXED

Product:	NetworkManager
Classification:	Platform
Component:	general
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	NetworkManager maintainer(s)
QA Contact:	NetworkManager maintainer(s)

URL:
Whiteboard:

Depends on:
Blocks:	nm-1.0

Reported:	2014-08-01 22:23 UTC by Thomas Haller
Modified:	2014-09-29 11:57 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Thomas Haller 2014-08-01 22:23:14 UTC

Extract the use of policykit, so that it can be installed separately as plugin.

Comment 1 Thomas Haller 2014-08-17 21:21:28 UTC

Instead of making polkit a plugin, I (heavily) reworked the code and use the DBUS interface directly.


Please review: th/bgo734146_polkit

Comment 2 Pavel Simerda 2014-08-19 14:36:26 UTC

Any specific reason for the change of mind?

Comment 3 Thomas Haller 2014-08-19 16:13:29 UTC

(In reply to comment #2)
> Any specific reason for the change of mind?

Oh sure.

the original intent was to split out polkit into a separate plugin, so that NM-main has no dependency and that users can conveniently install NM without polkit -- previously you would have to recompile with --disable-polkit.


With the change now, NM has (also) no dependency on polkit, neither at build time, nor at runtime (apart from the running polkit service). Instead you compile it always, and you can enable/disable it in the configuration.

It makes packaging easier (because there is no separate NM-polkit package).

Also, "auth-polkit=yes" has smaller memory footprint, because we don't need to load a libnm-plugin-polkit.so (and no libpolkit-gobject-1.so).




I see only three minor downsides:

- I needed to reimplement some code from the library which adds potential for errors. Thanks to GDBus it wasn't that much and I copied a lot from polkit library.

- if we ever need some more functionality from polkit, it might be a bit more work to reimplement that part then to use the polkit-gobject wrapper. OTOH, probably that need will not arise, and even then it should be simple to use GDBus directly.

- if the user really does not want to use polkit, we always compile some kb of unused code. Maybe it's worth to add a configure option allowing the user to disable it entirely. Hm, I add a configure option for that to see what it effectively brings.

Comment 4 Thomas Haller 2014-08-20 12:28:57 UTC

Pushed one fixup commit, and add another commit to allow not compiling any polkit support at all.


I just wanted to see, how much the polkit support now costs (in terms of object code). Reduced size of the executable is the only win here, because disabling polkit via the  configuration (but still building it) already adds virtually no runtime costs for polkit.

With my usual build options, --enable-polkit=disabled only saves ~19KB (~12KB stripped) code. Arguably, the saving is so minuscule, that we could drop that commit. OTOH, it doesn't add that much of complexity in code either.

Comment 5 Dan Williams 2014-08-20 22:19:28 UTC

I don't mind keeping the compile-time disable stuff.  12KB stripped is actually quite a bit of code, the ADSL plugin reduced the size of NM itself by only 12KB, but if we keep finding 5 or 10 or 15k here and there then suddenly we're down 100KB and that's cool.

Comment 6 Dan Williams 2014-08-20 23:03:26 UTC

> auth: rework polkit autorization to use DBUS interface directly

In nm_auth_subject_get_uid() and nm_auth_subject_get_pid() I don't think TYPE_INTERNAL should be valid?  If the subject is internal, then it shouldn't have the uid and the pid set, because it's an internal request and thus doesn't come from any UID or PID.  Internal is meant for auto-activation or for connection assume instances, which are never triggered by a process or bus name.

Could we move _nm_auth_subject_to_gvariant() and _nm_auth_subject_to_str() into nm-auth-subject.c?  Then we (a) can verify what type the subject is, and (b) we can remove nm_auth_subject_get_system_bus_name_owner(), nm_auth_subject_get_unix_process_start_time(), nm_auth_subject_is_system_bus_name(), in nm-auth-subject.c since they are only called from those _nm_auth_subject_to_XX() functions.

Also, I would s/nm_auth_subject_new_system_bus_name_with_owner/nm_auth_subject_new_with_dbus_sender.  The reason is that the sender is separate from the bus name, and almost all of the time, the sender won't have a bus name.  The "owner"/"dbus-sender" here is like ":0.34" and that's not really a bus name.  Bus names are like "org.freedesktop.NetworkManager".  Instead, the owner/sender is more like an IP address, it just identifies the peer.  So the name "system_bus_name" seems wrong to me.  (but it turns out we don't even need this function, see below...)

Also, for the NM_AUTH_SUBJECT_IS_XXX properties, since all the types are mutually exclusive, I would just have an NM_AUTH_SUBJECT_TYPE enum property and expose SubjectType in nm-auth-subject.h perhaps?  It seems odd to have 3 boolean properties that are all mutually exclusive, when we already have the enum for them.

Even better though, it turns out we never actually used nm_auth_chain_new_dbus_sender() previously.  So that means in the *new* code, we can remove nm_auth_chain_new_dbus_sender() and we'll *always* have a subject in _auth_chain_new().  Which means we can get rid of the "dbus_sender" and "user_uid" arguments to _auth_chain_new() because they are only used from nm_auth_chain_new_dbus_sender().  That also means we can get rid of priv->owner completely, which means we don't need nm_auth_subject_new_system_bus_name_with_owner() anymore too.

Yay!  Simpler.  That's all I've got for now, lets see what the branch looks like after that.

All in all, I like the branch and it's a good cleanup.

Comment 7 Thomas Haller 2014-08-21 16:07:16 UTC

(In reply to comment #6)
> > auth: rework polkit autorization to use DBUS interface directly
> 
> In nm_auth_subject_get_uid() and nm_auth_subject_get_pid() I don't think
> TYPE_INTERNAL should be valid?  If the subject is internal, then it shouldn't
> have the uid and the pid set, because it's an internal request and thus doesn't
> come from any UID or PID.  Internal is meant for auto-activation or for
> connection assume instances, which are never triggered by a process or bus
> name.

I had that before, but then lots of assertion fail. At least nm_auth_subject_get_uid() gets called on internal subjects too.
However, I changed it for get_pid().

For example, the following stack trace:

+ Trace 233981

#0 g_logv
at gmessages.c line 989
#1 g_log
at gmessages.c line 1025
#2 nm_auth_subject_get_uid
at nm-auth-subject.c line 243
#3 nm_manager_activate_connection
at nm-manager.c line 3032
#4 auto_activate_device
at nm-policy.c line 1029
#5 g_main_dispatch
at gmain.c line 3066
#6 g_main_context_dispatch
at gmain.c line 3642
#7 g_main_context_iterate
at gmain.c line 3713
#8 g_main_loop_run
at gmain.c line 3907
#9 main
at main.c line 682




For the rest, I removed lots of code (including all the SYSTEM_BUS type and nm_auth_chain_new_dbus_sender(), as you suggested.

It turns out, we also have to queue calls until the GDBusProxy is ready. Added another fixup for that.


Regarding the --enable-polkit=disabled commit, it saves 12384 bytes stripped code (with my common build options).


Repushed

Comment 8 Dan Williams 2014-08-27 19:04:05 UTC

(In reply to comment #7)
> (In reply to comment #6)
> > > auth: rework polkit autorization to use DBUS interface directly
> > 
> > In nm_auth_subject_get_uid() and nm_auth_subject_get_pid() I don't think
> > TYPE_INTERNAL should be valid?  If the subject is internal, then it shouldn't
> > have the uid and the pid set, because it's an internal request and thus doesn't
> > come from any UID or PID.  Internal is meant for auto-activation or for
> > connection assume instances, which are never triggered by a process or bus
> > name.
> 
> I had that before, but then lots of assertion fail. At least
> nm_auth_subject_get_uid() gets called on internal subjects too.
> However, I changed it for get_pid().
> 
> For example, the following stack trace:

That's because nm_auth_uid_in_acl() requires a UID, but it turns out we don't need one there at all, we could just pass in NMAuthSubject instead and have nm_auth_uid_in_acl() check for the subject type.  If it's internal, pass (since UID = 0 in that case).  The only place that passes a numeric UID right now is in the connection_request_add_agent(), but agents have their own subject anyway, so we can just call nm_secret_agent_get_subject() there too.

(In fact, we can actually remove the session monitor stuff from there eventually too (per bug 686997) but that currently conflicts with your changes so we'll do that after this branch gets merged).

Comment 9 Dan Williams 2014-08-27 19:09:45 UTC

> libnm: do not assert secret flag in get_secret_flags()

Can you post a stacktrace for this?  I think it's a programmer error to call nm_setting_get_secret_flags() on any non-secret property, so I'm really curious where this is coming from.  All the callers in nm-setting.c already check if (prop_spec->flags & NM_SETTING_PARAM_SECRET) so it should be safe to do this.

Comment 10 Dan Williams 2014-08-27 20:57:15 UTC

I'd still like to see NM_AUTH_SUBJECT_IS_INTERNAL and NM_AUTH_SUBJECT_IS_UNIX_PROCESS consolidated into a single NM_AUTH_SUBJECT_TYPE property of type NMAuthSubjectType (eg, SubjectType renamed).  Having two boolean properties that are mutually exclusive, and are actually converted to the enum internally seems redundant, we should just use the enum.

nm_auth_subject_to_polkit_gvariant() should also just g_return_val_if_fail (subject_type == UNIX_PROCESS) at the top instead of bothering to build the variant and then just clear it.  The only caller nm_auth_manager_polkit_authority_check_authorization() should never ever call the function with a non-unix-process subject.

A) few things with nm_auth_chain_add_call():

1) it should g_return_val_if_fail (permission != NULL, FALSE); at the top; all callers *must* pass a permission, otherwise there's no reason at all to call this function.  It's sole purpose is to authorize a permission, so passing it a NULL permission makes no sense

2) the "gs_unref_object NMAuthSubject *subject = NULL;" looks like dead code since it's never assigned value.  NMAuthChains always have a subject, so I think you can remove that variable and always use "self->subject" in the call to nm_auth_manager_polkit_authority_check_authorization()

3) I think we can remove the "gboolean" and make nm_auth_chain_add_call() return 'void' now.  No caller pays attention to the return value, because the function is designed to always schedule the error response in an idle handler to simplify caller's code.

With those three things, I think the function could look like this:

void
nm_auth_chain_add_call (NMAuthChain *self,
                        const char *permission,
                        gboolean allow_interaction)
{
	AuthCall *call;
	NMAuthManager *auth_manager = nm_auth_manager_get ();

	g_return_val_if_fail (self != NULL, FALSE);
	g_return_val_if_fail (self->subject, FALSE);
	g_return_val_if_fail (permission != NULL, FALSE);
	g_return_val_if_fail (nm_auth_subject_is_unix_process (self->subject) || nm_auth_subject_is_internal (self->subject), FALSE);

	call = auth_call_new (self, permission);

	if (   nm_auth_subject_is_unix_process (self->subject)
	    && nm_auth_subject_get_uid (self->subject) == 0
	    && nm_auth_manager_get_polkit_enabled (auth_manager)) {
		/* Non-root always gets authenticated when using polkit */
#if WITH_POLKIT
		call->cancellable = g_cancellable_new ();
		if (!nm_auth_manager_polkit_authority_check_authorization (auth_manager,
			                                                       self->subject,
			                                                       permission,
			                                                       allow_interaction,
			                                                       call->cancellable,
			                                                       pk_call_cb,
			                                                       call))
			auth_call_schedule_complete_with_error (call, "Unexpected error asking polkit for authorization");
#else
		auth_call_schedule_complete_with_error (call, "Polkit support is disabled at compile time");
#endif
	} else {
		/* Root user, internal, and non-polkit always get the permission */
		nm_auth_chain_set_data (self, permission, GUINT_TO_POINTER (NM_AUTH_CALL_RESULT_YES), NULL);
		call->call_idle_id = g_idle_add ((GSourceFunc) auth_call_complete, call);
	}
}

B) can we just get rid of SUBJECT_TYPE_INVALID?  I don't think there's a huge need for that, because there are only two locations that actually create the subject:

_new_unix_process()
nm_auth_subject_new_internal()

And these two functions can certainly check validity and return NULL.  In fact, I don't think we should *ever* let an invalid subject get out of nm-auth-subject.c ever, and that implies that we dont' actually need any of the code elsewhere that does:

	g_return_val_if_fail (nm_auth_subject_is_unix_process (subject) || nm_auth_subject_is_internal (subject), NULL);

We should simply block an invalid subject from ever being created.  Given that we have two subject types anyway, all these lines are dead code since they will always be TRUE.

One way to determine validity would be to add a private _subject_is_valid() method which ensures validity and then if that returns FALSE, destroy the object in _new_unix_process()/nm_auth_subject_new_internal() and return NULL.  Basically, I think the code should just let the properties be set (which implies removing _set_subject_type() since that would only be used by the unix-process related properties, since we'd consolidate PROP_IS_INTERNAL and PROP_IS_UNIX_PROCESS per (B)) and then have a single validity function that contains the rules.  That makes it easier to see what the validity rules actually are, IMHO.

C) Per danw's recent email about this, there's some cleanup you can do with the GVariant code.  Like:

+_gvariant_to_polkit_authorization_result (GVariant *value,
+                                          gboolean *out_is_authorized,
+                                          gboolean *out_is_challenge)
+{
+	GVariant *dict;
+
+	g_variant_get (value,
+	               "(bb@a{ss})",
+	               out_is_authorized,
+	               out_is_challenge,
+	               &dict);
+	g_variant_unref (dict);
+}

I think you can skip the 'dict' there entirely and pass NULL, right?  Which means couldn't we just fold this function into check_authorization_cb()?  If you'd like to keep it separate, then move the child value processing for value/result_value into _gvariant_to_polkit_authorization_result() too.

nm_auth_manager_polkit_authority_check_authorization() - do you really need to sink *and* unref subject_value+details_value?  Can't we just pass them to g_variant_new() and let it consume them, and get rid of the sink + unref calls completely?

D) I think we should remove the "complete_in_idle" parameter from _call_check_authorization_complete_with_error() and just always complete in idle.  The only case where the code currently doesn't is _dbus_new_proxy_cb(), but that's only executed once during startup, and it's not a huge issue to complete those in idle anyway.  The only time this would be a speedup anyway is if Polkit is compiled into NM, but is not running on the system at all, and obviously at that point everything fails.

E) I think nm_auth_manager_polkit_authority_check_authorization_finish() should just accept gboolean * arguments for is_authorized and is_challenge instead of exposing NMAuthManagerPolkitResultCheckAuthorization, there's no good reason to make callers aware of the internal structure used to keep track of a tuple value.

Looking much better though!

Comment 11 Thomas Haller 2014-08-28 14:31:14 UTC

(In reply to comment #10)
> I'd still like to see NM_AUTH_SUBJECT_IS_INTERNAL and
> NM_AUTH_SUBJECT_IS_UNIX_PROCESS consolidated into a single NM_AUTH_SUBJECT_TYPE
> property of type NMAuthSubjectType (eg, SubjectType renamed).  Having two
> boolean properties that are mutually exclusive, and are actually converted to
> the enum internally seems redundant, we should just use the enum.

I like
  if (nm_auth_subject_is_unix_process (subject))
better then
  if (nm_auth_subject_get_type (subject) == NM_AUTH_SUBJECT_TYPE_UNIX_PROCESS)
?

_get_type() would be useful, if we had many different types and would occasionally:
  switch (nm_auth_subject_get_type (subject)) {
but, that is not going to happen.



> nm_auth_subject_to_polkit_gvariant() should also just g_return_val_if_fail
> (subject_type == UNIX_PROCESS) at the top instead of bothering to build the
> variant and then just clear it.  The only caller
> nm_auth_manager_polkit_authority_check_authorization() should never ever call
> the function with a non-unix-process subject.

done.


> A) few things with nm_auth_chain_add_call():
> 
> 1) it should g_return_val_if_fail (permission != NULL, FALSE); at the top; all
> callers *must* pass a permission, otherwise there's no reason at all to call
> this function.  It's sole purpose is to authorize a permission, so passing it a
> NULL permission makes no sense

done.


> 2) the "gs_unref_object NMAuthSubject *subject = NULL;" looks like dead code
> since it's never assigned value.  NMAuthChains always have a subject, so I
> think you can remove that variable and always use "self->subject" in the call
> to nm_auth_manager_polkit_authority_check_authorization()

done.


> 3) I think we can remove the "gboolean" and make nm_auth_chain_add_call()
> return 'void' now.  No caller pays attention to the return value, because the
> function is designed to always schedule the error response in an idle handler
> to simplify caller's code.

done.



> B) can we just get rid of SUBJECT_TYPE_INVALID?  I don't think there's a huge
> need for that, because there are only two locations that actually create the
> subject:
> 
> _new_unix_process()
> nm_auth_subject_new_internal()
>
> And these two functions can certainly check validity and return NULL.  In fact,
> I don't think we should *ever* let an invalid subject get out of
> nm-auth-subject.c ever, and that implies that we dont' actually need any of the
> code elsewhere that does:
> 
>     g_return_val_if_fail (nm_auth_subject_is_unix_process (subject) ||
> nm_auth_subject_is_internal (subject), NULL);

You can always create an invalid NMAuthSubject simply with

  g_object_new (NM_TYPE_AUTH_SUBJECT,
                NM_AUTH_SUBJECT_IS_INTERNAL, TRUE,
                NM_AUTH_SUBJECT_IS_UNIX_PROCESS, TRUE,
                NULL);

Sure, nobody is going to do that to current knowledge. Every single assert we have is not going to fail to current knowledge (except the one in get_secret_flags(), that always crashes NM just when I want to join a meeting :). Added the stack trace to the commit message).


I merged SUBJECT_TYPE_UNKNOWN with SUBJECT_TYPE_INVALID, do you like that better?


> We should simply block an invalid subject from ever being created.  Given that
> we have two subject types anyway, all these lines are dead code since they will
> always be TRUE.
>
> One way to determine validity would be to add a private _subject_is_valid()
> method which ensures validity and then if that returns FALSE, destroy the
> object in _new_unix_process()/nm_auth_subject_new_internal() and return NULL. 
> Basically, I think the code should just let the properties be set (which
> implies removing _set_subject_type() since that would only be used by the
> unix-process related properties, since we'd consolidate PROP_IS_INTERNAL and
> PROP_IS_UNIX_PROCESS per (B)) and then have a single validity function that
> contains the rules.  That makes it easier to see what the validity rules
> actually are, IMHO.


most of the validation was already done in constructed().
I moved the check for unix_process.start_time!=0 there too, and if the process information could not be read, we create an INVALID subject.


> C) Per danw's recent email about this, there's some cleanup you can do with the
> GVariant code.  Like:
> 
> +_gvariant_to_polkit_authorization_result (GVariant *value,
> +                                          gboolean *out_is_authorized,
> +                                          gboolean *out_is_challenge)
> +{
> +    GVariant *dict;
> +
> +    g_variant_get (value,
> +                   "(bb@a{ss})",
> +                   out_is_authorized,
> +                   out_is_challenge,
> +                   &dict);
> +    g_variant_unref (dict);
> +}
> 
> I think you can skip the 'dict' there entirely and pass NULL, right?  Which
> means couldn't we just fold this function into check_authorization_cb()?  If
> you'd like to keep it separate, then move the child value processing for
> value/result_value into _gvariant_to_polkit_authorization_result() too.
> 
> nm_auth_manager_polkit_authority_check_authorization() - do you really need to
> sink *and* unref subject_value+details_value?  Can't we just pass them to
> g_variant_new() and let it consume them, and get rid of the sink + unref calls
> completely?

done


> D) I think we should remove the "complete_in_idle" parameter from
> _call_check_authorization_complete_with_error() and just always complete in
> idle.  The only case where the code currently doesn't is _dbus_new_proxy_cb(),
> but that's only executed once during startup, and it's not a huge issue to
> complete those in idle anyway.  The only time this would be a speedup anyway is
> if Polkit is compiled into NM, but is not running on the system at all, and
> obviously at that point everything fails.

done


> E) I think nm_auth_manager_polkit_authority_check_authorization_finish() should
> just accept gboolean * arguments for is_authorized and is_challenge instead of
> exposing NMAuthManagerPolkitResultCheckAuthorization, there's no good reason to
> make callers aware of the internal structure used to keep track of a tuple
> value.

done




repushed.

Comment 12 Dan Williams 2014-09-10 19:24:04 UTC

I pushed a fixup commit that was my idea of what I suggested above. Basically:

- Combines PROP_IS_INTERNAL and PROP_IS_UNIX_PROCESS into PROP_SUBJECT_TYPE

- Changes PROP_SUBJECT_TYPE to get g_param_spec_enum(), which does bounds validation during set_property()

- Removed the "expected_subject_type == SUBJECT_TYPE_INVALID" from CHECK_SUBJECT because it could cause invalid subjects to pass the check

- in _new_unix_process() we need to set "success = FALSE" otherwise we return an uninitialized value

- do validation in nm_auth_subject_new_internal() just like in _new_unix_process() to ensure we never ever return an invalid subject

- set_property(): we don't need to do any parameter validation here (except for the subject_type > INVALID because INVALID is an enum member) because these properties are all CONSTRUCT_ONLY, and the validation that was being done was duplicated in constructed().  Plus, we don't  need to check for != G_MAXULONG because the value is *already* G_MAXULONG due to nm_auth_subject_init(), and there's no problem in re-setting it to the same value.  Might as well save some LoC and make it a bit simpler

Thoughts?  I really do want the combined property because having two properties for the two mutually exclusive types doesn't make sense to me.

Comment 13 Thomas Haller 2014-09-23 16:02:11 UTC

(In reply to comment #12)
> I pushed a fixup commit that was my idea of what I suggested above. Basically:
> 
> - Combines PROP_IS_INTERNAL and PROP_IS_UNIX_PROCESS into PROP_SUBJECT_TYPE
> 
> - Changes PROP_SUBJECT_TYPE to get g_param_spec_enum(), which does bounds
> validation during set_property()
> 
> - Removed the "expected_subject_type == SUBJECT_TYPE_INVALID" from
> CHECK_SUBJECT because it could cause invalid subjects to pass the check
> 
> - in _new_unix_process() we need to set "success = FALSE" otherwise we return
> an uninitialized value
> 
> - do validation in nm_auth_subject_new_internal() just like in
> _new_unix_process() to ensure we never ever return an invalid subject
> 
> - set_property(): we don't need to do any parameter validation here (except for
> the subject_type > INVALID because INVALID is an enum member) because these
> properties are all CONSTRUCT_ONLY, and the validation that was being done was
> duplicated in constructed().  Plus, we don't  need to check for != G_MAXULONG
> because the value is *already* G_MAXULONG due to nm_auth_subject_init(), and
> there's no problem in re-setting it to the same value.  Might as well save some
> LoC and make it a bit simpler
> 
> Thoughts?  I really do want the combined property because having two properties
> for the two mutually exclusive types doesn't make sense to me.

Ok.

I rebased the branch on master, and added yet another fixup on top of yours.
(also, I reordered the patches).

How is it now?

Comment 14 Dan Williams 2014-09-23 23:15:25 UTC

Looks fine to me now!

Comment 15 Thomas Haller 2014-09-29 11:57:25 UTC

merged: http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=4cc3c6ab71cc9d89c3345752bd26262d60f636e4