After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 767160 - Implement HTTP Strict Transport Security (HSTS)
Implement HTTP Strict Transport Security (HSTS)
Status: RESOLVED OBSOLETE
Product: libsoup
Classification: Core
Component: HTTP Transport
unspecified
Other Linux
: Normal normal
: ---
Assigned To: libsoup-maint@gnome.bugs
libsoup-maint@gnome.bugs
: 628298 (view as bug list)
Depends on:
Blocks: 628298
 
 
Reported: 2016-06-02 12:09 UTC by Adrien Plazas
Modified: 2018-09-21 16:25 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Add SoupHstsEnforcer (151.21 KB, patch)
2016-06-02 12:14 UTC, Adrien Plazas
none Details | Review
Add SoupHstsEnforcer (151.46 KB, patch)
2016-06-13 20:11 UTC, Adrien Plazas
none Details | Review
Add SoupHstsEnforcer (178.93 KB, patch)
2018-06-07 08:06 UTC, Claudio Saavedra
reviewed Details | Review

Description Adrien Plazas 2016-06-02 12:09:55 UTC
It would be nice to have HSTS (http://tools.ietf.org/html/rfc6797) implemented into libsoup.
Comment 1 Adrien Plazas 2016-06-02 12:14:39 UTC
Created attachment 328945 [details] [review]
Add SoupHstsEnforcer

Here is a patch implementing most of HSTS as the SoupSessionFeature nmed SoupHstsEnforcer. A permanent SQLite based version of the feature exists as SoupHstsEnforcerDb.

I'm looking for some feedback on the implementation of the feature and its quality. Of course having everything in a single commit is dirty, it's not the part that need review for now.
Comment 2 Michael Catanzaro 2016-06-11 17:06:11 UTC
Review of attachment 328945 [details] [review]:

I have a lot of feedback since this is a big patch, but overall I'm very pleased; this is good quality code. It's clear you've been writing GNOME code for a while and you already well understand most of our conventions.

Now, prepare for some review pain.

First up: you need to update soup-autocleanups.h!

Secondly: Dan won't accept this without tests. ;)

Thirdly: you use g_return_if_fail in many static functions. We normally only use these in public functions, to help developers using the library.

::: libsoup/soup-hsts-enforcer-db.c
@@ +24,3 @@
+ *
+ * #SoupHstsEnforcerDB is a #SoupHstsEnforcer that reads HSTS policies from
+ * and writes them to a sqlite database.

Hm, this phrasing is a bit awkward, maybe "#SoupHstsEnforcerDB is a #SoupHstsEnforcer that stores HSTS policy information in an SQLite database."

@@ +136,3 @@
+
+static int
+callback (void *data, int argc, char **argv, char **colname)

Needs a more descriptive name... "load_cb"? "load_policy_cb"? "load_policy_for_row"?

@@ +163,3 @@
+		soup_hsts_enforcer_set_policy (hsts_enforcer, policy);
+	else
+		soup_date_free (expires);

Hm, this looks awkward; I think soup_hsts_policy_new should take care of freeing the SoupDate on failure. You don't want it to be transfer full when it succeeds and transfer none only when it fails, right? I'm actually not sure what our convention is here, so please ask Carlos or Dan for a second opinion on this; I guess it would also be awkward to free it inside the function, as then you'd need your g_return_if_fail inside a conditional that checks the same condition....

There are several callers you'd need to update if you change this. Anyway, good job making sure it doesn't leak.

@@ +182,3 @@
+exec_query_with_try_create_table (sqlite3 *db,
+				  const char *sql,
+				  int (*callback)(void*,int,char**,char**),

You should use a typedef for this. The typical GNOME style would be:

typedef int (*SQLiteCallback) (void *, int, char **, char **);

Then you can simply write "SQLiteCallback callback" in the parameter list instead of "int (*callback)(void*,int,char**,char**)", much easier to read.

@@ +188,3 @@
+	gboolean try_create = TRUE;
+
+try_exec:

This use of goto was not your best code style decision. :p As a rule, it's best to use goto only for error handling. Look how much cleaner this is if you write it linearly:

		if (sqlite3_exec (db, sql, callback, argument, NULL)) {
			char *error = NULL;
			try_create_table (db);
			if (sqlite3_exec (db, sql, callback, argument, &error) {
				g_warning ("Failed to execute query: %s", error);
				sqlite3_free (error);
			}
		}

@@ +203,3 @@
+}
+
+/* Follows sqlite3 convention; returns TRUE on error */

Good use of comments; this part would be confusing otherwise, but you avoid unnecessary comments elsewhere in the code. Good.

@@ +213,3 @@
+
+	if (sqlite3_open (priv->filename, &priv->db)) {
+		sqlite3_close (priv->db);

This is correct, because "[w]hether or not an error occurs when it is opened, resources associated with the database connection handle should be released by passing it to sqlite3_close() when it is no longer required." But I want to point out it's insane. Never design an API like this, where it requires a call to close() when open() fails. I bet failure to do this is a common mistake when working with sqlite.

@@ +220,3 @@
+
+	if (sqlite3_exec (priv->db, "PRAGMA synchronous = OFF; PRAGMA secure_delete = 1;", NULL, NULL, &error)) {
+		g_warning ("Failed to execute query: %s", error);

This is intended to be nonfatal? It'd be a good idea to leave a one-line comment, else some future programmer might decide you just forgot and add a return TRUE here.

@@ +258,3 @@
+		query = sqlite3_mprintf (QUERY_DELETE,
+					 old_policy->domain);
+		exec_query_with_try_create_table (priv->db, query, NULL, NULL);

Not a big deal, but exec_query_with_try_create_table doesn't seem ideal here: you're just trying to delete the policy, so if the table doesn't exist, it doesn't make sense to try creating it. Probably would be best to use sqlite3_exec directly here.

@@ +293,3 @@
+
+	hsts_enforcer_class->is_persistent = soup_hsts_enforcer_db_is_persistent;
+	hsts_enforcer_class->changed       = soup_hsts_enforcer_db_changed;

I checked a couple files and leaving the extra space here is not the right style for libsoup (nor for most GNOME code, I think).

@@ +295,3 @@
+	hsts_enforcer_class->changed       = soup_hsts_enforcer_db_changed;
+
+	object_class->finalize     = soup_hsts_enforcer_db_finalize;

Ditto.

::: libsoup/soup-hsts-enforcer.c
@@ +6,3 @@
+ */
+
+/* TODO Use only internationalized domain names */

Dan, does libsoup already support IDNA? If not, perhaps we can leave this as TODO and land anyway.

@@ +36,3 @@
+ *
+ * Note that the base #SoupHstsEnforcer class does not support any form
+ * of long-term HSTS policy persistence.

Mention #SoupHstsEnforcerDb here.

@@ +64,3 @@
+
+	priv->host_policies = g_hash_table_new_full (soup_str_case_hash,
+						soup_str_case_equal,

Indentation is wrong here

@@ +80,3 @@
+
+	g_hash_table_iter_init (&iter, priv->host_policies);
+	while (g_hash_table_iter_next (&iter, &key, &value))

Don't iterate over these hash tables in finalize. Instead, just pass soup_hsts_policy_free as the value_destroy_func to g_hash_table_new_full.

@@ +139,3 @@
+ *
+ * Creates a new #SoupHstsEnforcer. The base #SoupHstsEnforcer class does
+ * not support persistent storage of HSTS policies; use a subclass for

"use a subclass, such as #SoupHstsEnforcerDb, for that"

@@ +158,3 @@
+	g_return_if_fail (SOUP_IS_HSTS_ENFORCER (hsts_enforcer));
+
+	g_assert_true (old || new);

Just use g_assert; g_assert_true is for testcases.

@@ +183,3 @@
+			g_hash_table_remove (priv->host_policies, domain);
+			soup_hsts_enforcer_changed (hsts_enforcer, policy, NULL);
+			soup_hsts_policy_free (policy);

You shouldn't have to free this here, because you'll pass the free function to g_hash_table_new_full.

@@ +203,3 @@
+	policy = g_hash_table_lookup (priv->host_policies, domain);
+
+	g_assert_nonnull (policy);

I think libsoup style is:

g_assert (policy != NULL);

My preferred style is:

g_assert (policy);

@@ +207,3 @@
+	g_hash_table_remove (priv->host_policies, domain);
+	soup_hsts_enforcer_changed (hsts_enforcer, policy, NULL);
+	soup_hsts_policy_free (policy);

You won't need this free.

@@ +225,3 @@
+	g_return_if_fail (new_policy != NULL);
+
+	g_assert_false (soup_hsts_policy_is_expired (new_policy));

g_assert (!soup_hsts_policy_is_expired (new_policy));

@@ +238,3 @@
+	old_policy = g_hash_table_lookup (policies, domain);
+
+	g_assert_nonnull (old_policy);

g_assert

@@ +261,3 @@
+	g_return_if_fail (policy != NULL);
+
+	g_assert_false (soup_hsts_policy_is_expired (policy));

g_assert

@@ +360,3 @@
+ * @include_sub_domains: %TRUE if the policy applies on sub domains
+ *
+ * Sets a session policy@domain's HSTS policy to @policy. If @policy is expired, any

missing space

@@ +421,3 @@
+}
+
+static inline const gchar*

This is too big to be inlined.

Also: gchar *

@@ +469,3 @@
+	g_return_if_fail (msg != NULL);
+
+	/* TODO if connection error or warnings received, do nothing. */

This looks important. ;)

@@ +471,3 @@
+	/* TODO if connection error or warnings received, do nothing. */
+
+	/* TODO if header received on hazardous connection, do nothing. */

This looks important. ;)

@@ +499,3 @@
+	g_return_val_if_fail (uri != NULL, FALSE);
+
+	// HSTS secures only HTTP connections.

What about web sockets? For WebKit that does have to be handled at the WebKit level, but soup has a separate implementation of WebSockets it would be good to protect. You can use a TODO here.

@@ +523,3 @@
+	soup_uri_free (dst_uri);
+
+	soup_message_set_redirect (msg, 301, dst);

Probably for the best.

::: libsoup/soup-hsts-enforcer.h
@@ +33,3 @@
+			 SoupHstsPolicy   *new_policy);
+
+	/* Padding for future expansion */

I think we usually use four bytes of padding to give a bit more flexibility for future expansion.

::: libsoup/soup-hsts-policy.c
@@ +21,3 @@
+ * @see_also: #SoupHstsEnforcer
+ *
+ * #SoupHstsPolicy implements HTTP policies, as described by <ulink

"#SoupHstsPolicy stores HSTS policy details, as described by"...

I wouldn't say "implements" because there's not much implementation here; the whole class can be thought of as data with some convenience wrappers.

Note: HTTP->HSTS

@@ +42,3 @@
+ * @expires will be non-%NULL if the policy has been set by the host and
+ * hence has an expiry time. If @expires is %NULL, it indicates that the
+ * policy is a session policy set by the user agent.

Maybe elaborate on "session policy" more here. It's intended to be used for preload lists, right?

@@ +58,3 @@
+ * Copies @policy.
+ *
+ * Return value: a copy of @policy

Pretty sure this needs to be (transfer full). It's obvious to you and me, but I think language bindings will leak it otherwise.

@@ +65,3 @@
+soup_hsts_policy_copy (SoupHstsPolicy *policy)
+{
+	SoupHstsPolicy *copy = g_slice_new0 (SoupHstsPolicy);

This is fine, but I will mention that there is some question as to whether we should be using GSlice anymore. See https://git.gnome.org/browse/gnome-devel-docs/commit/programming-guidelines/C/memory-management.page?id=5fca5a4e6f2b17e79fb2e5aec518d61268123972

Nowadays I usually use g_new, but there's been talk of rewriting the GSlice allocator, so it's still OK to use IMO; just something to be aware of.

@@ +68,3 @@
+
+	copy->domain = g_strdup (policy->domain);
+	copy->expires = policy->expires ? soup_date_copy(policy->expires)

Missing space here.

@@ +78,3 @@
+ * soup_hsts_policy_equal:
+ * @policy1: a #SoupCookie
+ * @policy2: a #SoupCookie

Need to update this comment block to remove the references to SoupCookie, cookie domains, and fix the Since: 2.24 tag.

@@ +113,3 @@
+}
+
+static inline const char *

I don't know how to guess about the performance implications of inlining this code, but there's a decent chance this is a pessimization as the function includes a loop. It's usually better to let GCC decide for you what to inline and what not to inline; it's a static function so it's free to do that.

@@ +114,3 @@
+
+static inline const char *
+skip_lws (const char *s)

Write it out: skip_leading_whitespace

@@ +122,3 @@
+
+static inline const char *
+unskip_lws (const char *s, const char *start)

Same problems here.

@@ +178,3 @@
+
+		has_value = (*p == '=');
+#define MATCH_NAME(name) ((end - start == strlen (name)) && !g_ascii_strncasecmp (start, name, end - start))

Put this outside the function

@@ +220,3 @@
+						  include_sub_domains);
+
+fail:

Since there's no memory to free here, I would just return NULL everywhere instead of goto fail. It's personal preference, though (unless Dan says otherwise ;)

@@ +239,3 @@
+		return FALSE;
+
+	/* IP addresses are not valid hostnames, only domain names are.

Yes, very good; I had starred this in my copy of the spec to see that you got this right. Now, to make sure some future programmer doesn't break this, you should mention the part of the specification that requires this (8.1.1).

@@ +244,3 @@
+		return FALSE;
+
+	/* The hostname should be a valid domain name.

This seems like the right place to:

 * Check that the hostname is valid UTF-8
 * Check that it's valid IDNA

The first is easy (g_utf8_validate), but IDNA will require some research. A 'git grep -i idna' indicates that libsoup might not support IDNA at all (ask Dan), so perhaps leaving a FIXME here would suffice. But if IDNA is already supported somehow, we probably need to check that here.

@@ +266,3 @@
+ * must also be enforced on all subdomains of @domain.
+ *
+ * Return value: a new #SoupHstsPolicy.

(transfer full)

@@ +279,3 @@
+
+	policy = g_slice_new0 (SoupHstsPolicy);
+	policy->domain = g_strdup (domain);

I wonder if we should check the domain here, instead of assuming it's valid. Then these functions would need to be nullable. Make a decision based on how you use these functions elsewhere.

@@ +304,3 @@
+ * must also be enforced on all subdomains of @domain.
+ *
+ * Return value: a new #SoupHstsPolicy.

Ditto

@@ +347,3 @@
+ * must also be enforced on all subdomains of @domain.
+ *
+ * Return value: a new #SoupHstsPolicy.

Ditto

@@ +367,3 @@
+ * "Strict-Transport-Security" response header was found.
+ *
+ * Return value: (nullable): a new #SoupHstsPolicy, or %NULL if no valid

Ditto.
Comment 3 Michael Catanzaro 2016-06-11 17:06:28 UTC
Review of attachment 328945 [details] [review]:

I have a lot of feedback since this is a big patch, but overall I'm very pleased; this is good quality code. It's clear you've been writing GNOME code for a while and you already well understand most of our conventions.

Now, prepare for some review pain.

First up: you need to update soup-autocleanups.h!

Secondly: Dan won't accept this without tests. ;)

Thirdly: you use g_return_if_fail in many static functions. We normally only use these in public functions, to help developers using the library.

::: libsoup/soup-hsts-enforcer-db.c
@@ +24,3 @@
+ *
+ * #SoupHstsEnforcerDB is a #SoupHstsEnforcer that reads HSTS policies from
+ * and writes them to a sqlite database.

Hm, this phrasing is a bit awkward, maybe "#SoupHstsEnforcerDB is a #SoupHstsEnforcer that stores HSTS policy information in an SQLite database."

@@ +136,3 @@
+
+static int
+callback (void *data, int argc, char **argv, char **colname)

Needs a more descriptive name... "load_cb"? "load_policy_cb"? "load_policy_for_row"?

@@ +163,3 @@
+		soup_hsts_enforcer_set_policy (hsts_enforcer, policy);
+	else
+		soup_date_free (expires);

Hm, this looks awkward; I think soup_hsts_policy_new should take care of freeing the SoupDate on failure. You don't want it to be transfer full when it succeeds and transfer none only when it fails, right? I'm actually not sure what our convention is here, so please ask Carlos or Dan for a second opinion on this; I guess it would also be awkward to free it inside the function, as then you'd need your g_return_if_fail inside a conditional that checks the same condition....

There are several callers you'd need to update if you change this. Anyway, good job making sure it doesn't leak.

@@ +182,3 @@
+exec_query_with_try_create_table (sqlite3 *db,
+				  const char *sql,
+				  int (*callback)(void*,int,char**,char**),

You should use a typedef for this. The typical GNOME style would be:

typedef int (*SQLiteCallback) (void *, int, char **, char **);

Then you can simply write "SQLiteCallback callback" in the parameter list instead of "int (*callback)(void*,int,char**,char**)", much easier to read.

@@ +188,3 @@
+	gboolean try_create = TRUE;
+
+try_exec:

This use of goto was not your best code style decision. :p As a rule, it's best to use goto only for error handling. Look how much cleaner this is if you write it linearly:

		if (sqlite3_exec (db, sql, callback, argument, NULL)) {
			char *error = NULL;
			try_create_table (db);
			if (sqlite3_exec (db, sql, callback, argument, &error) {
				g_warning ("Failed to execute query: %s", error);
				sqlite3_free (error);
			}
		}

@@ +203,3 @@
+}
+
+/* Follows sqlite3 convention; returns TRUE on error */

Good use of comments; this part would be confusing otherwise, but you avoid unnecessary comments elsewhere in the code. Good.

@@ +213,3 @@
+
+	if (sqlite3_open (priv->filename, &priv->db)) {
+		sqlite3_close (priv->db);

This is correct, because "[w]hether or not an error occurs when it is opened, resources associated with the database connection handle should be released by passing it to sqlite3_close() when it is no longer required." But I want to point out it's insane. Never design an API like this, where it requires a call to close() when open() fails. I bet failure to do this is a common mistake when working with sqlite.

@@ +220,3 @@
+
+	if (sqlite3_exec (priv->db, "PRAGMA synchronous = OFF; PRAGMA secure_delete = 1;", NULL, NULL, &error)) {
+		g_warning ("Failed to execute query: %s", error);

This is intended to be nonfatal? It'd be a good idea to leave a one-line comment, else some future programmer might decide you just forgot and add a return TRUE here.

@@ +258,3 @@
+		query = sqlite3_mprintf (QUERY_DELETE,
+					 old_policy->domain);
+		exec_query_with_try_create_table (priv->db, query, NULL, NULL);

Not a big deal, but exec_query_with_try_create_table doesn't seem ideal here: you're just trying to delete the policy, so if the table doesn't exist, it doesn't make sense to try creating it. Probably would be best to use sqlite3_exec directly here.

@@ +293,3 @@
+
+	hsts_enforcer_class->is_persistent = soup_hsts_enforcer_db_is_persistent;
+	hsts_enforcer_class->changed       = soup_hsts_enforcer_db_changed;

I checked a couple files and leaving the extra space here is not the right style for libsoup (nor for most GNOME code, I think).

@@ +295,3 @@
+	hsts_enforcer_class->changed       = soup_hsts_enforcer_db_changed;
+
+	object_class->finalize     = soup_hsts_enforcer_db_finalize;

Ditto.

::: libsoup/soup-hsts-enforcer.c
@@ +6,3 @@
+ */
+
+/* TODO Use only internationalized domain names */

Dan, does libsoup already support IDNA? If not, perhaps we can leave this as TODO and land anyway.

@@ +36,3 @@
+ *
+ * Note that the base #SoupHstsEnforcer class does not support any form
+ * of long-term HSTS policy persistence.

Mention #SoupHstsEnforcerDb here.

@@ +64,3 @@
+
+	priv->host_policies = g_hash_table_new_full (soup_str_case_hash,
+						soup_str_case_equal,

Indentation is wrong here

@@ +80,3 @@
+
+	g_hash_table_iter_init (&iter, priv->host_policies);
+	while (g_hash_table_iter_next (&iter, &key, &value))

Don't iterate over these hash tables in finalize. Instead, just pass soup_hsts_policy_free as the value_destroy_func to g_hash_table_new_full.

@@ +139,3 @@
+ *
+ * Creates a new #SoupHstsEnforcer. The base #SoupHstsEnforcer class does
+ * not support persistent storage of HSTS policies; use a subclass for

"use a subclass, such as #SoupHstsEnforcerDb, for that"

@@ +158,3 @@
+	g_return_if_fail (SOUP_IS_HSTS_ENFORCER (hsts_enforcer));
+
+	g_assert_true (old || new);

Just use g_assert; g_assert_true is for testcases.

@@ +183,3 @@
+			g_hash_table_remove (priv->host_policies, domain);
+			soup_hsts_enforcer_changed (hsts_enforcer, policy, NULL);
+			soup_hsts_policy_free (policy);

You shouldn't have to free this here, because you'll pass the free function to g_hash_table_new_full.

@@ +203,3 @@
+	policy = g_hash_table_lookup (priv->host_policies, domain);
+
+	g_assert_nonnull (policy);

I think libsoup style is:

g_assert (policy != NULL);

My preferred style is:

g_assert (policy);

@@ +207,3 @@
+	g_hash_table_remove (priv->host_policies, domain);
+	soup_hsts_enforcer_changed (hsts_enforcer, policy, NULL);
+	soup_hsts_policy_free (policy);

You won't need this free.

@@ +225,3 @@
+	g_return_if_fail (new_policy != NULL);
+
+	g_assert_false (soup_hsts_policy_is_expired (new_policy));

g_assert (!soup_hsts_policy_is_expired (new_policy));

@@ +238,3 @@
+	old_policy = g_hash_table_lookup (policies, domain);
+
+	g_assert_nonnull (old_policy);

g_assert

@@ +261,3 @@
+	g_return_if_fail (policy != NULL);
+
+	g_assert_false (soup_hsts_policy_is_expired (policy));

g_assert

@@ +360,3 @@
+ * @include_sub_domains: %TRUE if the policy applies on sub domains
+ *
+ * Sets a session policy@domain's HSTS policy to @policy. If @policy is expired, any

missing space

@@ +421,3 @@
+}
+
+static inline const gchar*

This is too big to be inlined.

Also: gchar *

@@ +469,3 @@
+	g_return_if_fail (msg != NULL);
+
+	/* TODO if connection error or warnings received, do nothing. */

This looks important. ;)

@@ +471,3 @@
+	/* TODO if connection error or warnings received, do nothing. */
+
+	/* TODO if header received on hazardous connection, do nothing. */

This looks important. ;)

@@ +499,3 @@
+	g_return_val_if_fail (uri != NULL, FALSE);
+
+	// HSTS secures only HTTP connections.

What about web sockets? For WebKit that does have to be handled at the WebKit level, but soup has a separate implementation of WebSockets it would be good to protect. You can use a TODO here.

@@ +523,3 @@
+	soup_uri_free (dst_uri);
+
+	soup_message_set_redirect (msg, 301, dst);

Probably for the best.

::: libsoup/soup-hsts-enforcer.h
@@ +33,3 @@
+			 SoupHstsPolicy   *new_policy);
+
+	/* Padding for future expansion */

I think we usually use four bytes of padding to give a bit more flexibility for future expansion.

::: libsoup/soup-hsts-policy.c
@@ +21,3 @@
+ * @see_also: #SoupHstsEnforcer
+ *
+ * #SoupHstsPolicy implements HTTP policies, as described by <ulink

"#SoupHstsPolicy stores HSTS policy details, as described by"...

I wouldn't say "implements" because there's not much implementation here; the whole class can be thought of as data with some convenience wrappers.

Note: HTTP->HSTS

@@ +42,3 @@
+ * @expires will be non-%NULL if the policy has been set by the host and
+ * hence has an expiry time. If @expires is %NULL, it indicates that the
+ * policy is a session policy set by the user agent.

Maybe elaborate on "session policy" more here. It's intended to be used for preload lists, right?

@@ +58,3 @@
+ * Copies @policy.
+ *
+ * Return value: a copy of @policy

Pretty sure this needs to be (transfer full). It's obvious to you and me, but I think language bindings will leak it otherwise.

@@ +65,3 @@
+soup_hsts_policy_copy (SoupHstsPolicy *policy)
+{
+	SoupHstsPolicy *copy = g_slice_new0 (SoupHstsPolicy);

This is fine, but I will mention that there is some question as to whether we should be using GSlice anymore. See https://git.gnome.org/browse/gnome-devel-docs/commit/programming-guidelines/C/memory-management.page?id=5fca5a4e6f2b17e79fb2e5aec518d61268123972

Nowadays I usually use g_new, but there's been talk of rewriting the GSlice allocator, so it's still OK to use IMO; just something to be aware of.

@@ +68,3 @@
+
+	copy->domain = g_strdup (policy->domain);
+	copy->expires = policy->expires ? soup_date_copy(policy->expires)

Missing space here.

@@ +78,3 @@
+ * soup_hsts_policy_equal:
+ * @policy1: a #SoupCookie
+ * @policy2: a #SoupCookie

Need to update this comment block to remove the references to SoupCookie, cookie domains, and fix the Since: 2.24 tag.

@@ +113,3 @@
+}
+
+static inline const char *

I don't know how to guess about the performance implications of inlining this code, but there's a decent chance this is a pessimization as the function includes a loop. It's usually better to let GCC decide for you what to inline and what not to inline; it's a static function so it's free to do that.

@@ +114,3 @@
+
+static inline const char *
+skip_lws (const char *s)

Write it out: skip_leading_whitespace

@@ +122,3 @@
+
+static inline const char *
+unskip_lws (const char *s, const char *start)

Same problems here.

@@ +178,3 @@
+
+		has_value = (*p == '=');
+#define MATCH_NAME(name) ((end - start == strlen (name)) && !g_ascii_strncasecmp (start, name, end - start))

Put this outside the function

@@ +220,3 @@
+						  include_sub_domains);
+
+fail:

Since there's no memory to free here, I would just return NULL everywhere instead of goto fail. It's personal preference, though (unless Dan says otherwise ;)

@@ +239,3 @@
+		return FALSE;
+
+	/* IP addresses are not valid hostnames, only domain names are.

Yes, very good; I had starred this in my copy of the spec to see that you got this right. Now, to make sure some future programmer doesn't break this, you should mention the part of the specification that requires this (8.1.1).

@@ +244,3 @@
+		return FALSE;
+
+	/* The hostname should be a valid domain name.

This seems like the right place to:

 * Check that the hostname is valid UTF-8
 * Check that it's valid IDNA

The first is easy (g_utf8_validate), but IDNA will require some research. A 'git grep -i idna' indicates that libsoup might not support IDNA at all (ask Dan), so perhaps leaving a FIXME here would suffice. But if IDNA is already supported somehow, we probably need to check that here.

@@ +266,3 @@
+ * must also be enforced on all subdomains of @domain.
+ *
+ * Return value: a new #SoupHstsPolicy.

(transfer full)

@@ +279,3 @@
+
+	policy = g_slice_new0 (SoupHstsPolicy);
+	policy->domain = g_strdup (domain);

I wonder if we should check the domain here, instead of assuming it's valid. Then these functions would need to be nullable. Make a decision based on how you use these functions elsewhere.

@@ +304,3 @@
+ * must also be enforced on all subdomains of @domain.
+ *
+ * Return value: a new #SoupHstsPolicy.

Ditto

@@ +347,3 @@
+ * must also be enforced on all subdomains of @domain.
+ *
+ * Return value: a new #SoupHstsPolicy.

Ditto

@@ +367,3 @@
+ * "Strict-Transport-Security" response header was found.
+ *
+ * Return value: (nullable): a new #SoupHstsPolicy, or %NULL if no valid

Ditto.
Comment 4 Adrien Plazas 2016-06-13 16:08:11 UTC
Review of attachment 328945 [details] [review]:

Thanks for the review, it's really helpful! I prefer painful reviews than blend ones: it means we can do even better. =)

Here are some responses to some comments. Whenever I didn't respond is because I simply agreed on the change and applied it.

::: libsoup/soup-hsts-enforcer-db.c
@@ +163,3 @@
+		soup_hsts_enforcer_set_policy (hsts_enforcer, policy);
+	else
+		soup_date_free (expires);

Yes I wasn't sure how to handle this while avoiding useless copies and destruction of objects while keeping the function clean. I asked myself all the questions you did, but didn't find a good answer. :|

@@ +182,3 @@
+exec_query_with_try_create_table (sqlite3 *db,
+				  const char *sql,
+				  int (*callback)(void*,int,char**,char**),

Maybe we should update SoupCookieJarDb too then (see a later comment).

@@ +188,3 @@
+	gboolean try_create = TRUE;
+
+try_exec:

Maybe we should update SoupCookieJarDb too then (see the next comment).

@@ +213,3 @@
+
+	if (sqlite3_open (priv->filename, &priv->db)) {
+		sqlite3_close (priv->db);

Maybe we should wrap usage of SQLite into private functions to confine its different behaviors and share them with SoupCookieJarDb? Which mean we should clean up SoupCookieJarDb first, then use these SQLite wrappers in SoupHstsEnforcerDb.

@@ +220,3 @@
+
+	if (sqlite3_exec (priv->db, "PRAGMA synchronous = OFF; PRAGMA secure_delete = 1;", NULL, NULL, &error)) {
+		g_warning ("Failed to execute query: %s", error);

I have no idea, I just copied that code from SoupCookieJarDb. :/

Another code these classes could probably share.

::: libsoup/soup-hsts-enforcer.c
@@ +158,3 @@
+	g_return_if_fail (SOUP_IS_HSTS_ENFORCER (hsts_enforcer));
+
+	g_assert_true (old || new);

OK I'll update that, but don't we loose semantic by using g_assert instead of its specialized variants?

@@ +421,3 @@
+}
+
+static inline const gchar*

Even if we only use it in one place and split this code from where it is used for readability's sake, adding semantic via the function's name?

@@ +469,3 @@
+	g_return_if_fail (msg != NULL);
+
+	/* TODO if connection error or warnings received, do nothing. */

Indeed, but I didn't find how to handle that. =/

@@ +471,3 @@
+	/* TODO if connection error or warnings received, do nothing. */
+
+	/* TODO if header received on hazardous connection, do nothing. */

Ditto.

@@ +499,3 @@
+	g_return_val_if_fail (uri != NULL, FALSE);
+
+	// HSTS secures only HTTP connections.

I never really heard of WebSocket before, I have no idea how to handle it and if it requires something special. I added a "TODO".

@@ +523,3 @@
+	soup_uri_free (dst_uri);
+
+	soup_message_set_redirect (msg, 301, dst);

I just replaced 301 by SOUP_STATUS_MOVED_PERMANENTLY, for readability's sake.

::: libsoup/soup-hsts-policy.c
@@ +21,3 @@
+ * @see_also: #SoupHstsEnforcer
+ *
+ * #SoupHstsPolicy implements HTTP policies, as described by <ulink

I switched to "represents" instead of "implements" or "store" as I want SoupHstsPolicy to _be_ the policies, not a policy storage. Also "represents" feels less technical than the two others.

That being said, the stored data probably should be made private as no method allow to change it on purpose: I wanted policies to be immutable to simplify their usage. Any thought on that?

@@ +58,3 @@
+ * Copies @policy.
+ *
+ * Return value: a copy of @policy

Should we do this on constructors too then? As this is kinda like a constructor.

@@ +65,3 @@
+soup_hsts_policy_copy (SoupHstsPolicy *policy)
+{
+	SoupHstsPolicy *copy = g_slice_new0 (SoupHstsPolicy);

I copied it from SoupCookie, should I add a FIXME here to remember checking this out later?

@@ +113,3 @@
+}
+
+static inline const char *

Doesn't inlining a function just copies its code to its call points, saving the cost of function calls at the expense of a larger binary? Also doesn't inlining allow the compiler to perform more optimizations as the function's code is merged with its call point, breaking the optimization barrier that the function call is?

The only pessimization I can think of is that the binary will be larger but given how often we call this function and how small it is, I feel like inlining it is a good tradeoff. How can inlining a loop create a pessimization, it's the first time I ear about that?

Whatever your final opinion is I'll adapt to it, I just what to know more. =)

@@ +178,3 @@
+
+		has_value = (*p == '=');
+#define MATCH_NAME(name) ((end - start == strlen (name)) && !g_ascii_strncasecmp (start, name, end - start))

Maybe we should update SoupCookie too then. Maybe even making it an inline function and sharing it between SoupCookie and SoupHstsPolicy.

@@ +220,3 @@
+						  include_sub_domains);
+
+fail:

I did it that way so if for any reason we need to free memory or we want to print an error message we can easily do it. But yeah... goto. =)

Let's see what Dan suggest.

@@ +279,3 @@
+
+	policy = g_slice_new0 (SoupHstsPolicy);
+	policy->domain = g_strdup (domain);

I don't know, I usually don't like constructors returning NULL as it's really counter intuitive in higher level languages: for example in Vala you expect "new Soup.HstsPolicy (...)" to give you an object, not null. We should probably throw an error to give some information on why the construction failed instead, at least the semantic is clear even on higher level languages.

Another solution I like but which is more annoying to implement in C is to always let the construction go well and to lazily check the validity of the object when using it, throwing errors on every method rather than on the constructor.
Comment 5 Michael Catanzaro 2016-06-13 17:48:29 UTC
(In reply to Adrien Plazas from comment #4)
> OK I'll update that, but don't we loose semantic by using g_assert instead
> of its specialized variants?

Nah, it's easy enough to read an assert.

> @@ +421,3 @@
> +}
> +
> +static inline const gchar*
> 
> Even if we only use it in one place and split this code from where it is
> used for readability's sake, adding semantic via the function's name?

It's debatable, but I personally never use inline myself, except for one-line functions.
 
> I never really heard of WebSocket before, I have no idea how to handle it
> and if it requires something special. I added a "TODO".

You'll want to upgrade ws:// (web socket) connections to wss:// (web socket secure). It can be a TODO item indeed, since this is not part of the HSTS spec.

> That being said, the stored data probably should be made private as no
> method allow to change it on purpose: I wanted policies to be immutable to
> simplify their usage. Any thought on that?

Seems good to me.

> @@ +58,3 @@
> + * Copies @policy.
> + *
> + * Return value: a copy of @policy
> 
> Should we do this on constructors too then? As this is kinda like a
> constructor.

Hmmmm, you might be right; perhaps if the function name ends with "_new" then transfer full is implied, as I don't see these annotations elsewhere in libsoup. We seem to use them inconsistently within WebKit; exactly one of the webkit_web_view_new functions has the annotation.

I found no documentation of this at https://wiki.gnome.org/Projects/GObjectIntrospection/Annotations though.

> @@ +65,3 @@
> +soup_hsts_policy_copy (SoupHstsPolicy *policy)
> +{
> +	SoupHstsPolicy *copy = g_slice_new0 (SoupHstsPolicy);
> 
> I copied it from SoupCookie, should I add a FIXME here to remember checking
> this out later?

Nah, if anyone wants to change it later, they'll be grepping for g_slice_new.

> @@ +113,3 @@
> +}
> +
> +static inline const char *
> 
> Doesn't inlining a function just copies its code to its call points, saving
> the cost of function calls at the expense of a larger binary? Also doesn't
> inlining allow the compiler to perform more optimizations as the function's
> code is merged with its call point, breaking the optimization barrier that
> the function call is?
> 
> The only pessimization I can think of is that the binary will be larger but
> given how often we call this function and how small it is, I feel like
> inlining it is a good tradeoff. How can inlining a loop create a
> pessimization, it's the first time I ear about that?
> 
> Whatever your final opinion is I'll adapt to it, I just what to know more. =)

The larger binary hurts cache locality, though -- less of the program can fit in the CPU's fast data caches -- which could harm performance much more than inlining would help. Thing is that human developers can't reason about performance implications of small performance optimizations due to side effects on caching; the only way to make reasoned decisions here is with a profiler in hand.

I don't think it would affect compiler optimizations at all, since it's a static (private) function, I presume the compiler is going to inline it with or without the keyword specified if it wants to. It might even ignore your request for inlining; the keyword does trigger semantic changes but I don't think it's actually guaranteed that the function will be inlined anymore. But I am answering this off the top of my head, without looking any of this up, so take it with a grain of salt; you might be interested in researching further.

> @@ +279,3 @@
> +
> +	policy = g_slice_new0 (SoupHstsPolicy);
> +	policy->domain = g_strdup (domain);
> 
> I don't know, I usually don't like constructors returning NULL as it's
> really counter intuitive in higher level languages: for example in Vala you
> expect "new Soup.HstsPolicy (...)" to give you an object, not null. We
> should probably throw an error to give some information on why the
> construction failed instead, at least the semantic is clear even on higher
> level languages.

Hmmm, very good point. The right solution for C is to return NULL, but the right solution for Vala is to throw an exception, which in C would mean to pass a GError to the new function... that's not possible for GObjects, though, and I don't think it's common for boxed types either. I wonder what Carlos and Dan prefer.

> Another solution I like but which is more annoying to implement in C is to
> always let the construction go well and to lazily check the validity of the
> object when using it, throwing errors on every method rather than on the
> constructor.

That's more complicated. No thank you. :)
Comment 6 Adrien Plazas 2016-06-13 20:11:50 UTC
Created attachment 329721 [details] [review]
Add SoupHstsEnforcer

Here is a summary of the still open questions from the previous conversation and other notes.

Questions:
 - Does libsoup support IDNA?
 - What aobut WebSocket and HSTS, should we upgrade ws:// connections to wss://? I don't think HSTS' RFC says anything about WebSocket.
 - Should we check validity of the hostname in the SoupHstsPolicy constructors?
 - How should SoupHstsPolicy constructors fail:
    - by returning NULL and stating the return value as nullale or
    - by throwing errors?
   Returning NULL is good for C but not for language bindings and errors give reasons of the failure.
 - How to handle the ownership of the SoupDate when soup_hsts_policy_new() fails while avoinding useless copies and having no reference counting?
 - Should we use g_slice_new? 
 - What about using 'inline'?
 - Should we share SQLite related code between SoupCookieJarDb and SoupHstsEnforcerDB when possible, confining SQLite's behavior into the functions wrapping its usage?
 - Should we update SoupCookie, SoupCookieJar and SoupCookieJarDb to match what we considered good design in SoupHstsPolicy, SoupHstsEnforcer and SoupHstsEnforcerDB?
 - Should we make MATCH_NAME a (inline) function?
 - How to fix these TODO I wrote?
    - /* TODO if connection error or warnings received, do nothing. */
    - /* TODO if header received on hazardous connection, do nothing. */
 - In soup-hsts-policy.c, I usage of "goto fail" OK?
 - This code has been copied from SoupCookieJarDb to SoupHstsEnforcerDB:
    - if (sqlite3_exec (priv->db, "PRAGMA synchronous = OFF; PRAGMA secure_delete = 1;", NULL, NULL, &error)) {
    -     g_warning ("Failed to execute query: %s", error);
   Michael asked the following question and I can't anwser it:
   "This is intended to be nonfatal? It'd be a good idea to leave a one-line comment, else some future programmer might decide you just forgot and add a return TRUE here."

TODO:
 - Update soup-autocleanups.h.
 - Add tests.
 - Make fields of SoupHstsPolicy private.
 - Remove useless precondition checks in static functions.

Notes:
 - Ownership of returned values: https://wiki.gnome.org/Projects/GObjectIntrospection/Annotations Default Annotations... return values: (transfer full)
Comment 7 Claudio Saavedra 2017-08-25 08:13:30 UTC
I've tested this today with WebKitGTK+ (by adding the HSTS feature to the SoupNetworkSession) and I can load https://hsts.badssl.com/ correctly. I'd be interested in taking this work forward so that we can merge it into libsoup in the near future. Adding Dan to the CC.
Comment 8 Adrien Plazas 2017-08-28 13:36:00 UTC
(In reply to Claudio Saavedra from comment #7)
> I've tested this today with WebKitGTK+ (by adding the HSTS feature to the
> SoupNetworkSession) and I can load https://hsts.badssl.com/ correctly. I'd
> be interested in taking this work forward so that we can merge it into
> libsoup in the near future. Adding Dan to the CC.

For what it's worth, you can pursue this work, no problem. :)
Comment 9 Claudio Saavedra 2018-06-07 08:06:13 UTC
Created attachment 372589 [details] [review]
Add SoupHstsEnforcer

This is a WIP.

Add the SoupSessionFeature SoupHstsEnforcer to allow a session to
support HSTS (RFC 6797).

This also adds SoupHstsPolicy and an early version of
SoupHstsEnforcerDb.

HSTS: Rewrite the HSTS feature and add tests

This is a comprehensive rework of the HSTS enforcer and related
classes, based upon Adrien Plazas work. A summary of the most
relevant changes:

SoupHSTSEnforcer:

- The enforcer will listen on headers both on message queueing
  and restarting. This is necessary in order to be able to enforce
  HSTS redirections on messages that are restarted for whatever
  reason.
- Instead of causing a redirection, the URI will be overwritten
  directly on the message before this one is sent. Redirections
  are for use on the server side, and the tests added show that
  it is not a reliable way to do HSTS enforcing. Currently,
  the only way to find out that a HSTS policy has been enforced
  is by listening to the SoupMessage:uri property changes, but
  this might be impractical, so this could be revisited in the
  future.
- soup_hsts_enforcer_policy() will not steal the given policy.
  Doing so is prone to leaks and not customary.
- SoupHSTSEnforcerClass now has a has_valid_policy() vfunc.
  It currently works exactly as before, but the idea here is to
  make it possible for subclasses to implement their own check
  for existence of valid policies for domains, instead of all
  subclasses having to add their policies to the base
  SoupHSTSEnforcer class. This will be useful when having a large
  number of pre-loaded HSTS policies (either in SoupHSTSEnforcerDB
  or in an enforcer using libhsts as a backend) to avoid having
  potentially thousands of policies in memory at all times.
- HSTS headers are parsed using soup's available utilities,
  instead of parsing them by hand. The specification is carefully
  followed so as to not accept any header that is not fully
  compliant.

SoupHSTSEnforcerDB:

- Store the max-age attribute in the database. This was done before
  errata 5372 was reported to RFC 6797, and its necessity will
  depend on how the errata is treated.

Other:

- Added tests for both enforcer classes that cover most of the
  specification.
- Added the gtk-doc documentation and update all the documentation
  comments.
- Rename SoupHsts classes to SoupHSTS for consistent naming and
  other minor renaming of parameters and methods.
Comment 10 Claudio Saavedra 2018-06-07 08:09:33 UTC
The above patch is a squash of both Adrien's original patch and of my work on top of it. I decided to squash both to make review easier. Both patches are preserved in the hsts branch in gitlab: https://gitlab.gnome.org/GNOME/libsoup/tree/hsts

Once we get this reviewed and ready to land, I will merge the hsts branch instead of pushing this.
Comment 11 Michael Catanzaro 2018-06-08 04:18:37 UTC
Review of attachment 372589 [details] [review]:

Hm, we are on Bugzilla... did you decide not want to migrate libsoup bugs to GitLab? You have to request it from Carlos Soriano if you want it... he is not going to migrate any of the bugs unless you request it.

Anyway, this looks like high-quality code, so r=me. Excellent. Apologies for the huge review, but I wanted to give this the attention it deserves. Most of my comments are fairly minor.

::: docs/reference/Makefile.am
@@ -47,3 +47,3 @@
 	soup-proxy-resolver-wrapper.h soup-proxy-uri-resolver.h \
 	soup-cache-private.h soup-cache-client-input-stream.h \
-	soup-socket-private.h soup-value-utils.h soup-xmlrpc-old.h
+	soup-socket-private.h soup-value-utils.h soup-xmlrpc-old.h \

Meson changes are missing. (When are you planning to get rid of autotools?)

::: docs/reference/libsoup-2.4-sections.txt
@@ +1367,3 @@
+soup_hsts_policy_is_expired
+soup_hsts_policy_includes_subdomains
+soup_hsts_policy_is_permanet

This API function is misspelled! permanent

::: libsoup/soup-hsts-enforcer-db.c
@@ +59,3 @@
+
+	g_free (priv->filename);
+	g_clear_pointer (&priv->db, sqlite3_close);

Nit:

It's weird to use g_free and then g_clear_pointer. There are two competing schools of thought here:

 * g_clear_pointer is unnecessary in finalize, since finalize is guaranteed to be the final called last
 * Using g_clear_pointer in finalize can help with debugging issues, and is parallel to dispose

Anyway, whichever you prefer, it makes sense to be consistent: either wrap the call to g_free with g_clear_pointer, or call sqlite3_close directly instead of via g_clear_pointer.

@@ +126,3 @@
+#define QUERY_ALL "SELECT id, host, max_age, expiry, include_subdomains FROM soup_hsts_policies;"
+#define CREATE_TABLE "CREATE TABLE soup_hsts_policies (id INTEGER PRIMARY KEY, host TEXT UNIQUE, max_age INTEGER, expiry INTEGER, include_subdomains INTEGER)"
+#define QUERY_INSERT "INSERT OR REPLACE INTO soup_hsts_policies VALUES((SELECT id FROM soup_hsts_policies WHERE host=%Q), %Q, %d, %d, %d);"

I was a bit concerned that you're using string formats instead of prepared statements, but I guess %Q is sufficient to guard against SQL injection? OK then.

@@ +188,3 @@
+exec_query_with_try_create_table (sqlite3 *db,
+				  const char *sql,
+				  int (*callback)(void*,int,char**,char**),

I like to use typedefs for these.

@@ +198,3 @@
+		if (try_create) {
+			try_create = FALSE;
+			try_create_table (db);

Hm, I had to stare at this code for a little while to see what you're doing here. I think it would be less confusing if you got rid of the goto and the try_create variable:

        if (sqlite3_exec (db, sql, callback, argument, &error)) {
                try_create_table (db);
                sqlite3_free (error);
                error = NULL;

                if (sqlite3_exec (db, sql, callback, argument, &error)) {
                        g_warning ("Failed to execute query: %s", error);
                        sqlite3_free (error);
                }
        }

That's not really any longer, and the code flow is nice and linear, and only the condition itself has to be duplicated.

What error code gets returned if the table does not exist? I looked through the list at https://sqlite.org/rescode.html and I'm guessing there's no way to tell if the call failed because the table does not exist? I was thinking it would be better to only try creating the table in that case, but that would only be good if it's easy to do.

Also, why is this superior to adding an ensure_table_exists() function? Is there a reason that would not work as well?

@@ +259,3 @@
+
+	if (old_policy && !new_policy) {
+		query = sqlite3_mprintf (QUERY_DELETE,

I would NULL-check the result, it does not nicely crash on OOM like g_malloc() will.

::: libsoup/soup-hsts-enforcer.c
@@ +7,3 @@
+ */
+
+/* TODO Use only internationalized domain names */

Yeah... there is an example of how to do this using ICU in ephy-uri-helpers.c (ephy_uri_decode). Your code would be simpler because you wouldn't have to worry about IDN homographs (that's what Epiphany's evaluate_host_for_display() is designed to deal with).

@@ +28,3 @@
+ * A #SoupHSTSEnforcer stores HSTS policies and enforces them when
+ * required. #SoupHSTSEnforcer implements #SoupSessionFeature, so you
+ * can add a HSTS enforcer to a session with

Tricky English nit: since H starts with a vowel sound ("eitch"), write "an HSTS enforcer" rather than "a HSTS enforcer."

@@ +76,3 @@
+	hsts_enforcer->priv->host_policies = g_hash_table_new_full (soup_str_case_hash,
+								    soup_str_case_equal,
+								    g_free, NULL);

Why are you using NULL here instead of soup_hsts_policy_free()? Is there some problem with this? It would allow you to remove the entire finalize function, and a bunch of manual calls to soup_hsts_policy_free() elsewhere in this file. Definitely safer all around.

@@ +181,3 @@
+ *
+ * Creates a new #SoupHSTSEnforcer. The base #SoupHSTSEnforcer class does
+ * not support persistent storage of HSTS policies; use a subclass for

It's worth mentioning #SoupHSTSDBEnforcer again here.

@@ +198,3 @@
+			    SoupHSTSPolicy *old, SoupHSTSPolicy *new)
+{
+	g_assert_true (old || new);

g_assert_true() is intended for use in unit tests. Same goes for the g_assert_false(), g_assert_nonnull(), etc. throughout this file. It's better to use plain g_assert() instead (outside of unit tests).

@@ +212,3 @@
+		   policy is actually removed from the policies hash
+		   table, which could be problematic, or not. On the
+		   other hand, I have my doubts that the ::changed

Ummm, what's going on here? A comment questioning whether a new public API should exist in the patch that adds it seems pretty problematic. I do see SoupHSTSDBEnforcer uses the ::changed signal, after all.

Emitting the ::changed signal too soon does seem like only a minor problem, but wouldn't it be fairly simple to avoid by using GHashTableIter and g_hash_table_iter_remove()?

@@ +216,3 @@
+		*/
+		soup_hsts_enforcer_changed (enforcer, policy, NULL);
+		soup_hsts_policy_free (policy);

Remember to remove this line if you follow my recommendation to set a GDestroyNotify and get rid of finalize. (It's clear that you were pretty careful here, which is why I'm confused that you didn't use it already.)

@@ +245,3 @@
+	g_hash_table_remove (hsts_enforcer->priv->host_policies, domain);
+	soup_hsts_enforcer_changed (hsts_enforcer, policy, NULL);
+	soup_hsts_policy_free (policy);

Ditto.

@@ +272,3 @@
+	g_hash_table_replace (policies, g_strdup (domain), soup_hsts_policy_copy (new_policy));
+	if (!is_permanent && !soup_hsts_policy_equal (old_policy, new_policy))
+		soup_hsts_enforcer_changed (hsts_enforcer, old_policy, new_policy);

Why is this only called when the policy is not permanent?

@@ +273,3 @@
+	if (!is_permanent && !soup_hsts_policy_equal (old_policy, new_policy))
+		soup_hsts_enforcer_changed (hsts_enforcer, old_policy, new_policy);
+	soup_hsts_policy_free (old_policy);

Regarding GDestroyNotify, ditto.

@@ +303,3 @@
+	g_hash_table_insert (policies, g_strdup (domain), soup_hsts_policy_copy (policy));
+	if (!is_permanent)
+		soup_hsts_enforcer_changed (hsts_enforcer, NULL, policy);

Why is this only called when the policy is not permanent?

@@ +311,3 @@
+ * @policy: (transfer none): the policy of the HSTS host
+ *
+ * Sets @domain's HSTS policy to @policy. If @policy is expired, any

@domain is not a parameter. Maybe: "Add @policy to the enforcer" or "Add @policy to @hsts_enforcer"

@@ +317,3 @@
+ * is, one created with soup_hsts_policy_new_permanent(), the policy
+ * will not expire and will be enforced during the lifetime of
+ * @soup_enforcer's #SoupSession.

@hsts_enforcer

@@ +382,3 @@
+static gboolean
+soup_hsts_enforcer_host_includes_subdomains (SoupHSTSEnforcer *hsts_enforcer,
+					      const char *domain)

Extra space here

@@ +401,3 @@
+}
+
+static inline const char*

char *

@@ +408,3 @@
+	g_assert_nonnull (domain);
+
+	for (; *iter != '\0' && *iter != '.' ; iter++);

Important: the URI can be UTF-8, right? Don't you need to use g_utf8_next_char()/g_utf8_get_char() here?

@@ +409,3 @@
+
+	for (; *iter != '\0' && *iter != '.' ; iter++);
+	for (; *iter == '.' ; iter++);

You're skipping over multiple periods in a row? So URI labels can be empty?

@@ +445,3 @@
+
+	/* TODO if connection error or warnings received, do nothing. */
+	/* TODO if header received on hazardous connection, do nothing. */

Any plans for this? It might be tricky since you'd need to check TLS errors from the underlying GTlsConnection, right? And you only have access to that from SoupSocket?

@@ +512,3 @@
+
+static void
+message_restarted_cb (SoupMessage *msg, gpointer user_data)

When does this happen?

@@ +517,3 @@
+
+}
+static void

Missing blank line above here.

@@ +531,3 @@
+				     SoupMessage *msg)
+{
+	g_signal_handlers_disconnect_by_func (msg, got_sts_headers_cb, feature);

Is this supposed to be message_restarted_cb?

Or does this prevent got_sts_headers_cb from being called multiple times when the message is restarted?

::: libsoup/soup-hsts-policy.c
@@ +232,3 @@
+
+	policy = soup_hsts_policy_new (domain, 0, include_subdomains);
+	g_clear_pointer (&policy->expires, soup_date_free);

Wouldn't it be better to implement it with soup_hsts_policy_new_full() instead? Then you don't need the unnecessary allocation/deallocation of the SoupDate here.

@@ +271,3 @@
+			continue;
+
+		origin = soup_message_get_uri (msg);

In Epiphany and WebKit "origin" means security origin <protocol, host, port> specifically, not the full URI. Probably best to follow that convention and avoid the name "origin" here too?

@@ +288,3 @@
+		if (include_subdomains_value)
+			goto out;
+		/* if there are extra params, the HSTS spec demands the header to be ignored. */

Hmmm, but the spec says (section 6.1):

"5.  If an STS header field contains directive(s) not recognized by the UA, the UA MUST ignore the unrecognized directives, and if the STS header field otherwise satisfies the above requirements (1 through 4), the UA MUST process the recognized directives."

So I don't think this is right: the intent is that future specs can add new stuff here without causing libsoup to ignore the STS header. Consider how hard that would make it to update the spec in the future. I would add a test for this.

(But if the content of the header does not adhere to the expected grammar at all, then yeah the header should be ignored.)

::: tests/hsts-db-test.c
@@ +59,3 @@
+	soup_message_set_flags (msg, SOUP_MESSAGE_NO_REDIRECT);
+	soup_session_send_message (session, msg);
+	/* g_assert_cmpint (soup_uri_get_port (soup_message_get_uri (msg)), ==, soup_uri_get_port (https_uri)); */

?

::: tests/hsts-test.c
@@ +41,3 @@
+			soup_message_headers_append (msg->response_headers,
+						     "Strict-Transport-Security",
+						     "max-age=31536000");

max-age can have optional quotations. I would add a test to ensure that works:

"max-age=\"31536000\""

@@ +50,3 @@
+						     "Strict-Transport-Security",
+						     "max-age=3");
+		} else if (strcmp (path, "/delete") == 0) {

I would also test that sending a response with the STS header completely missing does not delete the policy.

@@ +54,3 @@
+						     "Strict-Transport-Security",
+						     "max-age=0");
+		} else if (strcmp (path, "/subdomains") == 0) {

Would be good to also test this bit from section 8.1.1: "The UA MUST NOT modify the expiry time or the includeSubDomains directive of any superdomain matched Known HSTS Host." Although that would require a bit more effort, since you'd really need to run a SoupServer on a subdomain to do that. But it should be easy, you can have the subdomain set max-age=0 and verify that it doesn't delete the superdomain's policy.

@@ +85,3 @@
+						     "Strict-Transport-Security",
+						     "max-age=3600; includeDomains; includeDomains");
+		} else if (strcmp (path, "/case-insensitive") == 0) {

And also worth testing that Strict-Transport-Security *is* case-sensitive. "STRICT-TRANSPORT-SECURITY" should not work.

@@ +271,3 @@
+	SoupSession *session = hsts_session_new (NULL);
+	session_get_uri (session, "https://localhost/multiple-headers", SOUP_STATUS_OK);
+	session_get_uri (session, "http://localhost/multiple-headers", SOUP_STATUS_OK);

I would also check that it's the first header that's being honored. This test would currently pass if the second header were improperly honored instead.

@@ +279,3 @@
+{
+	SoupSession *session = hsts_session_new (NULL);
+	session_get_uri (session, "http://localhost/insecure", SOUP_STATUS_OK);

How does this test work? Won't it always pass regardless of whether libsoup ignores the HSTS header?

@@ +307,3 @@
+		SoupSession *session = hsts_session_new (NULL);
+		char *uri = g_strdup_printf ("https://localhost/extra-values-%i", i);
+		session_get_uri (session, "http://localhost", SOUP_STATUS_MOVED_PERMANENTLY);

As per my earlier comment, I think this one should be SOUP_STATUS_OK.
Comment 12 Michael Catanzaro 2018-06-08 04:31:36 UTC
Review of attachment 372589 [details] [review]:

We should also at least add a TODO to implement Mitigation #1 from https://webkit.org/blog/8146/protecting-against-hsts-abuse/. I'm not quite sure how you could do that....

I guess Mitigation #2 would be a WebKit-level issue?

::: libsoup/soup-hsts-enforcer.c
@@ +354,3 @@
+
+/**
+ * soup_hsts_enforcer_set_session_policy:

A session policy is a policy that is permanent to the lifetime of the SoupSession and doesn't expire... what does this represent? It's not used anywhere currently, right? Is it to be used for pinning?
Comment 13 Michael Catanzaro 2018-06-08 04:32:25 UTC
(In reply to Michael Catanzaro from comment #12)
> Is it to be used for pinning?

I meant "preloading."
Comment 14 Michael Catanzaro 2018-06-11 20:13:27 UTC
Just for future reference:

> @@ +307,3 @@
> +		SoupSession *session = hsts_session_new (NULL);
> +		char *uri = g_strdup_printf ("https://localhost/extra-values-%i", i);
> +		session_get_uri (session, "http://localhost",
> SOUP_STATUS_MOVED_PERMANENTLY);
> 
> As per my earlier comment, I think this one should be SOUP_STATUS_OK.

Turns out it's important. I ran across this header in the wild today:

Strict-Transport-Security: max-age=31536000; includeSubDomains; preload

You cannot get your domain on a preload list unless you add preload to the header. Browsers are supposed to ignore just the unrecognized "preload" part, not the entire header.
Comment 15 Michael Catanzaro 2018-06-21 21:10:42 UTC
*** Bug 628298 has been marked as a duplicate of this bug. ***
Comment 16 Claudio Saavedra 2018-09-19 14:18:05 UTC
Review of attachment 372589 [details] [review]:

::: docs/reference/Makefile.am
@@ -47,3 +47,3 @@
 	soup-proxy-resolver-wrapper.h soup-proxy-uri-resolver.h \
 	soup-cache-private.h soup-cache-client-input-stream.h \
-	soup-socket-private.h soup-value-utils.h soup-xmlrpc-old.h
+	soup-socket-private.h soup-value-utils.h soup-xmlrpc-old.h \

Fixed this.

::: docs/reference/libsoup-2.4-sections.txt
@@ +1367,3 @@
+soup_hsts_policy_is_expired
+soup_hsts_policy_includes_subdomains
+SoupHSTSPolicy

Fixed.

::: libsoup/soup-hsts-enforcer-db.c
@@ +59,3 @@
+
+	g_free (priv->filename);
+	g_clear_pointer (&priv->db, sqlite3_close);

Noted, I've removed the g_clear_pointer() use.

@@ +126,3 @@
+#define QUERY_ALL "SELECT id, host, max_age, expiry, include_subdomains FROM soup_hsts_policies;"
+#define CREATE_TABLE "CREATE TABLE soup_hsts_policies (id INTEGER PRIMARY KEY, host TEXT UNIQUE, max_age INTEGER, expiry INTEGER, include_subdomains INTEGER)"
+#define QUERY_INSERT "INSERT OR REPLACE INTO soup_hsts_policies VALUES((SELECT id FROM soup_hsts_policies WHERE host=%Q), %Q, %d, %d, %d);"

Because we're using %Q and mprintf(), I understand that this is safe to do.

@@ +188,3 @@
+exec_query_with_try_create_table (sqlite3 *db,
+				  const char *sql,
+				  int (*callback)(void*,int,char**,char**),

Added a typedef.

@@ +198,3 @@
+		if (try_create) {
+			try_create = FALSE;
+			try_create_table (db);

From what I see this code comes from the cookies-db implementation, most of this file is based on that, and I didn't pay that much attention to it.

@@ +259,3 @@
+
+	if (old_policy && !new_policy) {
+		query = sqlite3_mprintf (QUERY_DELETE,

I've added a g_assert() here, not sure how to deal with an OOM scenario, but I guess that the assertion will do?

::: libsoup/soup-hsts-enforcer-private.h
@@ +13,3 @@
+				    SoupHSTSPolicy    *policy);
+
+#endif /* SOUP_HSTS_ENFORCER_PRIVATE_H */

This file is going away and soup_hsts_enforcer_set_policy() made public, as we need it for testing.

::: libsoup/soup-hsts-enforcer.c
@@ +28,3 @@
+ * A #SoupHSTSEnforcer stores HSTS policies and enforces them when
+ * required. #SoupHSTSEnforcer implements #SoupSessionFeature, so you
+ * can add a HSTS enforcer to a session with

Fixed.

@@ +76,3 @@
+	hsts_enforcer->priv->host_policies = g_hash_table_new_full (soup_str_case_hash,
+								    soup_str_case_equal,
+								    g_free, NULL);

IIRC it's because of the _replace_policy() method. In this method the ::changed signal is emitted, which takes the original policy and the new one as parameters. Doing this with GDestroyNotify would mean that the old policy is no longer available for the emission. I agree with you that using GDestroyNotify instead would be safer and cleaner, but then we would have to take care of this. We can change the ::changed API though.

@@ +181,3 @@
+ *
+ * Creates a new #SoupHSTSEnforcer. The base #SoupHSTSEnforcer class does
+ * not support persistent storage of HSTS policies; use a subclass for

OK.

@@ +198,3 @@
+			    SoupHSTSPolicy *old, SoupHSTSPolicy *new)
+{
+	g_assert_true (old || new);

Right.

@@ +212,3 @@
+		   policy is actually removed from the policies hash
+		   table, which could be problematic, or not. On the
+		   other hand, I have my doubts that the ::changed

Yeah, I added this comment because I am not entirely sure that the ::changed signal, as is, is good API. For one there's the problem you commented about earlier. Second, this.

@@ +272,3 @@
+	g_hash_table_replace (policies, g_strdup (domain), soup_hsts_policy_copy (new_policy));
+	if (!is_permanent && !soup_hsts_policy_equal (old_policy, new_policy))
+		soup_hsts_enforcer_changed (hsts_enforcer, old_policy, new_policy);

Okay, I think we need to improve the API here. The permanent policies are the "session policies". That is, a policy that is permanent to the lifetime of the session. The idea of these is that they can be set by the UA from a preload list, for example, and then they are enforced without expiring. This in contrast with the policies that come from servers sending the Strict-Transport-Security header. Problem 1. is that we are using both the concept of "permanent" and "session" policies, which is already confusing.

2. observation is that  session/permanent policies are the only ones that can be added programmatically using the API. Non-permanent ones come from servers that use the Strict-Transport-Security header (as soup_hsts_enforcer_set_policy() is not public but private API).

3. issue is what you point out here: We're only emitting the "changed" signal when we modify server-side policies. This might be related to the fact that these are the ones that the DB enforcer wants to act upon (the others shouldn't be stored, because they are tied to the session), but this causes issues with other potential users of this signal. I think that instead we might want to emit this signal always and let the DB enforcer decide which policies to store. That would make more sense to me.

@@ +303,3 @@
+	g_hash_table_insert (policies, g_strdup (domain), soup_hsts_policy_copy (policy));
+	if (!is_permanent)
+		soup_hsts_enforcer_changed (hsts_enforcer, NULL, policy);

Same explanation as above.

@@ +311,3 @@
+ * @policy: (transfer none): the policy of the HSTS host
+ *
+ * Sets @domain's HSTS policy to @policy. If @policy is expired, any

Fixed.

@@ +317,3 @@
+ * is, one created with soup_hsts_policy_new_permanent(), the policy
+ * will not expire and will be enforced during the lifetime of
+ * @soup_enforcer's #SoupSession.

Fixed.

@@ +354,3 @@
+
+/**
+ * soup_hsts_enforcer_set_session_policy:

Yeah, as explained above, this is something that can be used to preload HSTS policies. But we need to fix the API naming to make it less confusing.

@@ +382,3 @@
+static gboolean
+soup_hsts_enforcer_host_includes_subdomains (SoupHSTSEnforcer *hsts_enforcer,
+					      const char *domain)

Fixed.

@@ +401,3 @@
+}
+
+static inline const char*

Fixed.

@@ +408,3 @@
+	g_assert_nonnull (domain);
+
+	for (; *iter != '\0' && *iter != '.' ; iter++);

Not sure about that, at least the SoupURI code seems to just iterate similarly..

@@ +409,3 @@
+
+	for (; *iter != '\0' && *iter != '.' ; iter++);
+	for (; *iter == '.' ; iter++);

This is called for internal uris, as returned by soup_uri_get_host(), that have been queued. Shouldn't that be guarantee enough that they are valid?

@@ +445,3 @@
+
+	/* TODO if connection error or warnings received, do nothing. */
+	/* TODO if header received on hazardous connection, do nothing. */

I haven't looked into this in detail yet..

@@ +512,3 @@
+
+static void
+message_restarted_cb (SoupMessage *msg, gpointer user_data)

It happens for example if there is a redirection. HSTS-compliants HTTP servers are supposed to send a redirection, so then SoupSession restarts the message to the redirected address. This is why we need to listen to this signal.

@@ +517,3 @@
+
+}
+static void

Fixed.

@@ +531,3 @@
+				     SoupMessage *msg)
+{
+	g_signal_handlers_disconnect_by_func (msg, got_sts_headers_cb, feature);

Yeah this is wrong.. it comes from the first patch when process_sts_header was used as a callback from the request_queued function. Then again, both signals handlers are connected, so probably the safest would be to disconnect both.

::: libsoup/soup-hsts-policy.c
@@ +232,3 @@
+
+	policy = soup_hsts_policy_new (domain, 0, include_subdomains);
+	g_clear_pointer (&policy->expires, soup_date_free);

Good point.. fixed that.

@@ +271,3 @@
+			continue;
+
+		origin = soup_message_get_uri (msg);

Good point, no need to use "origin" here.

@@ +288,3 @@
+		if (include_subdomains_value)
+			goto out;
+		/* if there are extra params, the HSTS spec demands the header to be ignored. */

You are right! I misunderstood 4. I'll fix this.

::: tests/hsts-db-test.c
@@ +59,3 @@
+	soup_message_set_flags (msg, SOUP_MESSAGE_NO_REDIRECT);
+	soup_session_send_message (session, msg);
+	/* g_assert_cmpint (soup_uri_get_port (soup_message_get_uri (msg)), ==, soup_uri_get_port (https_uri)); */

Removed, we don't really need this.

::: tests/hsts-test.c
@@ +41,3 @@
+			soup_message_headers_append (msg->response_headers,
+						     "Strict-Transport-Security",
+						     "max-age=31536000");

Added one.

@@ +50,3 @@
+						     "Strict-Transport-Security",
+						     "max-age=3");
+		} else if (strcmp (path, "/delete") == 0) {

Added.

@@ +54,3 @@
+						     "Strict-Transport-Security",
+						     "max-age=0");
+		} else if (strcmp (path, "/subdomains") == 0) {

I don't think it's possible to run a subdomain locally with localhost, etc. So I manually added such a policy to the enforcer and checked that it didn't remove the superdomain's policy. We do similarly with cookies tests.

@@ +85,3 @@
+						     "Strict-Transport-Security",
+						     "max-age=3600; includeDomains; includeDomains");
+		} else if (strcmp (path, "/case-insensitive") == 0) {

From rfc7230: "Each header field consists of a case-insensitive field name followed by a colon (":"), optional leading whitespace, the field value, and optional trailing whitespace." So it should work. I added a test for that too.

@@ +271,3 @@
+	SoupSession *session = hsts_session_new (NULL);
+	session_get_uri (session, "https://localhost/multiple-headers", SOUP_STATUS_OK);
+	session_get_uri (session, "http://localhost/multiple-headers", SOUP_STATUS_OK);

Fixed this test to actually check that the first is honored and the second ignored.

@@ +279,3 @@
+{
+	SoupSession *session = hsts_session_new (NULL);
+	session_get_uri (session, "http://localhost/insecure", SOUP_STATUS_OK);

Yes, There was another get_uri() call missing. I added it.

@@ +306,3 @@
+	for (int i = 0; i < 3; i++) {
+		SoupSession *session = hsts_session_new (NULL);
+		char *uri = g_strdup_printf ("https://localhost/extra-values-%i", i);

I didn't realize that this is not being used at all, so this test is no-op. I'm fixing it.
Comment 17 Michael Catanzaro 2018-09-19 15:00:24 UTC
(In reply to Claudio Saavedra from comment #16)
> @@ +259,3 @@
> +
> +	if (old_policy && !new_policy) {
> +		query = sqlite3_mprintf (QUERY_DELETE,
> 
> I've added a g_assert() here, not sure how to deal with an OOM scenario, but
> I guess that the assertion will do?

Yes, basically the goal is to crash the application rather than continue with a NULL pointer where we expected newly-allocated memory to be. The nice thing about g_new(), g_malloc() and friends is that they crash the application if allocation fails. SQLite doesn't do that; you have to manually check every allocation to see if it succeeded, same as with vanilla C malloc().

> @@ +354,3 @@
> +
> +/**
> + * soup_hsts_enforcer_set_session_policy:
> 
> Yeah, as explained above, this is something that can be used to preload HSTS
> policies. But we need to fix the API naming to make it less confusing.

I don't understand it. :) I'd expect "session" policies to be somehow temporal: they last for the lifetime of the session and then are discarded. But here they are the more permanent ones?

> @@ +409,3 @@
> +
> +	for (; *iter != '\0' && *iter != '.' ; iter++);
> +	for (; *iter == '.' ; iter++);
> 
> This is called for internal uris, as returned by soup_uri_get_host(), that
> have been queued. Shouldn't that be guarantee enough that they are valid?

I guess. I think the loop implies that you're skipping over multiple . characters in a row, which is mildly surprising. Anyway, I don't know what's right or wrong here, it just caught my eye.
Comment 18 Claudio Saavedra 2018-09-20 07:41:35 UTC
(In reply to Michael Catanzaro from comment #17)
> (In reply to Claudio Saavedra from comment #16)
> > @@ +354,3 @@
> > +
> > +/**
> > + * soup_hsts_enforcer_set_session_policy:
> > 
> > Yeah, as explained above, this is something that can be used to preload HSTS
> > policies. But we need to fix the API naming to make it less confusing.
> 
> I don't understand it. :) I'd expect "session" policies to be somehow
> temporal: they last for the lifetime of the session and then are discarded.
> But here they are the more permanent ones?

The policies that come from a server implementing the HSTS directives will have a max-age directive with a value. That is, those are not *permanent*. They will expire eventually. They can be *persistent* if the client is using the DB enforcer, that is, they can be saved to disk, and reloaded in a new session, but they are still not permanent, because they will expire at some point or another.

The ones that you can create with the above method don't expire and will be enforced for as long as the session exists. So a client can preload rules for known sites with them.

Is this still confusing? Yes, I definitely agree.
Comment 19 Claudio Saavedra 2018-09-20 14:54:46 UTC
Discussion on this will continue in gitlab, in this merge request: https://gitlab.gnome.org/GNOME/libsoup/merge_requests/22
Comment 20 GNOME Infrastructure Team 2018-09-21 16:25:30 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/libsoup/issues/93.