After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 646896 - SoupRequestData should unescape base64 URIs before trying to decode them
SoupRequestData should unescape base64 URIs before trying to decode them
Status: RESOLVED FIXED
Product: libsoup
Classification: Core
Component: Misc
unspecified
Other Linux
: Normal normal
: ---
Assigned To: libsoup-maint@gnome.bugs
libsoup-maint@gnome.bugs
Depends on:
Blocks:
 
 
Reported: 2011-04-06 09:48 UTC by Sergio Villar
Modified: 2011-04-07 18:46 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch (1.66 KB, patch)
2011-04-06 09:49 UTC, Sergio Villar
needs-work Details | Review
Patch (4.15 KB, patch)
2011-04-07 08:22 UTC, Sergio Villar
accepted-commit_now Details | Review

Description Sergio Villar 2011-04-06 09:48:19 UTC
If a client creates a SoupURI by providing a base64 data: URL with some whitespaces, then SoupURI will encode them as %20. We must decode those before decoding the base64 data.

Although whitespaces do not belong to the base64 alphabet they're sometimes inserted at the beginning of each line of the base64 data [1].

[1] http://www.w3.org/Graphics/SVG/Test/20061213/htmlObjectHarness/basic-filters-blend-01-b.html
Comment 1 Sergio Villar 2011-04-06 09:49:17 UTC
Created attachment 185274 [details] [review]
Patch
Comment 2 Dan Winship 2011-04-06 11:46:32 UTC
Comment on attachment 185274 [details] [review]
Patch

I don't see any data: URIs on the page you linked to?

Can you add a test case or two to tests/uri-parsing.c to cover these changes?

>-	uristr = soup_uri_to_string (uri, FALSE);
>-	start = uristr + 5;
>+	/* We know is a data: URL so just get the path, this will save
>+	 * us a soup_uri_to_string().
>+	 */
>+	uristr = soup_uri_decode (uri->path);

Although technically incorrect, this:

    data:,foo?bar#baz

shows up as "foo?bar#baz" in Firefox, but would turn into just "foo" with your patch. (I think there might even be WebKit tests for this behavior?)

(That is an edge case though... maybe it would be more efficient to check if uri->query and uri->fragment are NULL, and use uri->path in that case?)

Also, in theory, there could be %-encoded "," ";" or "=" in the URI, so you have to split the string into the proper pieces (like it does now) before decoding anything.
Comment 3 Sergio Villar 2011-04-06 17:20:21 UTC
(In reply to comment #2)
> (From update of attachment 185274 [details] [review])
> I don't see any data: URIs on the page you linked to?

Then you have to take a closer look :). It's actually inside of one of the svg's of the page
 
> Can you add a test case or two to tests/uri-parsing.c to cover these changes?

Sure, makes sense

> >-	uristr = soup_uri_to_string (uri, FALSE);
> >-	start = uristr + 5;
> >+	/* We know is a data: URL so just get the path, this will save
> >+	 * us a soup_uri_to_string().
> >+	 */
> >+	uristr = soup_uri_decode (uri->path);
> 
> Although technically incorrect, this:
> 
>     data:,foo?bar#baz
> 
> shows up as "foo?bar#baz" in Firefox, but would turn into just "foo" with your
> patch. (I think there might even be WebKit tests for this behavior?)
> 
> (That is an edge case though... maybe it would be more efficient to check if
> uri->query and uri->fragment are NULL, and use uri->path in that case?)

Yep, fair enough.

> Also, in theory, there could be %-encoded "," ";" or "=" in the URI, so you
> have to split the string into the proper pieces (like it does now) before
> decoding anything.

oh true, good point.
Comment 4 Sergio Villar 2011-04-07 08:22:44 UTC
Created attachment 185397 [details] [review]
Patch

New version of the patch:

* parses the string before decoding
* uses g_base64_decode_inplace to save one string copy
* created a soup-uri-private.h to allow the request data to use uri_decoded_copy
* nice side effect -> WebKitGtk+ passes acid2 pixel tests again :)
Comment 5 Dan Winship 2011-04-07 12:01:39 UTC
Comment on attachment 185397 [details] [review]
Patch

>Subject: [PATCH] fix for the bug
>
>---

fix the commit message before committing obviously

> 	soup-uri.c			\
>+	soup-uri-private.h \
> 	soup-value-utils.c		\

indent that "\" with the others. (using tabs)

>+			buf = g_base64_decode_inplace ((gchar*) buf, &data->priv->content_length);

ah, nice. I guess that didn't exist when we first wrote the code?

>+#define SOUP_URI_PRIVATE_H 1
>+
>+G_BEGIN_DECLS

You don't need G_BEGIN_DECLS/G_END_DECLS in private headers. (They're only needed for headers that may be included from C++.)
Comment 6 Sergio Villar 2011-04-07 16:27:31 UTC
(In reply to comment #5)
> (From update of attachment 185397 [details] [review])
> >+			buf = g_base64_decode_inplace ((gchar*) buf, &data->priv->content_length);
> 
> ah, nice. I guess that didn't exist when we first wrote the code?

According to the docs it has been there since 2.20, so might be we overlooked it

> >+#define SOUP_URI_PRIVATE_H 1
> >+
> >+G_BEGIN_DECLS
> 
> You don't need G_BEGIN_DECLS/G_END_DECLS in private headers. (They're only
> needed for headers that may be included from C++.)

Yeah :)
Comment 7 Sergio Villar 2011-04-07 18:46:44 UTC
committed 48acff9e9e764931d3ea4f1119f4522451de3706