GNOME Bugzilla – Bug 646896
SoupRequestData should unescape base64 URIs before trying to decode them
Last modified: 2011-04-07 18:46:44 UTC
If a client creates a SoupURI by providing a base64 data: URL with some whitespaces, then SoupURI will encode them as %20. We must decode those before decoding the base64 data. Although whitespaces do not belong to the base64 alphabet they're sometimes inserted at the beginning of each line of the base64 data [1]. [1] http://www.w3.org/Graphics/SVG/Test/20061213/htmlObjectHarness/basic-filters-blend-01-b.html
Created attachment 185274 [details] [review] Patch
Comment on attachment 185274 [details] [review] Patch I don't see any data: URIs on the page you linked to? Can you add a test case or two to tests/uri-parsing.c to cover these changes? >- uristr = soup_uri_to_string (uri, FALSE); >- start = uristr + 5; >+ /* We know is a data: URL so just get the path, this will save >+ * us a soup_uri_to_string(). >+ */ >+ uristr = soup_uri_decode (uri->path); Although technically incorrect, this: data:,foo?bar#baz shows up as "foo?bar#baz" in Firefox, but would turn into just "foo" with your patch. (I think there might even be WebKit tests for this behavior?) (That is an edge case though... maybe it would be more efficient to check if uri->query and uri->fragment are NULL, and use uri->path in that case?) Also, in theory, there could be %-encoded "," ";" or "=" in the URI, so you have to split the string into the proper pieces (like it does now) before decoding anything.
(In reply to comment #2) > (From update of attachment 185274 [details] [review]) > I don't see any data: URIs on the page you linked to? Then you have to take a closer look :). It's actually inside of one of the svg's of the page > Can you add a test case or two to tests/uri-parsing.c to cover these changes? Sure, makes sense > >- uristr = soup_uri_to_string (uri, FALSE); > >- start = uristr + 5; > >+ /* We know is a data: URL so just get the path, this will save > >+ * us a soup_uri_to_string(). > >+ */ > >+ uristr = soup_uri_decode (uri->path); > > Although technically incorrect, this: > > data:,foo?bar#baz > > shows up as "foo?bar#baz" in Firefox, but would turn into just "foo" with your > patch. (I think there might even be WebKit tests for this behavior?) > > (That is an edge case though... maybe it would be more efficient to check if > uri->query and uri->fragment are NULL, and use uri->path in that case?) Yep, fair enough. > Also, in theory, there could be %-encoded "," ";" or "=" in the URI, so you > have to split the string into the proper pieces (like it does now) before > decoding anything. oh true, good point.
Created attachment 185397 [details] [review] Patch New version of the patch: * parses the string before decoding * uses g_base64_decode_inplace to save one string copy * created a soup-uri-private.h to allow the request data to use uri_decoded_copy * nice side effect -> WebKitGtk+ passes acid2 pixel tests again :)
Comment on attachment 185397 [details] [review] Patch >Subject: [PATCH] fix for the bug > >--- fix the commit message before committing obviously > soup-uri.c \ >+ soup-uri-private.h \ > soup-value-utils.c \ indent that "\" with the others. (using tabs) >+ buf = g_base64_decode_inplace ((gchar*) buf, &data->priv->content_length); ah, nice. I guess that didn't exist when we first wrote the code? >+#define SOUP_URI_PRIVATE_H 1 >+ >+G_BEGIN_DECLS You don't need G_BEGIN_DECLS/G_END_DECLS in private headers. (They're only needed for headers that may be included from C++.)
(In reply to comment #5) > (From update of attachment 185397 [details] [review]) > >+ buf = g_base64_decode_inplace ((gchar*) buf, &data->priv->content_length); > > ah, nice. I guess that didn't exist when we first wrote the code? According to the docs it has been there since 2.20, so might be we overlooked it > >+#define SOUP_URI_PRIVATE_H 1 > >+ > >+G_BEGIN_DECLS > > You don't need G_BEGIN_DECLS/G_END_DECLS in private headers. (They're only > needed for headers that may be included from C++.) Yeah :)
committed 48acff9e9e764931d3ea4f1119f4522451de3706