Bug 694843 – g_base64_decode_step () produces invalid data

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 694843 - g_base64_decode_step () produces invalid data


Summary:	g_base64_decode_step () produces invalid data


Status:	RESOLVED FIXED

Product:	glib
Classification:	Platform
Component:	general
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gtkdev
QA Contact:	gtkdev

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2013-02-28 08:53 UTC by Ognyan Tonchev (redstar_)
Modified:	2013-03-05 15:57 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
a C program illustrating the problem (869 bytes, text/x-csrc) 2013-02-28 08:54 UTC, Ognyan Tonchev (redstar_)		Details
Remember if we got padding character in the previous sequence (1.67 KB, patch) 2013-02-28 13:29 UTC, Ognyan Tonchev (redstar_)	none	Details \| Review
Remember if we got padding character in the previous sequence (3.66 KB, patch) 2013-02-28 15:56 UTC, Ognyan Tonchev (redstar_)	none	Details \| Review
Remember if we got padding character in the previous sequence (3.56 KB, patch) 2013-02-28 16:20 UTC, Ognyan Tonchev (redstar_)	none	Details \| Review
Remember if we got padding character in the previous sequence (2.73 KB, patch) 2013-03-01 14:07 UTC, Ognyan Tonchev (redstar_)	none	Details \| Review
0001-base64-Add-tests-for-incremental-decoding-with-very-.patch (2.70 KB, patch) 2013-03-01 18:33 UTC, Colin Walters	accepted-commit_now	Details \| Review
0002-base64-Fix-g_base64_decode_step.patch (2.08 KB, patch) 2013-03-01 18:33 UTC, Colin Walters	accepted-commit_now	Details \| Review

Description Ognyan Tonchev (redstar_) 2013-02-28 08:53:54 UTC

Decoding "Cg==" in a single step produces correct data: 0x0a
But executing the function twice for "Cg=" and "=" will add an extra \0: 0x0a,0x00
I am attaching also a small program which illustrates the problem.

Comment 1 Ognyan Tonchev (redstar_) 2013-02-28 08:54:43 UTC

Created attachment 237582 [details]
a C program illustrating the problem

Comment 2 Ognyan Tonchev (redstar_) 2013-02-28 13:13:28 UTC

The main problem is that g_base64_decode_step () does not remember if it got padding character in the previous sequence. Submitting a patch which solves the problem.

Comment 3 Ognyan Tonchev (redstar_) 2013-02-28 13:29:42 UTC

Created attachment 237596 [details] [review]
Remember if we got padding character in the previous sequence

Comment 4 Colin Walters 2013-02-28 13:34:17 UTC

Review of attachment 237596 [details] [review]:

Can you add a test case for this to glib/tests/gbase64.c ?

::: glib/gbase64.c
@@ +345,3 @@
+
+  /* check if we got padding character in the previous sequence */
+  if (i < 0) {

Braces on a new line - GLib is GNU style (just look at the rest of the code).

Comment 5 Ognyan Tonchev (redstar_) 2013-02-28 15:56:53 UTC

Created attachment 237620 [details] [review]
Remember if we got padding character in the previous sequence

added new test to glib/tests/base64.c and fixed indentation

Comment 6 Ognyan Tonchev (redstar_) 2013-02-28 16:20:36 UTC

Created attachment 237626 [details] [review]
Remember if we got padding character in the previous sequence

Updated the new test, the test program is now obsolete

Comment 7 Ognyan Tonchev (redstar_) 2013-03-01 14:07:26 UTC

Created attachment 237709 [details] [review]
Remember if we got padding character in the previous sequence

Yesterday i was running some stab tests where base64 encoded data is exchanged over a network (glib used on both sides) and discovered a problem in my patch.
The 'pad' variable which indicated whether we got a padding character was not unset after a group of 4 base64 bytes was successfully converted to 3 bytes. Thus, one byte was always 'lost' from the new group directly following the padding characters. For example, 'Cg==QUJD' was converted to '\nAC' where the correct result should be '\nABC'. This also showed that the proposed unit test was not good enough.

I have rewritten my patch and the diff to the original version of the code is much smaller now. No new variables, put bck the ones which were removed, just using the state to remember whether we got padding character in the previous sequence. Also improved the unit test for g_base64_decode_step ().

Sorry for my omission when i was submitting the initial version of the patch.

Comment 8 Colin Walters 2013-03-01 18:32:45 UTC

The base64 decoding bits are a bit hairy; this updated patch looks right to me.

However your added test case didn't make much sense to me.  I've made a new one; does it look better to you as well?

Comment 9 Colin Walters 2013-03-01 18:33:06 UTC

Created attachment 237728 [details] [review]
0001-base64-Add-tests-for-incremental-decoding-with-very-.patch

Comment 10 Colin Walters 2013-03-01 18:33:23 UTC

Created attachment 237729 [details] [review]
0002-base64-Fix-g_base64_decode_step.patch

Comment 11 Colin Walters 2013-03-01 18:34:37 UTC

Note I've revised your patch so that it just uncomments the test cases that would fail, and now pass.

Comment 12 Ognyan Tonchev (redstar_) 2013-03-04 08:55:42 UTC

Yes, the test definitely looks better now. If i may suggest a small change to your patch:

--- glib/tests/base64.c	2013-03-04 09:51:59.230209350 +0100
+++ glib/tests/base64.c.changed	2013-03-04 09:51:18.201593885 +0100
@@ -337,14 +337,14 @@
 {
   const guint blocksize = GPOINTER_TO_UINT (blocksize_p);
   guint i;
+  gint state = 0;
+  guint save = 0;
 
   for (i = 0; ok_100_encode_strs[i]; i++)
     {
       const char *str = ok_100_encode_strs[i];
       const char *p;
       gsize len = strlen (str);
-      gint state = 0;
-      guint save = 0;
       guchar *decoded;
       gsize decoded_size = 0;
       guchar *decoded_atonce;

Comment 13 Colin Walters 2013-03-04 15:47:30 UTC

(In reply to comment #12)
> Yes, the test definitely looks better now. If i may suggest a small change to
> your patch:
> 
> --- glib/tests/base64.c    2013-03-04 09:51:59.230209350 +0100
> +++ glib/tests/base64.c.changed    2013-03-04 09:51:18.201593885 +0100
> @@ -337,14 +337,14 @@
>  {
>    const guint blocksize = GPOINTER_TO_UINT (blocksize_p);
>    guint i;
> +  gint state = 0;
> +  guint save = 0;

Hmm, I don't see how that change is correct because we need to provide "state" and "save" zero-initialized to *each* call to g_base64_decode.

Comment 14 Ognyan Tonchev (redstar_) 2013-03-04 16:13:06 UTC

"state" will be set to zero in g_base64_decode_step () after each 4 base64 byte sequence, so this change won't break anything if the function is correct. Moreover, that way we test also decoding base64 encoded data after 4 bytes long chunk with padding characters. It  is fairly Ok to do:

g_base64_decode_step ("ab==", &state, &save);
g_base64_decode_step ("cdef", &state, &save);

Comment 15 Dan Winship 2013-03-04 16:25:08 UTC

(In reply to comment #14)
> It  is fairly Ok to do:
> 
> g_base64_decode_step ("ab==", &state, &save);
> g_base64_decode_step ("cdef", &state, &save);

no, it's not. that's invalid base64, so there are no guarantees on what output bytes the function will produce at that point

Comment 16 Ognyan Tonchev (redstar_) 2013-03-05 06:43:53 UTC

ouch, ok then, i thought the intent was to support that case too. Now i see why my test didn't make sense to you. Skip my comment.

Comment 17 Colin Walters 2013-03-05 15:14:36 UTC

Review of attachment 237729 [details] [review]:

Makes sense to me.

Comment 18 Colin Walters 2013-03-05 15:15:29 UTC

Ognyan, can you mark the test patch as accepted-commit-now if you agree it makes sense?

Comment 19 Ognyan Tonchev (redstar_) 2013-03-05 15:28:41 UTC

Review of attachment 237728 [details] [review]:

The patch looks good!

Comment 20 Colin Walters 2013-03-05 15:57:11 UTC

https://git.gnome.org/browse/glib/commit/?id=27b19cee1bb5007f9dd123e171bcf3f978263f15
https://git.gnome.org/browse/glib/commit/?id=06a59f889a8d3c8a63622af64d253632a0530017