After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 535888 - gif reader fails when LZW code is split across three datablocks
gif reader fails when LZW code is split across three datablocks
Status: RESOLVED FIXED
Product: GIMP
Classification: Other
Component: Plugins
git master
Other All
: Normal normal
: 2.4
Assigned To: GIMP Bugs
GIMP Bugs
Depends on:
Blocks:
 
 
Reported: 2008-05-31 09:41 UTC by Rik Snel
Modified: 2008-06-01 15:14 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
fixes bug (312 bytes, patch)
2008-05-31 09:42 UTC, Rik Snel
committed Details | Review
doens't load properly in current svn (113.16 KB, image/gif)
2008-05-31 09:43 UTC, Rik Snel
  Details

Description Rik Snel 2008-05-31 09:41:12 UTC
gif-load.c assumes that if it has not enough bits in the buffer to read the next LZW code it is sufficient to read one datablock to get the desired amount of bits.

This is not true, since a it is legal to have a datablock with only one eight bits. (it is however difficult to imagine why a .gif encoder would do such a thing)

I will add a .gif file that demonstrates the problem and a simple patch to fix it.
Comment 1 Rik Snel 2008-05-31 09:42:26 UTC
Created attachment 111836 [details] [review]
fixes bug
Comment 2 Rik Snel 2008-05-31 09:43:15 UTC
Created attachment 111837 [details]
doens't load properly in current svn
Comment 3 Martin Nordholts 2008-06-01 04:57:28 UTC
Thanks for the patch!

I can confirm that it is necessary to have the patch applied in order to properly load the attached gif.

I don't fully understand how or why the patch fixes the problem though, could you please elaborate? Are you sure it will not break loading other type of gifs?
Comment 4 Rik Snel 2008-06-01 07:39:46 UTC
I will try:

gif pixel data is stored in LZW codes those codes are between 3 and 12 bits long. Here is how the codes are packed into bytes: (from right to left)

 byte n       byte 5   byte 4   byte 3   byte 2   byte 1
              MSB  LSB MSB  LSB MSB  LSB MSB  LSB MSB  LSB
 +-.....-----+--------+--------+--------+--------+--------+
 | and so on |hhhhhggg|ggfffffe|eeeedddd|dcccccbb|bbbaaaaa|
 +-.....-----+--------+--------+--------+--------+--------+
(this illustration is stolen from the gif spec: gif87a.txt)

So aaaaa is the first LZW code, bbbbb the second, etc. 

Those bytes are then split in blocks. A block contains at most 255 data bytes.
Each block is prefixed with it's length in the gif file.

So when reading a .gif file, gif-load.c calls GetDataBlock to read the first
datablock into a buffer and starts outputting LZW codes. When it detects the end of the data block it moves the last two bytes to the start of the buffer and reads (at the position of the third byte) the next datablock. And then it continues to read the bits.

Now the following can occur, suppose we have LZW codes of 11 bits.
      byte n+1 byte n   byte n-1
      MSB  LSB MSB  LSB MSB  LSB
 ----+--------+--------+--------+-----
 cccc|cccccccb|bbbbbbbb|bbaaaaaa|aaaaa 
 ----+--------+--------+--------+-----
  block M + 1 |block M |  block M - 1

So block M - 1 contains byte n-1, byte n-2 etc, block M only contains one byte, namely byte n, and block M + 1 contains byte n+1, byte n+2 etc.

The LZW reader has just read code aaaaaaaaaaa, and wants to read bbbbbbbbbbb.
It detects that it needs to load block M (this is what the if() statement referred to in my patch does, it says "if the buffer has not enough bits, read the next datablock"). But once the next block is read, there are still not enough bits in the buffer (the last bit of bbbbbbbbbbb is in block M + 1, gif-load.c in svn does not detect this, and happily goes on reading 11 bits from the buffer (in this case the last bit of bbbbbbbbbbb which is in block M + 1 will be garbage (it will be the LSB of the second byte of block M - 1), in the case of test.gif this bit is a 0 while it should be a 1, hence the corruption).

By replacing the if() with a while() I make the code re-check if there are enough bits after reading block M. The vast majority of .gif's do not have datablocks of only one byte, these .gifs will always satisfy the while() loop after at most one iteration (so the while() acts like an if()). Therefore the patch will not break loading of other .gifs.

In the example, after reading block M, it is detected that there are now 10 bits in the buffer (so that one bit is missing) and block + M is loaded into the buffer (while preserving the last two bytes of the previous contents of the buffer) and code bbbbbbbbbbb can now be read completely.

I hope this explanation is clear. If you have questions, please ask.
Comment 5 Martin Nordholts 2008-06-01 15:11:30 UTC
Ok fair enough, you obviously know what you're talking about :)

Applied to trunk (rev 25876) and gimp-2-5 (rev 25877):

2008-06-01  Martin Nordholts  <martinn@svn.gnome.org>

	* plug-ins/common/gif-load.c (GetCode): Applied patch from Rik
	Snel that fixes loading of .gif files that contains 1-byte data
	blocks. (Bug #535888.)
Comment 6 Martin Nordholts 2008-06-01 15:14:36 UTC
gimp-2-4*