GNOME Bugzilla – Bug 535888
gif reader fails when LZW code is split across three datablocks
Last modified: 2008-06-01 15:14:36 UTC
gif-load.c assumes that if it has not enough bits in the buffer to read the next LZW code it is sufficient to read one datablock to get the desired amount of bits. This is not true, since a it is legal to have a datablock with only one eight bits. (it is however difficult to imagine why a .gif encoder would do such a thing) I will add a .gif file that demonstrates the problem and a simple patch to fix it.
Created attachment 111836 [details] [review] fixes bug
Created attachment 111837 [details] doens't load properly in current svn
Thanks for the patch! I can confirm that it is necessary to have the patch applied in order to properly load the attached gif. I don't fully understand how or why the patch fixes the problem though, could you please elaborate? Are you sure it will not break loading other type of gifs?
I will try: gif pixel data is stored in LZW codes those codes are between 3 and 12 bits long. Here is how the codes are packed into bytes: (from right to left) byte n byte 5 byte 4 byte 3 byte 2 byte 1 MSB LSB MSB LSB MSB LSB MSB LSB MSB LSB +-.....-----+--------+--------+--------+--------+--------+ | and so on |hhhhhggg|ggfffffe|eeeedddd|dcccccbb|bbbaaaaa| +-.....-----+--------+--------+--------+--------+--------+ (this illustration is stolen from the gif spec: gif87a.txt) So aaaaa is the first LZW code, bbbbb the second, etc. Those bytes are then split in blocks. A block contains at most 255 data bytes. Each block is prefixed with it's length in the gif file. So when reading a .gif file, gif-load.c calls GetDataBlock to read the first datablock into a buffer and starts outputting LZW codes. When it detects the end of the data block it moves the last two bytes to the start of the buffer and reads (at the position of the third byte) the next datablock. And then it continues to read the bits. Now the following can occur, suppose we have LZW codes of 11 bits. byte n+1 byte n byte n-1 MSB LSB MSB LSB MSB LSB ----+--------+--------+--------+----- cccc|cccccccb|bbbbbbbb|bbaaaaaa|aaaaa ----+--------+--------+--------+----- block M + 1 |block M | block M - 1 So block M - 1 contains byte n-1, byte n-2 etc, block M only contains one byte, namely byte n, and block M + 1 contains byte n+1, byte n+2 etc. The LZW reader has just read code aaaaaaaaaaa, and wants to read bbbbbbbbbbb. It detects that it needs to load block M (this is what the if() statement referred to in my patch does, it says "if the buffer has not enough bits, read the next datablock"). But once the next block is read, there are still not enough bits in the buffer (the last bit of bbbbbbbbbbb is in block M + 1, gif-load.c in svn does not detect this, and happily goes on reading 11 bits from the buffer (in this case the last bit of bbbbbbbbbbb which is in block M + 1 will be garbage (it will be the LSB of the second byte of block M - 1), in the case of test.gif this bit is a 0 while it should be a 1, hence the corruption). By replacing the if() with a while() I make the code re-check if there are enough bits after reading block M. The vast majority of .gif's do not have datablocks of only one byte, these .gifs will always satisfy the while() loop after at most one iteration (so the while() acts like an if()). Therefore the patch will not break loading of other .gifs. In the example, after reading block M, it is detected that there are now 10 bits in the buffer (so that one bit is missing) and block + M is loaded into the buffer (while preserving the last two bytes of the previous contents of the buffer) and code bbbbbbbbbbb can now be read completely. I hope this explanation is clear. If you have questions, please ask.
Ok fair enough, you obviously know what you're talking about :) Applied to trunk (rev 25876) and gimp-2-5 (rev 25877): 2008-06-01 Martin Nordholts <martinn@svn.gnome.org> * plug-ins/common/gif-load.c (GetCode): Applied patch from Rik Snel that fixes loading of .gif files that contains 1-byte data blocks. (Bug #535888.)
gimp-2-4*