GNOME Bugzilla – Bug 763377
gcab corrupts identical files on extraction
Last modified: 2016-03-09 17:26:43 UTC
Created attachment 323518: cab1.cab

Gcab is unable to extract identical files that share compressed data. The result is that one or more of the files are extracted with corrupted contents.

Attached is a cab archive that exhibits the problem. It was created by WiX, which apparently optimises the storage of files with identical contents by making them share compressed data. As far as I can tell, each group of same-content files will begin at offset 0 in the same folder for obvious reasons; this can probably be relied upon in case it would make the code simpler.

The attached cab contains two identical files, frn1.pdf and frn2.pdf (a public domain document, by the way). Extracting them with gcab will result in one of them being corrupt. Other tools extract the files without error.

This bug may be related to bug 763376, or at least have related solutions.
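To illustrate the failure mode described in this report, here is a minimal Python sketch. It is not gcab's code: `FileEntry`, `extract_forward_only`, and the sample bytes are hypothetical, and the folder is modelled as already-uncompressed data. It assumes an extractor that only ever moves forward through the folder stream, which is one plausible way for two entries sharing offset 0 to end up with different contents.

```python
import io
from dataclasses import dataclass


@dataclass
class FileEntry:
    name: str
    offset: int  # start of the file within the folder's uncompressed data
    size: int    # uncompressed size in bytes


def extract_forward_only(folder_data, entries):
    """Hypothetical naive extractor: reads sequentially and never seeks back."""
    stream = io.BytesIO(folder_data)
    out = {}
    for entry in entries:
        skip = entry.offset - stream.tell()
        if skip > 0:
            stream.read(skip)  # skip forward to the entry's start
        # If skip < 0 the entry starts behind the cursor; this reader ignores
        # that and simply reads whatever comes next in the stream.
        out[entry.name] = stream.read(entry.size)
    return out


payload = b"%PDF-fake-content"            # stand-in for the shared document
folder = payload + b"unrelated trailing data in the same folder"
entries = [
    FileEntry("frn1.pdf", 0, len(payload)),
    FileEntry("frn2.pdf", 0, len(payload)),  # same offset: shared data
]
result = extract_forward_only(folder, entries)
```

In this model the first file comes out intact, while the second receives the bytes that follow it in the folder.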
The following fix has been pushed: 616e77a extract: learn to rewind if needed
Created attachment 323524: extract: learn to rewind if needed

In some cases, files may point to a previously treated offset. Let's rewind in this case.
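The rewind idea in this patch description can be sketched as follows. This is only an illustration under stated assumptions, not the pushed gcab change: `FolderReader`, `FileEntry`, and `extract` are hypothetical names, and plain zlib stands in for the cabinet's real compression. When an entry's offset lies behind the current position, the reader restarts decompression from the beginning of the folder and skips forward again.

```python
import zlib
from dataclasses import dataclass


@dataclass
class FileEntry:
    name: str
    offset: int  # start of the file within the folder's uncompressed data
    size: int    # uncompressed size in bytes


class FolderReader:
    """Hypothetical sequential reader over a compressed folder."""

    def __init__(self, compressed):
        self._compressed = compressed
        self.rewind()

    def rewind(self):
        # Restart decompression from the beginning of the folder.
        self._inflater = zlib.decompressobj()
        self._pending = self._compressed
        self._buffer = b""
        self.pos = 0

    def read(self, n):
        # Feed compressed input in small chunks until n bytes are available.
        while len(self._buffer) < n and self._pending:
            chunk, self._pending = self._pending[:64], self._pending[64:]
            self._buffer += self._inflater.decompress(chunk)
        data, self._buffer = self._buffer[:n], self._buffer[n:]
        self.pos += len(data)
        return data


def extract(reader, entries):
    out = {}
    for entry in entries:
        if entry.offset < reader.pos:
            reader.rewind()  # entry points at data that was already consumed
        reader.read(entry.offset - reader.pos)  # skip forward to the entry
        out[entry.name] = reader.read(entry.size)
    return out


payload = b"same bytes " * 100
other = b"different " * 50
reader = FolderReader(zlib.compress(payload + other))
files = extract(reader, [
    FileEntry("frn1.pdf", 0, len(payload)),
    FileEntry("frn2.pdf", 0, len(payload)),  # shares data with frn1.pdf
    FileEntry("other.bin", len(payload), len(other)),
])
```

Restarting from offset 0 costs a second decompression pass, but it keeps the reader strictly sequential, which fits the report's observation that shared files begin at offset 0 of their folder.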
(In reply to Mattias Engdegård from comment #0)
> The attached cab contains two identical files, frn1.pdf and frn2.pdf (a
> public domain document, by the way).

interesting reading ;)
(In reply to Marc-Andre Lureau from comment #1)
> The following fix has been pushed:
> 616e77a extract: learn to rewind if needed

Thank you!