Bug 763377 - gcab corrupts identical files on extraction
Status: RESOLVED FIXED
Product: msitools
Classification: Other
Component: gcab
Version: 0.95
Hardware: Other Linux
Importance: Normal normal
Target Milestone: 1.0
Assigned To: msitools maintainer(s)
QA Contact: msitools maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2016-03-09 15:30 UTC by Mattias Engdegård
Modified: 2016-03-09 17:26 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
cab1.cab (50.01 KB, application/vnd.ms-cab-compressed)
2016-03-09 15:30 UTC, Mattias Engdegård
extract: learn to rewind if needed (1.18 KB, patch)
2016-03-09 17:08 UTC, Marc-Andre Lureau
committed

Description Mattias Engdegård 2016-03-09 15:30:05 UTC
Created attachment 323518
cab1.cab

Gcab is unable to extract identical files that share compressed data. The result is that one or more of the files are extracted with corrupted contents.

Attached is a cab archive that exhibits the problem. It was created by WiX, which apparently optimises the storage of files with identical contents by making them share compressed data. As far as I can tell, each group of same-content files will begin at offset 0 in the same folder for obvious reasons; this can probably be relied upon in case it would make the code simpler.

The attached cab contains two identical files, frn1.pdf and frn2.pdf (a public domain document, by the way). Extracting them with gcab will result in one of them being corrupt. Other tools extract the files without error.
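One way such corruption can arise is sketched below in Python. This is a toy model and not gcab's actual code: the function name `extract_forward_only`, the `(name, offset, size)` entry tuples, and the sample bytes are all made up for illustration, and the folder is treated as already-decompressed data. It assumes an extractor that only ever reads forward through the folder, so a second file whose offset points back to 0 gets whatever bytes happen to come next.

```python
import io


def extract_forward_only(folder_data, entries):
    """Toy extractor that never seeks backwards (hypothetical, not gcab's code).

    folder_data: the decompressed bytes of one folder.
    entries: list of (name, offset, size) tuples in archive order.
    """
    stream = io.BytesIO(folder_data)
    extracted = {}
    for name, offset, size in entries:
        position = stream.tell()
        if offset > position:
            # Skip forward to the start of this file.
            stream.read(offset - position)
        # If offset < position, nothing happens here, so the read below
        # starts at the wrong place. That is the failure being modelled.
        extracted[name] = stream.read(size)
    return extracted


# Two entries sharing the same data at offset 0, as in the attached archive.
folder = b"PDFDATA" + b"tail"
entries = [("frn1.pdf", 0, 7), ("frn2.pdf", 0, 7)]
result = extract_forward_only(folder, entries)
print(result["frn1.pdf"])  # the intended contents
print(result["frn2.pdf"])  # bytes read from past the shared data
```

In this model the first file comes out intact and the second one does not, which matches the symptom described, though the real cause inside gcab may differ in detail.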

This bug may be related to bug 763376, or at least have related solutions.
Comment 1 Marc-Andre Lureau 2016-03-09 17:07:59 UTC
The following fix has been pushed:
616e77a extract: learn to rewind if needed
Comment 2 Marc-Andre Lureau 2016-03-09 17:08:04 UTC
Created attachment 323524
extract: learn to rewind if needed

In some cases, files may point to a previously processed offset. Let's rewind
in this case.
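The rewind idea can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the content of the committed patch: `extract_with_rewind` and the `(name, offset, size)` tuples are hypothetical, and the folder is modelled as already-decompressed bytes, so "rewinding" is a plain seek to 0 (a real extractor would presumably have to restart decompression of the folder as well).

```python
import io


def extract_with_rewind(folder_data, entries):
    """Toy extractor that rewinds when a file starts before the current position.

    Hypothetical sketch; names and data layout are assumptions for illustration.
    """
    stream = io.BytesIO(folder_data)
    extracted = {}
    for name, offset, size in entries:
        if offset < stream.tell():
            # The file points at data that was already consumed: start over.
            stream.seek(0)
        # Skip forward to the file's offset (a zero-length read is a no-op).
        stream.read(offset - stream.tell())
        extracted[name] = stream.read(size)
    return extracted


# Two entries sharing the same data at offset 0.
folder = b"PDFDATA" + b"tail"
entries = [("frn1.pdf", 0, 7), ("frn2.pdf", 0, 7)]
result = extract_with_rewind(folder, entries)
print(result["frn1.pdf"] == result["frn2.pdf"])
```

Rewinding to the start of the folder keeps the forward-only reading logic unchanged for the common case and only pays the extra cost when an offset goes backwards.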
Comment 3 Marc-Andre Lureau 2016-03-09 17:08:54 UTC
(In reply to Mattias Engdegård from comment #0)
> The attached cab contains two identical files, frn1.pdf and frn2.pdf (a
> public domain document, by the way). 

interesting reading ;)
Comment 4 Mattias Engdegård 2016-03-09 17:26:43 UTC
(In reply to Marc-Andre Lureau from comment #1)
> The following fix has been pushed:
> 616e77a extract: learn to rewind if needed

Thank you!