Bug 730085 - tracker handles epubs as zips, frequently complains about "No member in zip file"
Status: RESOLVED DUPLICATE of bug 746437
Product: tracker
Classification: Core
Component: Extractor
Version: 1.0.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: tracker-extractor
QA Contact: tracker-extractor
Depends on:
Blocks:
Reported: 2014-05-13 21:15 UTC by Adam Williamson
Modified: 2015-03-20 20:35 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
example epub that causes the problem (438.14 KB, application/epub+zip)
2014-05-13 21:21 UTC, Adam Williamson

Description Adam Williamson 2014-05-13 21:15:19 UTC
My logs contain several thousand lines of the following type:

Apr 25 13:52:40 adam.happyassassin.net gnome-session[2210]: (tracker-extract:2690): Tracker-WARNING **: No member './Oath_of_Fealty_split_060.html' in zip file 'file:///home/adamw/Documents/books/Elizabeth%20Moon/Elizabeth%20Moon%20-%20Legend%20of%20Paksenarrion%2006%20-%20Paladin's%20Legacy%2001%20-%20Oath%20of%20Fealty.epub'

covering apparently all (or at least a lot) of the epub files tracker encounters on my disk. I don't think these WARNINGs should be logged by default anyway (that's https://bugzilla.gnome.org/show_bug.cgi?id=730083 ), but it seems tracker could improve its handling of epub files. There are so many of these errors that it can't really be the case that all those files are 'wrong': either they're actually right, or they're 'wrong' in a way so common in commercially-distributed (or calibre-produced) epub files that tracker should handle it better.
Comment 1 Adam Williamson 2014-05-13 21:21:35 UTC
Created attachment 276477
example epub that causes the problem

Here's a copyright-safe (hopefully, it's from Project Gutenberg, A Christmas Carol by Dickens) epub which reproduces the problem if you call:

/usr/libexec/tracker-extract -v 3 -f ~/Documents/ebooks/Charles\ Dickens/A\ Christmas\ Carol\ \(102\)/A\ Christmas\ Carol\ -\ Charles\ Dickens.epub
Comment 2 Kunaal Jain 2015-03-19 10:07:17 UTC
These HTML files do exist. The bug is not in the epub extractor file.

The problem is that when searching for a file name such as "./something", the find_member function in the tracker_gsf file treats "." as a directory and calls gsf_infile_child_by_name with "." as the directory name. Unfortunately, gsf_infile_child_by_name doesn't recognize "." as the current directory.
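
To illustrate the kind of failure described above, here is a minimal sketch in Python. It is an analogy only: it uses Python's standard zipfile module rather than libgsf or tracker's find_member, and the archive contents and the helper names (build_example_archive, normalize_member_name, read_member) are hypothetical. It shows a member stored as "chapter_001.html" being missed when it is requested literally as "./chapter_001.html", and found once the "." segment is collapsed before the lookup.

```python
import io
import posixpath
import zipfile


def build_example_archive() -> bytes:
    """Create a tiny in-memory zip standing in for an epub (made-up contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.opf", "<package/>")
        archive.writestr("chapter_001.html", "<html>Chapter 1</html>")
        archive.writestr("text/chapter_002.html", "<html>Chapter 2</html>")
    return buffer.getvalue()


def normalize_member_name(name: str) -> str:
    """Collapse '.' segments and repeated slashes: './a.html' becomes 'a.html'."""
    return posixpath.normpath(name)


def read_member(archive: zipfile.ZipFile, name: str) -> bytes:
    """Read a member after normalizing the requested name."""
    return archive.read(normalize_member_name(name))


archive = zipfile.ZipFile(io.BytesIO(build_example_archive()))

# A literal lookup of './chapter_001.html' fails although the file is present,
# because the stored member name has no leading './'.
try:
    archive.read("./chapter_001.html")
    literal_lookup_found = True
except KeyError:
    literal_lookup_found = False

# Normalizing the name first makes the same request succeed.
normalized_content = read_member(archive, "./chapter_001.html")
```

Note that posixpath.normpath also collapses ".." segments, so a real fix would need to decide separately how to treat references that point outside the archive root. Whether the actual patch in bug 746437 normalizes the path or skips "." segments during the directory walk is not stated in this report.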

I am opening a new bug for this.
Comment 3 Kunaal Jain 2015-03-19 10:25:54 UTC
Created bug https://bugzilla.gnome.org/show_bug.cgi?id=746437 .

I am surprised no one has noticed this yet. The log fills up with these errors whenever a "./something" name is looked up. As a result, the metadata for these files is not indexed even though the files exist.
Comment 4 Martyn Russell 2015-03-20 20:15:20 UTC
Kunaal, you should mark this bug as a duplicate of the one you linked here.
Normally, you should find the earliest bug or the one with the most detail and patch that and mark others as its duplicate. Since you patched bug report #746437, I would mark others as duplicating that.

Thanks.
Comment 5 Kunaal Jain 2015-03-20 20:20:04 UTC
Thanks, I'll take care next time. I filed a new bug and patched it because the problem I found after debugging was more generic in nature.
Unfortunately, I can't mark this as a duplicate because I don't have access rights. Please do it.
Comment 6 Martyn Russell 2015-03-20 20:35:57 UTC
Thanks for taking the time to report this.
This particular bug has already been reported into our bug tracking system, but we are happy to tell you that the problem has already been fixed in the code repository.

*** This bug has been marked as a duplicate of bug 746437 ***