GNOME Bugzilla – Bug 730085
tracker handles epubs as zips, frequently complains about "No member in zip file"
Last modified: 2015-03-20 20:35:57 UTC
My logs contain several thousand lines of the following type:

Apr 25 13:52:40 adam.happyassassin.net gnome-session[2210]: (tracker-extract:2690): Tracker-WARNING **: No member './Oath_of_Fealty_split_060.html' in zip file 'file:///home/adamw/Documents/books/Elizabeth%20Moon/Elizabeth%20Moon%20-%20Legend%20of%20Paksenarrion%2006%20-%20Paladin's%20Legacy%2001%20-%20Oath%20of%20Fealty.epub'

These cover apparently all (or at least a lot) of the epub files tracker encounters on my disk. I don't think these WARNINGs should be logged by default anyway (that's https://bugzilla.gnome.org/show_bug.cgi?id=730083 ), but it seems tracker could improve its handling of epub files. There are so many of these errors that it seems unlikely that all those files are 'wrong': either they're actually right, or they're 'wrong' in a way so common in commercially-distributed (or calibre-produced) epub files that tracker should handle it better.
Created attachment 276477
example epub that causes the problem

Here's a copyright-safe (hopefully, it's from Project Gutenberg, A Christmas Carol by Dickens) epub which reproduces the problem if you call:

/usr/libexec/tracker-extract -v 3 -f ~/Documents/ebooks/Charles\ Dickens/A\ Christmas\ Carol\ \(102\)/A\ Christmas\ Carol\ -\ Charles\ Dickens.epub
These HTML files do exist, and the bug is not in the epub extractor. When looking up a member name of the form "./something", the find_member function in the tracker_gsf file treats "." as a directory and calls gsf_infile_child_by_name with "." as the directory name. Unfortunately, gsf_infile_child_by_name doesn't recognize "." as the current directory, so the lookup fails. I am opening a new bug for this.
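The lookup failure described above can be illustrated outside tracker. What follows is a minimal sketch in Python using the standard zipfile module, not libgsf, and it is not the patch attached to bug 746437. The names normalize_member, build_demo_archive, and has_member are hypothetical and exist only for this illustration. It assumes the archive stores its members without a leading "./", while the requested name carries one.

```python
import io
import zipfile


def normalize_member(name: str) -> str:
    """Drop "." and empty components from an archive member path.

    Hypothetical helper: "./a/./b.html" becomes "a/b.html".
    """
    parts = [p for p in name.split("/") if p not in ("", ".")]
    return "/".join(parts)


def build_demo_archive() -> bytes:
    """Build a small in-memory zip whose members have no "./" prefix."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("chapter_001.html", "<html></html>")
        zf.writestr("text/chapter_002.html", "<html></html>")
    return buf.getvalue()


def has_member(data: bytes, name: str, normalize: bool) -> bool:
    """Report whether `name` matches a stored member, optionally normalizing it first."""
    wanted = normalize_member(name) if normalize else name
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return wanted in zf.namelist()


if __name__ == "__main__":
    data = build_demo_archive()
    # A literal lookup of "./chapter_001.html" misses the stored "chapter_001.html".
    print(has_member(data, "./chapter_001.html", normalize=False))
    # Stripping the "." component first lets the same lookup succeed.
    print(has_member(data, "./chapter_001.html", normalize=True))
```

Normalizing the requested name before the lookup is only one possible approach; teaching the directory walk itself to skip "." components would have the same effect.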
Created bug https://bugzilla.gnome.org/show_bug.cgi?id=746437 . I am surprised no one has noticed this yet. The log fills up with these errors whenever a name of the form "./something" is looked up. As a result, this metadata is not indexed even though the files exist.
Kunaal, you should mark this bug as a duplicate of the one you linked here. Normally, you should find the earliest bug or the one with the most detail and patch that and mark others as its duplicate. Since you patched bug report #746437, I would mark others as duplicating that. Thanks.
Thanks, I'll take care next time. I filed a new bug and patched it there because the problem I found after debugging was more generic in nature. Unfortunately, I can't mark this one as a duplicate because I don't have access rights. Please do it.
Thanks for taking the time to report this. This particular bug has already been reported into our bug tracking system, but we are happy to tell you that the problem has already been fixed in the code repository. *** This bug has been marked as a duplicate of bug 746437 ***