GNOME Bugzilla – Bug 733317
tracker-extract: remove application/vnd.ms-* catchall from msoffice
Last modified: 2014-10-16 13:10:42 UTC
Otherwise we match on application/vnd.ms-asf (the .asf video container format), which is not an OLE2 file and which msoffice cannot handle.
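A minimal sketch of the problem, assuming the rule's MIME patterns behave like shell-style globs. Python's `fnmatch` stands in for whatever matcher tracker-extract actually uses, and `MSOFFICE_CATCHALL`, `MSOFFICE_EXPLICIT` and `matches` are hypothetical names for illustration only:

```python
from fnmatch import fnmatchcase

# Hypothetical pattern lists modelled on this bug; not copied from the real rule file.
MSOFFICE_CATCHALL = ["application/vnd.ms-*"]
MSOFFICE_EXPLICIT = ["application/vnd.ms-excel", "application/vnd.ms-powerpoint"]


def matches(mime, patterns):
    """Return True if the MIME type matches any glob pattern in the list."""
    return any(fnmatchcase(mime, pattern) for pattern in patterns)


# The catchall claims the ASF video container even though it is not an OLE2 file.
print(matches("application/vnd.ms-asf", MSOFFICE_CATCHALL))
# An explicit list leaves it for another extractor to pick up.
print(matches("application/vnd.ms-asf", MSOFFICE_EXPLICIT))
```

With an explicit list, every type the extractor should handle has to be enumerated, which is the trade-off discussed in the comments that follow.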
Created attachment 280979 [details] [review] tracker-extract: remove application/vnd.ms-* catchall from msoffice
Created attachment 280995 [details] [review] tracker-extract: recognize application/vnd.ms-asf for the gstreamer plugin
application/vnd.ms-asf is the new standard name for video/x-ms-asf.
Comment on attachment 280979 [details] [review] tracker-extract: remove application/vnd.ms-* catchall from msoffice
I'm not really sure this is the right approach. These are the ones we would miss if we applied this patch:
    $ grep -i application/vnd.ms- /usr/share/mime/types|uniq |sort|grep -v application/msword|grep -v application/vnd.ms-powerpoint | grep -v application/vnd.ms-excel
    application/vnd.ms-access
    application/vnd.ms-cab-compressed
    application/vnd.ms-htmlhelp
    application/vnd.ms-publisher
    application/vnd.ms-tnef
    application/vnd.ms-word
    application/vnd.ms-word.document.macroenabled.12
    application/vnd.ms-word.document.macroEnabled.12
    application/vnd.ms-word.template.macroenabled.12
    application/vnd.ms-word.template.macroEnabled.12
    application/vnd.ms-works
    application/vnd.ms-wpl
I can see a few in there that we definitely should not be missing and, actually, it looks like we have got the msword type completely wrong too in our existing rule file. If we remove the ms-* catchall, we should add in all the ones above that we would miss.
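The shell pipeline in the comment above can be sketched in Python as follows. This is an approximation under stated assumptions: `missed_types` is a hypothetical helper, a set is used for de-duplication (so it removes all duplicates, not only adjacent ones as `uniq` before `sort` does), and the sort order is by code point rather than by locale:

```python
import os

# Substrings excluded by the three `grep -v` stages of the pipeline above.
KEPT = ("application/msword",
        "application/vnd.ms-powerpoint",
        "application/vnd.ms-excel")


def missed_types(lines, prefix="application/vnd.ms-", kept=KEPT):
    """List the vnd.ms-* types that are not covered by the kept substrings."""
    # Equivalent of `grep -i`: case-insensitive substring match on each line.
    selected = {line.strip() for line in lines if prefix in line.lower()}
    # Equivalent of the `grep -v` stages: drop lines containing a kept substring.
    return sorted(t for t in selected if not any(k in t for k in kept))


# Path taken from the command above; skipped quietly if the file is absent.
TYPES_FILE = "/usr/share/mime/types"
if os.path.exists(TYPES_FILE):
    with open(TYPES_FILE, encoding="utf-8") as handle:
        for mime in missed_types(handle):
            print(mime)
```

The result depends on the shared-mime-info database installed on the machine, so the list will not necessarily match the one quoted in the comment.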
Comment on attachment 280995 [details] [review] tracker-extract: recognize application/vnd.ms-asf for the gstreamer plugin
Thanks for this patch!
Created attachment 282242 [details] [review] tracker-extract: remove application/vnd.ms-* catchall from msoffice
Otherwise we match on application/vnd.ms-asf (the .asf video container format), which is not an OLE2 file and which msoffice cannot handle.
I did not add all of them, but only those that are in OLE2 format that libgsf can recognize.
Comment on attachment 282242 [details] [review] tracker-extract: remove application/vnd.ms-* catchall from msoffice
Thanks for the patch revision. Comments:
1. I noticed that vnd.ms-word is in there twice, please remove one of those.
2. I wonder if vnd.ms-htmlhelp makes sense to include?
Please update the patch for #1 and use your discretion on #2 and then commit. Thanks!
Giovanni, do you need help committing this patch?
Sorry, I was on vacation last week and then I overlooked this in my TODO list.
Attachment 280995 [details] pushed as e0a8085 - tracker-extract: recognize application/vnd.ms-asf for the gstreamer plugin
Attachment 282242 [details] pushed as cf04b2d - tracker-extract: remove application/vnd.ms-* catchall from msoffice
Created attachment 288652 [details] [review] tracker-extract: add application/msword back to msoffice rules
With the application/msword type excluded, (some?) MS Word documents are no longer indexed, because it is the MIME type that libmagic/GIO returns. Identified by functional-tests/400-extractor.py, case office-doc, on Nemomobile.
Comment on attachment 288652 [details] [review] tracker-extract: add application/msword back to msoffice rules
Nice catch, thanks for the patch. It would be nice if we could integrate the functional tests into distcheck...