GNOME Bugzilla – Bug 741109
BMP extractor doesn't retrieve image height and width
Last modified: 2015-02-05 22:25:04 UTC
tracker-extrack can't get correct width and height for BMP format file. the width and height equal to -1 for each of BMP format files.
(In reply to comment #0) > tracker-extrack can't get correct width and height for BMP format file. > the width and height equal to -1 for each of BMP format files. The BMP extractor doesn't get that information, which is probably why you have this problem: https://git.gnome.org/browse/tracker/tree/src/tracker-extract/tracker-extract-bmp.c Unless you're using another extractor for BMP files. Patches accepted ;)
Created attachment 292380 [details] [review] implement get_metadata I attached a tracker-extract-bmp.c to get the information of width and height. but I still can not get the width and height of bmp file from sparql or tracker-extract command.
Created attachment 292419 [details] fixed source code
Comment on attachment 292380 [details] [review] implement get_metadata >#ifndef _GNU_SOURCE >#define _GNU_SOURCE >#endif There should be no need for these, we use AC_USE_SYSTEM_EXTENSIONS in configure.ac. See commit 82503863417bc686d50e029382c8db2e3ec7317c. >typedef struct { // BMP file header > char bfType[2]; // "BM" > uint bfSize; // size of file > ushort bfReserved1; > ushort bfReserved2; > uint bfOffBits; // pointer to the pixmap bits >}__attribute__((packed)) BMPFILEHEADER; Coding style, we don't use //, we use /* */ ;) Also, are those "__attribute__" cases GCC-isms? That might be problematic for people not using GCC to compile. >static gboolean find_max_width_and_height (const GFile *file, gint64 *width, gint64 *height) More coding style issues, please see: https://wiki.gnome.org/Projects/Tracker/Documentation/CodingStyle >{ > GFileInputStream *stream; > GError *error = NULL; > BMPINFOHEADER bmpinfo; > > stream = g_file_read (file, NULL, &error); We should check error before trying to skip to the stream. > if (!g_input_stream_skip (G_INPUT_STREAM (stream),sizeof(BMPFILEHEADER), NULL, &error)) Also, re-using error when it is already set will cause a critical error by GLib here if the initial g_file_read() failed. > { > width = -1; > height = -1; You're setting a pointer to a gint64 to -1, seems wrong? You mean *width and *height here right? > g_warning ("Error skip bmp file header from stream: '%s'", error->message); > g_error_free (error); > g_object_unref(stream); > return FALSE; > } > > if (!g_input_stream_read (G_INPUT_STREAM (stream), &bmpinfo, sizeof(BMPINFOHEADER), NULL, &error)) > { > width = -1; > height = -1; I would instead set *width and *height before anything is done at the start of the function to avoid repetition. > g_warning ("Error read bmp file info header from stream: '%s'", error->message); > g_error_free (error); > g_object_unref(stream); > return FALSE; > } > > width = bmpinfo.biWidth; > height = bmpinfo.biHeight; > g_input_stream_close (G_INPUT_STREAM (stream), NULL, NULL); > g_object_unref(stream); > return TRUE; >} > >G_MODULE_EXPORT gboolean >tracker_extract_get_metadata (TrackerExtractInfo *info) >{ > TrackerSparqlBuilder *preupdate, *metadata; > goffset size; > const gchar *graph; > gchar *filename, *uri; > GFile *file; > gint64 max_width, max_height; > > preupdate = tracker_extract_info_get_preupdate_builder (info); > metadata = tracker_extract_info_get_metadata_builder (info); > graph = tracker_extract_info_get_graph (info); > > file = tracker_extract_info_get_file (info); > filename = g_file_get_path (file); > size = tracker_file_get_size (filename); > > if (size < 14) { > /* Smaller than BMP header, can't be a real BMP file */ > g_free (filename); > return FALSE; > } > > tracker_sparql_builder_predicate (metadata, "a"); > tracker_sparql_builder_object (metadata, "nfo:Image"); > tracker_sparql_builder_object (metadata, "nmm:Photo"); > > /* TODO: Add actual metadata extraction for BMP files */ > > > if (find_max_width_and_height (file, &max_width, &max_height)) { > g_message("tracker-extract-bmp: %d max_width=%d, max_height=%d", __LINE__, max_width, max_height); > if (max_width > 0) { > tracker_sparql_builder_predicate (metadata, "nfo:width"); > tracker_sparql_builder_object_int64 (metadata, (gint64) max_width); > } > if (max_height > 0) { > tracker_sparql_builder_predicate (metadata, "nfo:height"); > tracker_sparql_builder_object_int64 (metadata, (gint64) max_height); Looks right to me, not sure why you have to cast here though? > } > } > > g_object_unref(file); Looks like filename is leaked here too. Might have been a leak before your patch by the looks of it?
Finally, thanks for writing this patch, I appreciate the work here. If you're able to clean up the comments I mentioned, I would gladly accept this patch! (In reply to comment #2) > I attached a tracker-extract-bmp.c to get the information of width and height. > > but I still can not get the width and height of bmp file from sparql or > tracker-extract command. If you run: $ tracker-extract -v 3 -f FILE.bmp What output do you get?
Created attachment 292439 [details] patch for metadata of bmp clean up the comments Martyn Russell mentions.
(In reply to comment #6) > Created an attachment (id=292439) [details] > patch for metadata of bmp > > clean up the comments Martyn Russell mentions. do not use this attachment. there are some errors.
Created attachment 292442 [details] [review] patch-2014-1210-2148 fix an overflow issue.
(In reply to comment #5) > Finally, thanks for writing this patch, I appreciate the work here. If you're > able to clean up the comments I mentioned, I would gladly accept this patch! > > (In reply to comment #2) > > I attached a tracker-extract-bmp.c to get the information of width and height. > > > > but I still can not get the width and height of bmp file from sparql or > > tracker-extract command. > > If you run: > > $ tracker-extract -v 3 -f FILE.bmp > > What output do you get? [system@localhost user]$ /usr/libexec/tracker-extract -v 3 -f /home/user/bmptest.bmp Locale 'TRACKER_LOCALE_LANGUAGE' was set to 'zh_CN.UTF-8' Locale 'TRACKER_LOCALE_TIME' was set to 'zh_CN.UTF-8' Locale 'TRACKER_LOCALE_COLLATE' was set to 'zh_CN.UTF-8' Locale 'TRACKER_LOCALE_NUMERIC' was set to 'zh_CN.UTF-8' Locale 'TRACKER_LOCALE_MONETARY' was set to 'zh_CN.UTF-8' Initializing Storage... Mount monitors set up for to watch for added, removed and pre-unmounts... No mounts found to iterate Setting priority nice level to 19 Loading extractor rules... (/usr/share/tracker/extract-rules) Loaded rule '10-abw.rule' Loaded rule '10-bmp.rule' Loaded rule '10-dvi.rule' Loaded rule '10-epub.rule' Loaded rule '10-flac.rule' Loaded rule '10-gif.rule' Loaded rule '10-html.rule' Loaded rule '10-ico.rule' Loaded rule '10-jpeg.rule' Loaded rule '10-mp3.rule' Loaded rule '10-msoffice.rule' Loaded rule '10-oasis.rule' Loaded rule '10-pdf.rule' Loaded rule '10-png.rule' Loaded rule '10-ps.rule' Loaded rule '10-tiff.rule' Loaded rule '10-vorbis.rule' Loaded rule '10-xmp.rule' Loaded rule '11-msoffice-xml.rule' Loaded rule '15-playlist.rule' Loaded rule '90-text-generic.rule' Loaded rule '93-mplayer-generic.rule' Loaded rule '93-totem-generic.rule' Extractor rules loaded Setting memory limitations: total is 2.1 GB, minimum is 256 MB, recommended is ~1 GB Virtual/Heap set to 1.0 GB (50% of total or MAXLONG) Guessing mime type as 'image/bmp' Extracting... Using /usr/lib/tracker-0.18/extract-modules/libextract-bmp.so... Done (4 items) SPARQL pre-update: -- -- SPARQL item: -- a nfo:Image , nmm:Photo ; nfo:width 800 ; nfo:height 640 . -- SPARQL where clause: -- -- SPARQL post-update: -- --
Comment on attachment 292442 [details] [review] patch-2014-1210-2148 Apart from some coding style issues that still exist, it looks better! :)
(In reply to comment #9) > (In reply to comment #5) > > Finally, thanks for writing this patch, I appreciate the work here. If you're > > able to clean up the comments I mentioned, I would gladly accept this patch! > > > > (In reply to comment #2) > > > I attached a tracker-extract-bmp.c to get the information of width and height. > > > > > > but I still can not get the width and height of bmp file from sparql or > > > tracker-extract command. > > > > If you run: > > > > $ tracker-extract -v 3 -f FILE.bmp > > > > What output do you get? > > [system@localhost user]$ /usr/libexec/tracker-extract -v 3 -f > /home/user/bmptest.bmp Just to be clear, have you done "make install" in src/tracker-extract/ ? [snip] > a nfo:Image , nmm:Photo ; > nfo:width 800 ; > nfo:height 640 . Looks like the width and height are included then?
(In reply to comment #11) > (In reply to comment #9) > > (In reply to comment #5) > > > Finally, thanks for writing this patch, I appreciate the work here. If you're > > > able to clean up the comments I mentioned, I would gladly accept this patch! > > > > > > (In reply to comment #2) > > > > I attached a tracker-extract-bmp.c to get the information of width and height. > > > > > > > > but I still can not get the width and height of bmp file from sparql or > > > > tracker-extract command. > > > > > > If you run: > > > > > > $ tracker-extract -v 3 -f FILE.bmp > > > > > > What output do you get? > > > > [system@localhost user]$ /usr/libexec/tracker-extract -v 3 -f > > /home/user/bmptest.bmp > > Just to be clear, have you done "make install" in src/tracker-extract/ ? > > [snip] > > > a nfo:Image , nmm:Photo ; > > nfo:width 800 ; > > nfo:height 640 . > > Looks like the width and height are included then? yes, with the patch, I can get width and height. I will continue to fix coding style issue.
Created attachment 293810 [details] [review] fixed source code fix some coding style issues, pls help to review again, thx.
Comment on attachment 293810 [details] [review] fixed source code >/* > * Copyright (C) 2013-2014 Jolla Ltd. <andrew.den.exter@jollamobile.com> > * Author: Philip Van Hoof <philip@codeminded.be> Author goes under the legal statement usually. > * > * This library is free software; you can redistribute it and/or > * modify it under the terms of the GNU Lesser General Public > * License as published by the Free Software Foundation; either > * version 2.1 of the License, or (at your option) any later version. > * > * This library is distributed in the hope that it will be useful, > * but WITHOUT ANY WARRANTY; without even the implied warranty of > * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > * Lesser General Public License for more details. > * > * You should have received a copy of the GNU Lesser General Public > * License along with this library; if not, write to the > * Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, > * Boston, MA 02110-1301, USA. Author goes here... >static gboolean >get_img_resolution (const GFile *file, > gint64 *width, > gint64 *height) Coding style again in the definition. >{ > GFileInputStream *stream; > GError *error = NULL; > char bfType[2]; Spaces here after "char" for some reason? > *width = 0; > *height = 0; > > stream = g_file_read (file, NULL, &error); > if (error) { > g_debug ("Could not read bmp file : %s", error->message); > g_error_free (error); > return FALSE; > } Just a note, I know it's not GLib's policy, but it *can* happen that an error is unset and a stream is NULL (i.e. if 'file' fails G_IS_FILE()), so I would usually use: if (!stream) { g_message ("Could not read bmp file, %s", error ? error->message : "no error given"); g_clear_error (&error); return FALSE; } NOTICE: 1. We check 'error' is set before using it 2. We use the g_clear_error() which will check if error is set before freeing it for us 3. I've used a g_message() not a g_debug() here, because we don't expect this part to fail > > if (!g_input_stream_read (G_INPUT_STREAM (stream), bfType, 2, NULL, &error)) { > g_debug ("Error skip bmp file header from stream: '%s'", error->message); > g_error_free (error); > g_object_unref(stream); > return FALSE; > } Again, this will fail if error us unset but we couldn't read the stream and there will be warnings all over the console. Same as above applies. > if(bfType[0] != 'B' || bfType[1] != 'M') { Coding style error after 'if'. > g_error_free (error); > g_object_unref(stream); Again here. > return FALSE; > } > > /*skip some useless bytes*/ Again here for comments. > if (!g_input_stream_skip(G_INPUT_STREAM (stream), 16, NULL, &error)) { Again here after 'skip'. > g_debug ("Error skip bmp file header from stream: '%s'", error->message); Errors like this are not debug statements, they should be either a g_warning() or g_message(), depending on whether you expect or don't expect this to happen. > g_error_free (error); > g_object_unref(stream); Coding style. I've skipped a bunch of stuff below, there are more cases of coding style and what I've already mentioned :) > return FALSE; > } > > if (!g_input_stream_read (G_INPUT_STREAM (stream), width, sizeof(uint), NULL, &error)) { Instead of repeatedly casting and type checking, why not just have a line up top for: input_stream = G_INPUT_STREAM (stream); > if (get_img_resolution (file, &bmpwidth, &bmpheight)) { > if (bmpwidth > 0) { > tracker_sparql_builder_predicate (metadata, "nfo:width"); > tracker_sparql_builder_object_int64 (metadata, bmpwidth); > } > if (bmpheight > 0) { > tracker_sparql_builder_predicate (metadata, "nfo:height"); > tracker_sparql_builder_object_int64 (metadata, bmpheight); > } > } > > g_free (filename); > g_object_unref(file); Coding style again :) The actual code looks mostly fine, just some rough edges to smooth over. Thanks :)
Created attachment 296195 [details] [review] patch for metadata of bmp pls help to review. thx.
Comment on attachment 296195 [details] [review] patch for metadata of bmp Thanks for the patch, I've fixed up your version and committed it now. Just a few things: 1. The GFile shouldn't have been unref'd 2. I put some protective code around the get height and width function. Thanks again :)
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.