After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 666372 - More information about bootable ISOs
More information about bootable ISOs
Status: RESOLVED FIXED
Product: tracker
Classification: Core
Component: Extractor
git master
Other Linux
: Normal normal
: ---
Assigned To: tracker-extractor
Jamie McCracken
Depends on:
Blocks: 666373
 
 
Reported: 2011-12-16 16:23 UTC by Zeeshan Ali
Modified: 2012-02-13 20:01 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Proposed patch (11.95 KB, patch)
2011-12-16 16:30 UTC, Christophe Fergeau
needs-work Details | Review
Add iso extractor using libosinfo (12.24 KB, patch)
2012-02-07 11:39 UTC, Christophe Fergeau
reviewed Details | Review
updated patch (12.25 KB, patch)
2012-02-13 17:27 UTC, Christophe Fergeau
committed Details | Review

Description Zeeshan Ali 2011-12-16 16:23:36 UTC
Currently Tracker does not provide any details on bootable ISOs (Think OS installation or live media). Using libosinfo[1], tracker can provide more info:

* If the ISO is bootable.
* Which OS is on a bootable media.
* Whether media provides an installer for an OS.
* Whether media supports booting live OS.

[1] https://fedorahosted.org/libosinfo/
Comment 1 Christophe Fergeau 2011-12-16 16:29:25 UTC
Patch sent to the mailing list http://mail.gnome.org/archives/tracker-list/2011-December/msg00017.html
Comment 2 Christophe Fergeau 2011-12-16 16:30:09 UTC
Created attachment 203681 [details] [review]
Proposed patch
Comment 3 Jürg Billeter 2012-01-31 18:42:15 UTC
Review of attachment 203681 [details] [review]:

Thanks for the patch and sorry for the review delay.

::: data/ontologies/93-libosinfo.ontology
@@ +12,3 @@
+	nao:lastModified "2011-12-16T08:00:00Z" .
+
+osinfo:isInstaller a rdf:Property ;

I suggest adding a class for operating system installers instead of the above boolean property. The class should be a subclass of nfo:Software, but not a subclass of nfo:OperatingSystem.

@@ +17,3 @@
+	rdfs:range xsd:boolean .
+
+osinfo:isLiveCD a rdf:Property ;

I suggest that we add the nfo:OperationSystem type only for live CDs (and disk images but that's not relevant here). That would make the above property redundant.

@@ +22,3 @@
+	rdfs:range xsd:boolean .
+
+osinfo:osName a rdf:Property ;

I think we should just use nie:title here. We prefer to use generic properties where possible.

::: src/tracker-extract/tracker-extract-iso.c
@@ +87,3 @@
+	}
+	db = osinfo_loader_get_db (loader);
+	os = osinfo_db_guess_os_from_media(db, media, &matched_media);

According to Zeeshan, os can be NULL here, yet it is used without a check further down.

@@ +104,3 @@
+
+	tracker_sparql_builder_predicate (metadata, "nfo:duration");
+	tracker_sparql_builder_object_int64 (metadata, 75);

What's the purpose of this?
Comment 4 Jürg Billeter 2012-01-31 18:45:08 UTC
I forgot to mention, we have nfo:FileSystemImage which should be used here as well.
Comment 5 Christophe Fergeau 2012-02-03 15:13:13 UTC
(In reply to comment #3)
> @@ +17,3 @@
> +    rdfs:range xsd:boolean .
> +
> +osinfo:isLiveCD a rdf:Property ;
> 
> I suggest that we add the nfo:OperationSystem type only for live CDs (and disk
> images but that's not relevant here). That would make the above property
> redundant.

This isLiveCD is used to diffentiate between installer ISOs and live cd images, I think we want nfo:OperatingSystem to be set both for live CDs and installer ISOs

> 
> @@ +104,3 @@
> +
> +    tracker_sparql_builder_predicate (metadata, "nfo:duration");
> +    tracker_sparql_builder_object_int64 (metadata, 75);
> 
> What's the purpose of this?

Just some testing code that I forgot to remove :)
Comment 6 Christophe Fergeau 2012-02-03 16:25:59 UTC
(In reply to comment #5)
> (In reply to comment #3)
> > @@ +17,3 @@
> > +    rdfs:range xsd:boolean .
> > +
> > +osinfo:isLiveCD a rdf:Property ;
> > 
> > I suggest that we add the nfo:OperationSystem type only for live CDs (and disk
> > images but that's not relevant here). That would make the above property
> > redundant.
> 
> This isLiveCD is used to diffentiate between installer ISOs and live cd images,
> I think we want nfo:OperatingSystem to be set both for live CDs and installer
> ISOs
> 

Ah after rereading again, I think I understand what you meant, we will always set
osinfo:Installer class for both installers and live CDs, and live CDs will get in addition the nfo:OperatingSystem property.
Comment 7 Jürg Billeter 2012-02-03 16:32:19 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #3)
> > > @@ +17,3 @@
> > > +    rdfs:range xsd:boolean .
> > > +
> > > +osinfo:isLiveCD a rdf:Property ;
> > > 
> > > I suggest that we add the nfo:OperationSystem type only for live CDs (and disk
> > > images but that's not relevant here). That would make the above property
> > > redundant.
> > 
> > This isLiveCD is used to diffentiate between installer ISOs and live cd images,
> > I think we want nfo:OperatingSystem to be set both for live CDs and installer
> > ISOs
> > 
> 
> Ah after rereading again, I think I understand what you meant, we will always
> set
> osinfo:Installer class for both installers and live CDs, and live CDs will get
> in addition the nfo:OperatingSystem property.

Yes, that's what I'm suggesting (if there are live CDs without installer, they would just be nfo:OperatingSystem without osinfo:Installer). nfo:OperatingSystem would mean an operating system that is ready to boot/be used, not just an installer for an operating system. In my opinion, this keeps things simple and clear.
Comment 8 Christophe Fergeau 2012-02-07 11:39:20 UTC
Created attachment 206973 [details] [review]
Add iso extractor using libosinfo
Comment 9 Christophe Fergeau 2012-02-07 11:40:41 UTC
Here is the reworked version of the patch. One thing that I don't seem to have gotten right is that I can't see the various "a xxxxx" properties in tracker-info output even though they show up in tracker-extract output.
Comment 10 Martyn Russell 2012-02-10 16:00:26 UTC
Comment on attachment 206973 [details] [review]
Add iso extractor using libosinfo

>@@ -175,6 +175,7 @@ LIBVORBIS_REQUIRED=0.22
> LIBFLAC_REQUIRED=1.2.1
> LIBEXIF_REQUIRED=0.6
> LIBGSF_REQUIRED=1.13
>+LIBOSINFO_REQUIRED=0.0.2
> EXEMPI_REQUIRED=2.1.0
> EVO_REQUIRED=2.32.0
> EVO_SHELL_REQUIRED=2.32.0

I have a feeling this may conflict with my latest change to master. But it's a small issue.

>diff --git a/data/ontologies/33-nfo.ontology b/data/ontologies/33-nfo.ontology
>index 65e4fb9..1006def 100644
>--- a/data/ontologies/33-nfo.ontology
>+++ b/data/ontologies/33-nfo.ontology
>@@ -12,7 +12,7 @@
> 
> nfo: a tracker:Namespace, tracker:Ontology ;
> 	tracker:prefix "nfo" ;
>-	nao:lastModified "2011-12-06T15:40:00Z" .
>+	nao:lastModified "2011-12-15T18:51:25Z" .
> 
> nfo:Document a rdfs:Class ;
> 	rdfs:label "Document" ;
>@@ -926,6 +926,13 @@ nfo:isContentEncrypted a rdf:Property ;
> 	rdfs:domain nie:InformationElement ;
> 	rdfs:range xsd:boolean .
> 
>+# This change must be proposed upstream
>+nfo:isBootable a rdf:Property ;
>+	rdfs:label "Is content bootable" ;
>+	rdfs:comment "Might change (IE of DataObject property?)" ;

The comment (IIRC) is used for documentation. I think this should be described a bit better. It's pretty self explanatory already as a property, but perhaps giving an example of file types you would expect to have this property helps here. E.g. "True when the file is bootable, for example like an ISO or other disc images".

> # Ontology request by Alexander Bokovoy, replaces nfo:device
> # These classes and properties need to be submitted to NEPOMUK
>diff --git a/data/ontologies/93-libosinfo.description b/data/ontologies/93-libosinfo.description
>new file mode 100644
>index 0000000..edb5035
>--- /dev/null
>+++ b/data/ontologies/93-libosinfo.description
>@@ -0,0 +1,13 @@
>+@prefix dsc: <http://www.tracker-project.org/temp/dsc#> .
>+
>+<virtual-ontology-uri:93-libosinfo.ontology> a dsc:Ontology ;
>+	dsc:title "libosinfo-specific annotations Ontology" ;
>+	dsc:description "Libosinf properties" ;

Is that a typo? the title says "libosinfo", the description says "Libosinf"? Also the case different there should be consistent ;)

>+osinfo:id a rdf:Property ;
>+	nrl:maxCardinality 1 ;
>+	rdfs:domain nie:InformationElement ;
>+	rdfs:range xsd:string .

Are you expecting any of these strings to be searchable with Full Text Search (FTS)? If so, there are properties missing here to make sure they're indexed.  

>+ * Author: Christophe Fergeau <cfergeau@redhat.com>
>+ */

Generally, we always include config.h first here.

>+#include <stdio.h>
>+
>+#include <osinfo/osinfo.h>
>+
>+#include <gio/gio.h>
>+
>+#include <libtracker-extract/tracker-extract.h>
>+#include <libtracker-sparql/tracker-sparql.h>
>+

...

>+	g_warn_if_fail(media != NULL);
>+	g_warn_if_fail(loader != NULL);

This breaks the coding style (spaces between functions/macros and parameters).

>+	db = osinfo_loader_get_db (loader);
>+	os = osinfo_db_guess_os_from_media(db, media, &matched_media);

Same code style breakage above.

>+	if (os == NULL)
>+		goto no_os;
>+
>+	has_installer = osinfo_media_get_installer (matched_media);
>+	is_live = osinfo_media_get_live (matched_media);
>+	id = g_strdup (osinfo_entity_get_id (OSINFO_ENTITY (os)));
>+	name = g_strdup (osinfo_product_get_name (OSINFO_PRODUCT (os)));

To make extractions faster, we try to avoid memory allocations wherever possible. The name and id here can be used here directly with tracker_sparql_builder_*() API without the g_strdup() given we duplicate them anyway.

>+	g_object_unref (G_OBJECT (media));
>+	g_object_unref (G_OBJECT (loader));
>+
>+	tracker_sparql_builder_predicate (metadata, "a");
>+	tracker_sparql_builder_object (metadata, "nfo:FilesystemImage");
>+
>+	if (name != NULL) {
>+		tracker_sparql_builder_predicate (metadata, "nie:title");
>+		tracker_sparql_builder_object_string (metadata, name);
>+		g_free (name);
>+	}

Using tracker_sparql_builder_object_unvalidated() might make more sense here for 'id' and 'name' too because it checks for utf8 validity at the same time. Or can you guarantee those functions return UTF-8?

--

Other than some comments above, the rest looks really good to me. Juergbi, any final comments here?
I think we can look to committing this patch once we've addressed these small points.

Thanks Christophe!
Comment 11 Christophe Fergeau 2012-02-10 17:03:20 UTC
(In reply to comment #10)

> 
> >+osinfo:id a rdf:Property ;
> >+	nrl:maxCardinality 1 ;
> >+	rdfs:domain nie:InformationElement ;
> >+	rdfs:range xsd:string .
> 
> Are you expecting any of these strings to be searchable with Full Text Search
> (FTS)? If so, there are properties missing here to make sure they're indexed. 

No idea if we want to be able to search these strings with FTS :) what kind of properties should I add if I want them to be searchable?
 > >+
> >+	has_installer = osinfo_media_get_installer (matched_media);
> >+	is_live = osinfo_media_get_live (matched_media);
> >+	id = g_strdup (osinfo_entity_get_id (OSINFO_ENTITY (os)));
> >+	name = g_strdup (osinfo_product_get_name (OSINFO_PRODUCT (os)));
> 
> To make extractions faster, we try to avoid memory allocations wherever
> possible. The name and id here can be used here directly with
> tracker_sparql_builder_*() API without the g_strdup() given we duplicate them
> anyway.
>

Ok I'll rework the code, the strings becomes invalid after the 2 unref below, that's why I duplicate them. I'll move the unref down.
 
> >+	g_object_unref (G_OBJECT (media));
> >+	g_object_unref (G_OBJECT (loader));
> >+
> >+	tracker_sparql_builder_predicate (metadata, "a");
> >+	tracker_sparql_builder_object (metadata, "nfo:FilesystemImage");
> >+
> >+	if (name != NULL) {
> >+		tracker_sparql_builder_predicate (metadata, "nie:title");
> >+		tracker_sparql_builder_object_string (metadata, name);
> >+		g_free (name);
> >+	}
> 
> Using tracker_sparql_builder_object_unvalidated() might make more sense here
> for 'id' and 'name' too because it checks for utf8 validity at the same time.
> Or can you guarantee those functions return UTF-8?

id will be 7bit ascii from a libosinfo XML file, name also comes from a libosinfo XML file, so we can guarantee they are UTF-8, it's not a random string extracted from the iso

> I think we can look to committing this patch once we've addressed these small
> points.

I'll post an updated patch early next week.
Comment 12 Christophe Fergeau 2012-02-13 17:27:48 UTC
Created attachment 207469 [details] [review]
updated patch
Comment 13 Martyn Russell 2012-02-13 20:00:54 UTC
Comment on attachment 207469 [details] [review]
updated patch

> data/ontologies/33-nfo.ontology           |    9 ++-
> data/ontologies/93-libosinfo.description  |   13 +++
> data/ontologies/93-libosinfo.ontology     |   23 +++++

The ontology files here produced the crash you discussed earlier tonight:

$ ./src/tracker-store/tracker-store 
Initializing tracker-store...
Tracker-Message: Setting up monitor for changes to config file:'/home/martyn/.config/tracker/tracker-store.cfg'
Tracker-Message: Setting up monitor for changes to config file:'/home/martyn/.config/tracker/tracker-store.cfg'
Starting log:
  File:'/home/martyn/.local/share/tracker/lt-tracker-store.log'

(lt-tracker-store:20268): Tracker-CRITICAL **: Fatal error dealing with ontology changes: 932.18: syntax error, expected comma, semicolon, or dot

(lt-tracker-store:20268): Tracker-CRITICAL **: tracker_class_get_uri: assertion `TRACKER_IS_CLASS (service)' failed

(lt-tracker-store:20268): Tracker-CRITICAL **: 932.18: syntax error, expected comma, semicolon, or dot

(lt-tracker-store:20268): Tracker-CRITICAL **: tracker_class_get_uri: assertion `TRACKER_IS_CLASS (service)' failed

(lt-tracker-store:20268): GLib-CRITICAL **: g_variant_new_string: assertion `string != NULL' failed
Segmentation fault (core dumped)

--

>+# This change must be proposed upstream
>+nfo:isBootable a rdf:Property ;
>+	rdfs:label "Is content bootable" ;
>+	rdfs:comment " "True when the file is bootable, for example like an ISO or other disc images" ;

The double quote above is what caused the errors, once that was fixed there were no crashes or warnings.
I fixed it in a fixup so your original commit looks right.

--

The rest is fine thank you and it has now been committed ;)
Comment 14 Martyn Russell 2012-02-13 20:01:13 UTC
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.