GNOME Bugzilla – Bug 346581
[typefinding] recognise text/html
Last modified: 2006-07-06 13:25:21 UTC
Plugins like gnomevfssrc or neonhttpsrc may return text/html, but GStreamer cannot identify this MIME type. I therefore propose the following patch:

Index: gsttypefindfunctions.c
===================================================================
RCS file: /cvs/gstreamer/gst-plugins-base/gst/typefind/gsttypefindfunctions.c,v
retrieving revision 1.110
diff -u -3 -p -b -B -d -r1.110 gsttypefindfunctions.c
--- gsttypefindfunctions.c 24 May 2006 08:34:53 -0000 1.110
+++ gsttypefindfunctions.c 4 Jul 2006 20:23:48 -0000
@@ -2332,6 +2332,7 @@ plugin_init (GstPlugin * plugin)
   static gchar *ape_exts[] = { "ape", NULL };
   static gchar *uri_exts[] = { "ram", NULL };
   static gchar *smil_exts[] = { "smil", NULL };
+  static gchar *html_exts[] = { "htm", "html", NULL };
   static gchar *xml_exts[] = { "xml", NULL };
   static gchar *jpeg_exts[] = { "jpg", "jpe", "jpeg", NULL };
   static gchar *gif_exts[] = { "gif", NULL };
@@ -2428,6 +2429,9 @@ plugin_init (GstPlugin * plugin)
   TYPE_FIND_REGISTER (plugin, "video/quicktime", GST_RANK_SECONDARY,
       qt_type_find, qt_exts, QT_CAPS, NULL, NULL);
+  TYPE_FIND_REGISTER_START_WITH (plugin, "text/html",
+      GST_RANK_SECONDARY, html_exts, "<!DOCTYPE HTML", 14,
+      GST_TYPE_FIND_MAXIMUM);
   TYPE_FIND_REGISTER_START_WITH (plugin, "application/vnd.rn-realmedia",
       GST_RANK_SECONDARY, rm_exts, ".RMF", 4, GST_TYPE_FIND_MAXIMUM);
   TYPE_FIND_REGISTER (plugin, "application/x-shockwave-flash",
That seems like a great addition. I think the 'HTML' bit after the '<!DOCTYPE' bit can be lower-case as well though, no? (View => View Page Source on this bugzilla page seems to suggest so.) It would be great if you could add a proper typefind function for this that checks for the marker case-insensitively. We might also want to check for an '<html>' marker in the first N bytes.
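To make the suggestion concrete, here is a minimal sketch of such a check in plain C, deliberately independent of the GStreamer typefind API. The names has_prefix_nocase, looks_like_html and the scan_limit parameter are hypothetical and used only for illustration; this is not the function that was eventually committed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the first strlen(marker) bytes of data
 * equal marker when compared without regard to ASCII case, 0 otherwise. */
static int
has_prefix_nocase (const unsigned char *data, size_t len, const char *marker)
{
  size_t n = strlen (marker);
  size_t i;

  if (len < n)
    return 0;
  for (i = 0; i < n; i++) {
    if (tolower (data[i]) != tolower ((unsigned char) marker[i]))
      return 0;
  }
  return 1;
}

/* Hypothetical check: returns 1 if the buffer looks like HTML, 0 otherwise.
 * scan_limit plays the role of the "first N bytes" mentioned above. */
static int
looks_like_html (const unsigned char *data, size_t len, size_t scan_limit)
{
  size_t i, end;

  assert (data != NULL || len == 0);

  /* 1. DOCTYPE marker at the very start, in any letter case */
  if (has_prefix_nocase (data, len, "<!DOCTYPE HTML"))
    return 1;

  /* 2. otherwise look for an opening html tag within the first bytes;
   *    "<html" also matches tags with attributes, e.g. <html lang="en"> */
  end = len < scan_limit ? len : scan_limit;
  for (i = 0; i < end; i++) {
    if (data[i] == '<' && has_prefix_nocase (data + i, end - i, "<html"))
      return 1;
  }
  return 0;
}
```

With this sketch, a buffer starting with "<!doctype html>" is accepted just like one starting with "<!DOCTYPE HTML", which the exact 14-byte prefix match in the proposed patch would not do. A real typefind function would presumably also report a lower probability for the second, weaker match.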
Created attachment 68441
Patch to identify mime-type text/html
Thanks, committed slightly modified:

2006-07-06  Tim-Philipp Müller  <tim at centricular dot net>

        Patch by: Lutz Mueller <lutz at topfrose de>

        * gst/typefind/gsttypefindfunctions.c: (html_type_find),
        (plugin_init):
          Add typefinding for text/html (#346581).

(The xml_check_first_element() function didn't work with input that doesn't have an XML declaration tag at the beginning and had to be fixed first; the reason for the third fallback is that xml_check_first_element() won't work with very small files.)