Bug 346581 - [typefinding] recognise text/html
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: 0.10.9
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Reported: 2006-07-04 20:29 UTC by Lutz Mueller
Modified: 2006-07-06 13:25 UTC
Attachments
Patch to identify mime-type text/html (2.08 KB, patch)
2006-07-06 04:47 UTC, Lutz Mueller
committed

Description Lutz Mueller 2006-07-04 20:29:42 UTC
Plugins like gnomevfssrc or neonhttpsrc may return text/html, but GStreamer cannot identify this MIME type. I therefore propose the following patch:

Index: gsttypefindfunctions.c
===================================================================
RCS file: /cvs/gstreamer/gst-plugins-base/gst/typefind/gsttypefindfunctions.c,v
retrieving revision 1.110
diff -u -3 -p -b -B -d -r1.110 gsttypefindfunctions.c
--- gsttypefindfunctions.c      24 May 2006 08:34:53 -0000      1.110
+++ gsttypefindfunctions.c      4 Jul 2006 20:23:48 -0000
@@ -2332,6 +2332,7 @@ plugin_init (GstPlugin * plugin)
   static gchar *ape_exts[] = { "ape", NULL };
   static gchar *uri_exts[] = { "ram", NULL };
   static gchar *smil_exts[] = { "smil", NULL };
+  static gchar *html_exts[] = { "htm", "html", NULL };
   static gchar *xml_exts[] = { "xml", NULL };
   static gchar *jpeg_exts[] = { "jpg", "jpe", "jpeg", NULL };
   static gchar *gif_exts[] = { "gif", NULL };
@@ -2428,6 +2429,9 @@ plugin_init (GstPlugin * plugin)
   TYPE_FIND_REGISTER (plugin, "video/quicktime", GST_RANK_SECONDARY,
       qt_type_find, qt_exts, QT_CAPS, NULL, NULL);

+  TYPE_FIND_REGISTER_START_WITH (plugin, "text/html",
+      GST_RANK_SECONDARY, html_exts, "<!DOCTYPE HTML", 14,
+      GST_TYPE_FIND_MAXIMUM);
   TYPE_FIND_REGISTER_START_WITH (plugin, "application/vnd.rn-realmedia",
       GST_RANK_SECONDARY, rm_exts, ".RMF", 4, GST_TYPE_FIND_MAXIMUM);
   TYPE_FIND_REGISTER (plugin, "application/x-shockwave-flash",
Comment 1 Tim-Philipp Müller 2006-07-04 20:48:01 UTC
That seems like a great addition.

I think the 'HTML' bit after the '<!DOCTYPE' bit can be lower-case as well though, no? (as View => View Page Source on this bugzilla page seems to suggest).

It would be great if you could add a proper typefind function for this that checks for the marker case-insensitively (and we might additionally want to check for a '<html>' marker in the first N bytes).
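
For illustration only, the suggestion above could look roughly like the following in plain C, independent of the GStreamer typefind API. This is a sketch under assumptions, not the code that was eventually committed: the names looks_like_html, ascii_prefix_nocase and HTML_SCAN_LIMIT are hypothetical, and the scan limit of 512 bytes is an arbitrary stand-in for "the first N bytes".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical value for "the first N bytes" to scan for "<html". */
#define HTML_SCAN_LIMIT 512

/* Lower-case a single ASCII letter without depending on the locale. */
static unsigned char
ascii_lower (unsigned char c)
{
  return (c >= 'A' && c <= 'Z') ? (unsigned char) (c - 'A' + 'a') : c;
}

/* Return 1 if data (len bytes) starts with marker, ignoring ASCII case. */
static int
ascii_prefix_nocase (const unsigned char *data, size_t len,
    const char *marker)
{
  size_t i, n = strlen (marker);

  if (len < n)
    return 0;
  for (i = 0; i < n; i++) {
    if (ascii_lower (data[i]) != ascii_lower ((unsigned char) marker[i]))
      return 0;
  }
  return 1;
}

/* Return 1 if the buffer looks like HTML, 0 otherwise (hypothetical helper). */
int
looks_like_html (const unsigned char *data, size_t len)
{
  size_t i, limit;

  /* Check 1: a doctype marker at the very start, in any letter case. */
  if (ascii_prefix_nocase (data, len, "<!DOCTYPE HTML"))
    return 1;

  /* Check 2: an "<html" tag opening somewhere in the first N bytes.
   * Matching only the prefix also accepts "<html lang=...>", but it
   * would equally accept an unrelated tag such as "<htmlfoo>". */
  limit = (len < HTML_SCAN_LIMIT) ? len : HTML_SCAN_LIMIT;
  for (i = 0; i < limit; i++) {
    if (ascii_prefix_nocase (data + i, len - i, "<html"))
      return 1;
  }
  return 0;
}
```

With this sketch, both "<!doctype html>" and "<!DOCTYPE HTML PUBLIC ..." would be accepted, as would a page that starts directly with "<HTML>", whereas the case-sensitive 14-byte prefix match in the patch above only accepts the upper-case doctype form.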
Comment 2 Lutz Mueller 2006-07-06 04:47:25 UTC
Created attachment 68441
Patch to identify mime-type text/html
Comment 3 Tim-Philipp Müller 2006-07-06 13:25:21 UTC
Thanks, committed slightly modified:

  2006-07-06  Tim-Philipp Müller  <tim at centricular dot net>

        Patch by: Lutz Mueller <lutz at topfrose de>

        * gst/typefind/gsttypefindfunctions.c: (html_type_find),
        (plugin_init):
          Add typefinding for text/html (#346581).

(xml_check_first_element() didn't work with input that lacks an XML declaration at the beginning, so it had to be fixed first. The reason for the third fallback is that xml_check_first_element() won't work with very small files.)