Bug 653768 - inline HTML subtitle parser plugin
Status: RESOLVED INVALID
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: git master
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2011-06-30 18:34 UTC by Arnaud Vrac
Modified: 2011-07-10 19:31 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
htmlparse patch against gst-plugins-bad git head (10.24 KB, patch)
2011-06-30 18:35 UTC, Arnaud Vrac

Description Arnaud Vrac 2011-06-30 18:34:55 UTC
Hello,

I have written a small subtitle parser plugin that converts inline HTML in text/plain subtitles to Pango markup. The plugin is automatically plugged by the subtitleoverlay element, so it works out of the box.

Commit 74e0c05ff7d2270494d should be reverted in gst-plugins-good if the htmlparse plugin makes it into GStreamer.

I have attached a patch against gst-plugins-bad git head.

Thanks
Comment 1 Arnaud Vrac 2011-06-30 18:35:56 UTC
Created attachment 191052
htmlparse patch against gst-plugins-bad git head
Comment 2 David Schleef 2011-07-04 05:19:43 UTC
text/plain is not the appropriate type for HTML subtitles.

Also, htmlparse is not the best name for this element.  It does not parse HTML.
Comment 3 Arnaud Vrac 2011-07-04 12:23:12 UTC
I'm using text/plain because there is no metadata in the input stream specifying that the subtitles contain tags. There is no spec for tags in plain text subtitles. We could parse the subtitles in the demuxers to detect a tag and change the mimetype, but I don't think it is necessary.

Since there is no spec for this kind of subtitle and the tags are taken from the HTML spec, I'm not sure what other name could be used.
Comment 4 Sebastian Dröge (slomo) 2011-07-10 17:36:43 UTC
You can invent a new caps format, e.g. "subtitle/html". This should also go together with a typefinder in gst-plugins-base/gst/typefind, but as this seems to be plain HTML, it will be hard to distinguish from non-subtitle HTML...

Where are subtitles like this used?
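
As a rough illustration of the detection idea raised in comments 3 and 4, the Python sketch below checks whether a text/plain subtitle buffer contains any of a small set of inline tags. The function names and the tag list are assumptions made for this example; they are not part of the attached patch or of any GStreamer API, and "subtitle/html" is only the hypothetical caps name suggested in comment 4. A real typefinder would be written in C, and this heuristic cannot tell subtitle text apart from ordinary HTML, which is the difficulty pointed out above.

```python
import re

# Assumed tag set, for illustration only.
INLINE_TAGS = ("b", "i", "u", "s", "font")

# Matches an opening or closing tag from INLINE_TAGS, with optional attributes.
# The \b keeps "<b>" from matching longer names such as "<br>".
_TAG_RE = re.compile(r"</?(?:%s)\b[^<>]*>" % "|".join(INLINE_TAGS), re.IGNORECASE)


def looks_like_tagged_subtitle(text: str) -> bool:
    """Return True if the text contains at least one known inline tag."""
    return _TAG_RE.search(text) is not None


def guess_media_type(text: str) -> str:
    """Pick the hypothetical "subtitle/html" type when tags are present."""
    return "subtitle/html" if looks_like_tagged_subtitle(text) else "text/plain"
```

Running the check on every buffer in a demuxer has a cost, and the first buffers of a stream may contain no tags at all, so the result could change mid-stream.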
Comment 5 Arnaud Vrac 2011-07-10 19:31:48 UTC
This element was useful with subtitles embedded in MKV files. I just found out they are actually in the SubRip format; text/plain is simply the output format of matroskademux.

SubRip format can have the following tags:
<b>
<i>
<u>
<s>
<font color="#xxxxxx">

The subparse element handles <b>, <i>, <u> and <s>. The matroskademux element handles <b>, <i>, <u>, <s> and <span> tags since 74e0c05ff7d2270494d. So my plugin is actually not needed; only font color support is missing in these elements.

Sorry for my lack of research, I am closing the issue.
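
The font color support described as missing in comment 5 could be sketched roughly as follows. This is a minimal Python illustration, not the code of subparse or matroskademux: the function name `subrip_font_to_pango` is hypothetical, and the sketch assumes colors are written in the `#xxxxxx` form listed above. It relies on the fact that Pango markup already accepts <b>, <i>, <u> and <s>, and that a color can be expressed as <span foreground="...">.

```python
import re

# Opening <font ...> tag; group 1 captures the raw attribute text.
_FONT_OPEN_RE = re.compile(r"<font\b([^<>]*)>", re.IGNORECASE)
_FONT_CLOSE_RE = re.compile(r"</font\s*>", re.IGNORECASE)
# Only the color="#xxxxxx" form from the list above is recognized (assumption).
_COLOR_RE = re.compile(r'color\s*=\s*"(#[0-9a-fA-F]{6})"', re.IGNORECASE)


def _open_span(match):
    color = _COLOR_RE.search(match.group(1))
    if color:
        return '<span foreground="%s">' % color.group(1)
    # Unrecognized attributes are dropped; an empty span keeps the markup balanced.
    return "<span>"


def subrip_font_to_pango(text: str) -> str:
    """Rewrite <font color="#xxxxxx">...</font> as a Pango <span>.

    <b>, <i>, <u> and <s> are left untouched because Pango markup accepts them.
    """
    text = _FONT_OPEN_RE.sub(_open_span, text)
    return _FONT_CLOSE_RE.sub("</span>", text)
```

For example, `subrip_font_to_pango('<font color="#ff0000">Hi</font>')` yields `<span foreground="#ff0000">Hi</span>`. The sketch does not escape stray `&` or `<` characters in the subtitle text, which valid Pango markup would also require.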