GNOME Bugzilla – Bug 629670
[subparse] Do not over-process subtitle data
Last modified: 2011-05-25 18:40:19 UTC
The subparse element is too clever for my taste when dealing with subtitles: it assumes that the displaying elements cannot render any markup other than <u>, <i> and <b>, and strips everything else. I think this processing belongs either in a new element (such as "subtitlecleaner") or in the element that knows its own display capabilities, i.e. "textoverlay". The subtitlecleaner approach would be nicer IMHO, but it would break existing applications, so putting the code into textoverlay/cairotextoverlay may be more convenient. This would simplify the subparse code and allow experimentation, such as feeding SVG data to the proposed rsvgoverlay element to display graphics over the video, which is currently impossible. Another option would be to add a "raw" option to the subparse element that disables data cleaning.
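To make the behaviour described above concrete, here is a minimal Python sketch of that kind of cleaning. It is not the actual subparse code (which is C); the names `ALLOWED_TAGS`, `TAG_RE` and `clean_subtitle` are hypothetical, and the regular expression is a deliberately naive approximation of tag matching.

```python
import re

# Tags that, per the description above, survive the cleaning step.
# Hypothetical constant, not taken from the subparse sources.
ALLOWED_TAGS = {"u", "i", "b"}

# Naive match for an opening, closing or self-closing tag; captures the tag name.
TAG_RE = re.compile(r"</?\s*([A-Za-z][A-Za-z0-9]*)[^>]*>")


def clean_subtitle(text, allowed=ALLOWED_TAGS):
    """Remove every tag whose name is not in `allowed`, keeping the text content."""
    def keep_or_drop(match):
        # Keep the tag verbatim if it is whitelisted, otherwise drop it.
        return match.group(0) if match.group(1).lower() in allowed else ""
    return TAG_RE.sub(keep_or_drop, text)


if __name__ == "__main__":
    print(clean_subtitle('<font color="red"><i>Hello</i></font> <b>world</b>'))
    # An SVG payload is reduced to nothing, which is why it cannot reach an overlay.
    print(repr(clean_subtitle("<svg><rect/></svg>")))
```

With a filter of this kind sitting in the parser, any payload that is not simple styled text is lost before a downstream element gets a chance to interpret it.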
For text/plain it should really remove all markup; for text/x-pango-markup it should remove all markup except valid Pango markup. What you want is a new subtitle/text/markup format for use in GStreamer. Something more powerful than Pango markup is definitely a good idea, but it should be properly defined and not something like "text with random markup".
Agreed for text/plain, and also for text/x-pango-markup: if such caps are specified, they should be respected, and the pipeline negotiation mechanism makes it useful to have things precisely specified. However, it would be nice to have a way for users to indicate the type of data they want to get out of the subtitle file. See for instance the examples given in http://blog.gingertech.net/2010/08/07/websrt-and-html5-media-accessibility/#WebSRT for the (not yet finalized) WebSRT format: its payload can be plain text, text with markup, or "cue metadata" (which in practice means anything: HTML, JSON, etc). This could for instance be a "raw" or a "force-content-type=text/json" property added to the element. In any case, the user should have some control over it (the autodetection mechanism is nice, but there should be a way to deactivate it).
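A rough Python sketch of how such an override could interact with the per-format cleaning discussed above. Everything here is an assumption made for illustration: the function `process_cue`, its `raw` and `force_content_type` parameters and the `PANGO_TAGS` set (a partial list of Pango markup tags) are hypothetical and do not correspond to any existing subparse property or API.

```python
import re

# Naive tag matcher; captures the tag name.
TAG_RE = re.compile(r"</?\s*([A-Za-z][A-Za-z0-9]*)[^>]*>")

# Partial, illustrative subset of the tags Pango markup accepts.
PANGO_TAGS = {"b", "i", "u", "s", "span"}


def strip_tags(text, allowed=frozenset()):
    """Remove every tag whose name is not in `allowed`."""
    return TAG_RE.sub(
        lambda m: m.group(0) if m.group(1).lower() in allowed else "", text
    )


def process_cue(payload, detected_type, raw=False, force_content_type=None):
    """Return (output_type, data) for one subtitle cue.

    `detected_type` stands for the result of autodetection; the two keyword
    arguments model the hypothetical "raw" and "force-content-type" properties.
    """
    out_type = force_content_type or detected_type
    if raw:
        # Cleaning disabled: hand the payload over untouched.
        return out_type, payload
    if out_type == "text/plain":
        return out_type, strip_tags(payload)
    if out_type == "text/x-pango-markup":
        return out_type, strip_tags(payload, PANGO_TAGS)
    # Any other type (e.g. text/json) is passed through unchanged.
    return out_type, payload


if __name__ == "__main__":
    print(process_cue("<i>Hi</i> <font>there</font>", "text/plain"))
    print(process_cue('{"slide": 3}', "text/plain", force_content_type="text/json"))
```

The point of the sketch is only that the declared output type still drives the cleaning, while the user keeps a way to bypass autodetection.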
*** This bug has been marked as a duplicate of bug 629764 ***