GNOME Bugzilla – Bug 758232
Add support for EBU-TT-D / TTML subtitles
Last modified: 2016-11-01 18:47:22 UTC
In Bug #747774 I mentioned that I'd been working on elements to support EBU-TT-D subtitle parsing and rendering. A repository containing these elements (and an associated library) is now up on github: https://github.com/bbc/gst-ttml-subtitles I'd be grateful for comments on the design, and guidance on how this code can best be integrated. As mentioned in #747774, it should be fairly straightforward to create a patch for subparse that would add parsing of EBU-TT-D files, as the changes are mainly contained in a couple of new source files that are specific to TTML. Integrating the library and renderer, however, is probably a bit more complicated. More details are given on the github page. Chris
Is the goal to use the GstSubtitle* as the generic interchange format between subtitle elements in GStreamer? If so it might not be powerful enough to express things like SSA/ASS subtitles, not sure. It might also have to support text-based and picture-based subtitles. Could this, for starters, be just merged into gst-plugins-bad into a plugin with the parser and renderer and then we gradually improve the interchange format and make it used in other subtitle elements and at some point make it public API?
(In reply to Sebastian Dröge (slomo) from comment #1) > Is the goal to use the GstSubtitle* as the generic interchange format > between subtitle elements in GStreamer? If so it might not be powerful > enough to express things like SSA/ASS subtitles, not sure. It might also > have to support text-based and picture-based subtitles. That was my initial plan, but it proved difficult to produce something initially that supports all formats. So what it is at present is a library that supports interchange of TTML-based subtitles. Currently it supports only the EBU-TT-D profile of TTML, but it should hopefully be straightforward to extend it to support full TTML and other profiles of TTML, such as SPMTE-TT and the IMSC1 profiles. > Could this, for starters, be just merged into gst-plugins-bad into a plugin > with the parser and renderer and then we gradually improve the interchange > format and make it used in other subtitle elements and at some point make it > public API? That sounds like a good idea to me. In this scenario, do you envisage that the library code would go in gst-plugins-bad/gst-libs and still be built as a separate library, or should it go in the plugin directory with the parser and renderer code and just compiled and linked directly with the parser & renderer (i.e., not be built as a separate library at all)?
Both would work I guess, for starters it might be simpler to not have a library and just everything in a single plugin.
OK - I'll go ahead and create patches to add these as a plugin to gst-plugins-bad.
Great, thanks :) Would be nice to get this in for 1.8
Created attachment 318412 [details] [review] Add support for TTML (EBU-TT-D) subtitles Patch that adds a TTML plugin to gst-plugins-bad.
Review of attachment 318412 [details] [review]: Thanks for the patch, didn't look in closer detail yet but: Why the LEGAL file in addition to what is written inside the files? Also why are you including all the other stuff from subparse, e.g. the tmplayerparse.c and samiparse.c
(In reply to Sebastian Dröge (slomo) from comment #7) [...] > > Why the LEGAL file in addition to what is written inside the files? When I went through my employer's Open Source release process, our legal team wanted this wording included somewhere (separate from the LGPL text). I noticed that qtdemux had a LEGAL file in its directory in plugins-good, so I copied that approach. Is this problematic? > Also why are you including all the other stuff from subparse, e.g. the > tmplayerparse.c and samiparse.c With the TTML parser, it was amenable to creating a patch to add it to subparse, so I tried to avoid changing the subparse bits too much. (The renderer is a pretty big departure from basetextoverlay - big enough that it couldn't really be added by a patch - so I didn't attempt to keep the other bits of the pango plugin). Do you want me to strip out the non-TTML bits from the parser?
(In reply to Chris Bass from comment #8) > (In reply to Sebastian Dröge (slomo) from comment #7) > [...] > > > > Why the LEGAL file in addition to what is written inside the files? > > When I went through my employer's Open Source release process, our legal > team wanted this wording included somewhere (separate from the LGPL text). > > I noticed that qtdemux had a LEGAL file in its directory in plugins-good, so > I copied that approach. Is this problematic? Not really problematic, it just usually causes more confusion than it solves compared to just having this information at the top of the relevant files :) > > Also why are you including all the other stuff from subparse, e.g. the > > tmplayerparse.c and samiparse.c > > With the TTML parser, it was amenable to creating a patch to add it to > subparse, so I tried to avoid changing the subparse bits too much. (The > renderer is a pretty big departure from basetextoverlay - big enough that it > couldn't really be added by a patch - so I didn't attempt to keep the other > bits of the pango plugin). > > Do you want me to strip out the non-TTML bits from the parser? Is the functionality exposed by the plugin and does it do something different than in subparse? I.e. is it also supporting the same intermediate subtitle format as used for TTML for all the other formats nowadays supported by subparse?
Review of attachment 318412 [details] [review]: ::: ext/ttml/gstttmlrender.c @@ +44,3 @@ +#include <gst/video/gstvideometa.h> +#include <gst/video/video-overlay-composition.h> +#include <pango/pangocairo.h> I think this somewhere needs configure checks for pango and pangocairo (and cairo?) like in gst-plugins-base, and also needs the relevant bits added to the Makefile.am for LIBADD and CFLAGS
(In reply to Sebastian Dröge (slomo) from comment #9) [...] > > Not really problematic, it just usually causes more confusion than it solves > compared to just having this information at the top of the relevant files :) I could add it beneath the LGPL text at the top of each file, perhaps separated by some kind of delimiter (e.g., `--' on a line by itself). I think they'd be happy with that. [...] > > Is the functionality exposed by the plugin and does it do something > different than in subparse? I.e. is it also supporting the same intermediate > subtitle format as used for TTML for all the other formats nowadays > supported by subparse? No, the functionality is hidden and the pad caps have been changed so that it handles only TTML subs. And I haven't changed handling of the other subtitle formats to use the new exchange format. I guess that means it would be best to remove them..?
Review of attachment 318412 [details] [review]: I strongly disagree we should let new licences terms be added like this. I'm not a layer, but most of what is in this new "licence" file is part of the term of copyright law and the LGPL licence. Sorry to say, but I'm quite against taking the risk of obfuscating our licence with random term that haven been verified to be compatible with the LGPL v2.1 by someone who have the ability to do so.
(In reply to Sebastian Dröge (slomo) from comment #10) > Review of attachment 318412 [details] [review] [review]: [...] > > I think this somewhere needs configure checks for pango and pangocairo (and > cairo?) like in gst-plugins-base, and also needs the relevant bits added to > the Makefile.am for LIBADD and CFLAGS Sorry - patch generation fail on my part! I forgot to add configure.ac, etc. I'll attach an updated patch...
Created attachment 318418 [details] [review] Add support for TTML (EBU-TT-D) subtitles Updated patch; includes changes to config files.
It's actually not adding anything to the LGPL, it just repeats the "no warranties"-clause, says it's under the terms of the LGPL and hints at possibly requiring other licenses to be acquired for commercial use and that you have to check that yourself (e.g. patent, trademark, etc) which is not really different from e.g. the libav h264 decoder and we have similar wording to that in the binary packages. IMHO it's a completely meaningless piece of text that only has the added informational value that it tells you that the LGPL is about source code distribution and your rights if you get binaries but that additionally use might require you to pay some money to the patent mafia or similar.
Oh and Nicolas' reaction is exactly that kind of confusion I meant that happens when you add such things. Better to just omit it as it doesn't have any value in itself and just stay with what the LGPL says.
Created attachment 318509 [details] [review] Add support for TTML (EBU-TT-D) subtitles Updated patch that removes non-TTML code from parser.
Created attachment 320324 [details] [review] Add support for TTML (EBU-TT-D) subtitles Updated patch that fixes bug when rendering spans with transparent backgrounds.
The LEGAL file is still there and we're not going to merge it with that as IMHO it just causes confusion for no value. It will have to be dropped, is that ok with you?
I'll have to check with our legal team and get back to you.
Created attachment 325006 [details] [review] Add support for TTML (EBU-TT-D) subtitles We've managed to get agreement that the text of the LGPL is sufficient and no additional disclaimer text is needed. Here's the patch with the `LEGAL' file removed.
Created attachment 326116 [details] [review] Add support for TTML (EBU-TT-D) subtitles Updated patch with improved handling of styling attributes in parser and handling of default fontnames in renderer.
Just want to quickly say thanks for getting the legal issue sorted and using stock LGPL. Let's get this in this cycle.
Created attachment 327038 [details] [review] Add support for TTML (EBU-TT-D) subtitles Updated patch. - Improves handling of whitespace in input EBU-TT-D files. - Fixes bug in which subtitles with start time of 0 were not displayed.
Review of attachment 327038 [details] [review]: ::: ext/ttml/gstttmlparse.c @@ +490,3 @@ + + GST_DEBUG_OBJECT (self, "Sending buffer %p, %llu %llu", + op_buffer, GST_BUFFER_PTS (op_buffer), GST_BUFFER_DURATION (op_buffer)); This fails to compile for me on 64bit Ubuntu 16.04. To work on both 32bit and 64bit machines, this should be changed to: GST_DEBUG_OBJECT (self, "Sending buffer %p, %" G_GUINT64_FORMAT " %" G_GUINT64_FORMAT op_buffer, GST_BUFFER_PTS (op_buffer), GST_BUFFER_DURATION (op_buffer)); Or maybe have the time values in human readable form: GST_DEBUG_OBJECT (self, "Sending buffer %p, %" GST_TIME_FORMAT " %" GST_TIME_FORMAT op_buffer, GST_TIME_ARGS (GST_BUFFER_PTS (op_buffer)), GST_TIME_ARGS (GST_BUFFER_DURATION (op_buffer)));
Created attachment 327634 [details] [review] Add support for TTML (EBU-TT-D) subtitles Updated patch. - Fixes problem highlighted by Alex. - Fixes issue with handling character entities (e.g., "&") in input TTML docs.
Created attachment 328789 [details] [review] Add support for TTML (EBU-TT-D) subtitles Updated patch. Code now handles DASH streams in which <p> or <span> elements that straddle two (or more) consecutive segments appear in each of those segments with identical timings, which is allowed in EBU Tech 3381.
Created attachment 329278 [details] [review] Add support for TTML (EBU-TT-D) subtitles Updated patch: - "suppress-at-line-break" and "white-space-treatment" rules implemented in renderer.
Review of attachment 329278 [details] [review]: Mostly trivial suggestions you might as well ignore ::: ext/ttml/gstttmlparse.c @@ +102,3 @@ + g_free (ttmlparse->detected_encoding); + ttmlparse->detected_encoding = NULL; + } You don't need to check for NULL before g_free() ::: ext/ttml/gstttmlrender.c @@ +1197,3 @@ + GST_CAT_ERROR (ttmlrender_debug, "Failed to access memory at index %u.", + index); + return NULL; I'm not particularly sure about this one but aren't you leaking a ref if mem is valid but the map fails? @@ +1261,3 @@ + + for (i = 2; cur; ++i) { + if (cur->element->suppress_whitespace) { At a first glance it seems like the whole block that follows can be simplified around g_strstrip() @@ +1554,3 @@ +} + +static GstTtmlRenderRenderedImage * Guess it wouldn't hurt to make this one explicitly "inline" @@ +2234,3 @@ + GST_LOG_OBJECT (render, "Text pad not linked"); + GST_TTML_RENDER_UNLOCK (render); + ret = gst_pad_push (render->srcpad, buffer); Maybe add a goto to #2381 and avoid needlessly nesting the long block that follows ::: ext/ttml/ttmlparse.c @@ +112,3 @@ + +static void +ttml_style_set_delete (TtmlStyleSet * style_set) Every call to this helper is preceded by an if(style_set) check. Maybe just do nothing on !style_set and be done with it? @@ +722,3 @@ + if (parent) { + GHashTableIter iter; + gpointer attr_name, attr_value; Just return ret on !parent and save the needless nesting of the long block that follows @@ +758,3 @@ + * - tts:showBackground + * - tts:unicodeBidi + */ If you are going to list every special case bellow you might as well say none is inherited and make the commentary briefer / less redundant @@ +853,3 @@ + ttml_style_set_print (element->style_set); + + return FALSE; You don't need the boolean return if you always return FALSE @@ +904,3 @@ + ttml_style_set_print (element->style_set); + + return FALSE; Ditto @@ +941,3 @@ + parent = node->parent->data; + element->whitespace_mode = parent->whitespace_mode; + return FALSE; Ditto @@ +958,3 @@ + TtmlElement *element = node->data; + ttml_delete_element (element); + return FALSE; Yet another time... @@ +995,3 @@ + element->begin = MAX (element->begin, window->begin); + element->end = MIN (element->end, window->end); + return FALSE; Ditto @@ +1044,3 @@ + } + + return FALSE; Ditto @@ +1074,3 @@ + } + + return FALSE; Ditto @@ +1117,3 @@ + } + + return FALSE; Ditto @@ +1247,3 @@ + + if (element->text + && (element->whitespace_mode != TTML_WHITESPACE_MODE_PRESERVE)) { Maybe just return on ~() ? @@ +1277,3 @@ + } + + return FALSE; Ditto (Always returning FALSE)
(In reply to Reynaldo H. Verdejo Pinochet from comment #29) > Review of attachment 329278 [details] [review] [review]: [...] Thanks for the review. I've made most of the changes that you've suggested, except for the ones about the functions that always return false: > @@ +853,3 @@ > + ttml_style_set_print (element->style_set); > + > + return FALSE; > > You don't need the boolean return if you always return FALSE The reason for the boolean return is that all of these functions are passed to g_node_traverse(), which expects a function of type GNodeTraverseFunc to be passed, whose return type is gboolean. [...]
Created attachment 330539 [details] [review] Add support for TTML (EBU-TT-D) subtitles Updated patch that addresses the comments made in the above review and also contains fixes for a couple of issues: - Fixes issue where first word in line of live-generated "snake" subtitles shifts vertically a little when subsequent words appear. - Fixes issue where the rightmost character in italic text is clipped slightly.
Just a friendly prod to see if we can get this merged in time for 1.9.2. :) Let me know if there's anything I can do to help speed the process.
Ping...
Hi Chris, Compiled this plugin with gst-plugins-bad-1.8.1 and tried to run the following pipeline: #gst-launch-1.0 souphttpsrc location="http://dash.edgesuite.net/dash264/TestCases/4b/qualcomm/1/English_track.xml" ! ttmlparse ! ttmlrender Getting the following error: Setting pipeline to PAUSED ... Pipeline is PREROLLED ... Setting pipeline to PLAYING ... New clock: GstSystemClock any_doc_name:41: parser error : AttValue: " or ' expected <p begin="00:04:27:00" end= ^ any_doc_name:41: parser error : attributes construct error <p begin="00:04:27:00" end= ^ any_doc_name:41: parser error : Couldn't find end of Start Tag p line 41 <p begin="00:04:27:00" end= ^ any_doc_name:41: parser error : Premature end of data in tag div line 16 <p begin="00:04:27:00" end= ^ any_doc_name:41: parser error : Premature end of data in tag body line 15 <p begin="00:04:27:00" end= ^ any_doc_name:41: parser error : Premature end of data in tag tt line 2 <p begin="00:04:27:00" end= ^ any_doc_name:1: parser error : Start tag expected, '<' not found "00:04:28:28" tts:textAlign="center">Well, don't you ever get tired of this?</p> ^ any_doc_name:1: parser error : Start tag expected, '<' not found 00:07:36:27" tts:textAlign="center">Well can you tell me that?</p> This xml url is basically the ttml subtitle url taken from Dash mpd url "http://dash.edgesuite.net/dash264/TestCases/4b/qualcomm/1/ED_OnDemand_5SecSeg_Subtitles.mpd". Further debugging shows that souphttpsrc is reading this xml network file and sending the read data downstream in GstBuffer and ttmlparse element is unable to parse this buffer data because buffer contains partial xml file without closing tags. in this case, if complete file is read and send into gstbuffer then this will be valid xml file. is it expected to handle this use case in ttmlparse element? how to deal with this use case? Please let me know if you want more details.
(In reply to Sanjay from comment #34) Hi Sanjay, The reason this is happening is that currently ttmlparse expects that each buffer it receives will contain a complete XML file. This will be the case if the subtitles in a DASH stream are encapsulated in ISO BMFF (i.e., mp4) format[*], but it won't be the case if the files are simply fetched as unpackaged XML, which is the case with the DASH stream that you're using. However, even if these files were packaged in ISO BMFF, they wouldn't work in this case because they are not valid EBU-TT-D - they use a different profile of TTML, and so ttmlparse would reject them. There *is* a way to render subtitles from an unpackaged EBU-TT-D file over a video stream. If, for example, you wanted to render an EBU-TT-D file containing the subtitles for the whole of Elephants Dream (which you can get here: http://rdmedia.bbc.co.uk/dash/ondemand/elephants_dream/1/elephantsdream_25fps.ebuttd.xml) over an mp4 file containing h264 video, you can download the xml file to disk and run the following command: gst-launch-1.0 ttmlrender name=r filesrc location=<path to mp4 file> ! qtdemux ! h264parse ! avdec_h264 ! autovideoconvert ! r.video_sink filesrc blocksize=16777216 location=<path to subtitle XML file> ! ttmlparse ! r.text_sink r. ! ximagesink [Note that the blocksize argument passed to filesrc ensures that it doesn't split the input EBU-TT-D file into smaller chunks, but passes the whole thing in a single buffer.] If you're interested, we do have a DASH stream of Elephants Dream containing subtitles that are packaged in ISO BMFF; its hosted here: http://rdmedia.bbc.co.uk/dash/ondemand/elephants_dream/ To play it, you just need to run the following: gst-play-1.0 http://rdmedia.bbc.co.uk/dash/ondemand/elephants_dream/1/client_manifest-all.mpd Hope that helps, Chris [*] The spec for packaging of EBU-TT-D in ISO BMFF is here: https://tech.ebu.ch/files/live/sites/tech/files/shared/tech/tech3381.pdf
Comment on attachment 330539 [details] [review] Add support for TTML (EBU-TT-D) subtitles Let's get this in immediately after 1.10 release.
It fails with an assertion on https://github.com/gpac/gpac/blob/master/tests/media/ttml/ebu-ttd_sample.ttml . It expects a div but instead gets a p there. Also attaching rebased patch that doesn't fail to apply and doesn't fail the docs build.
Created attachment 336826 [details] [review] ttml: Add plugin that supports TTML subtitles Add a parser (ttmlparse) and renderer (ttmlrender) element that handle subtitles that use the EBU-TT-D profile of TTML1.
Created attachment 336829 [details] test.mp4 MP4 file generated with MP4Box with the above linked subtitle file. The problem here is that each sample only has a <body> that contains a <p>, not a <div>.
The parser/renderer handle the raw TTML file with no problems. The content of the mp4 file is not valid EBU-TT-D: in EBU-TT-D <p>s must be contained within a <div> - they can't be direct children of a <body> element. Unfortunately, MP4Box is not doing the right thing. It also does other horrible things, like creating a separate EBU-TT-D document for each timed element (i.e., each <p> in this case) in the input file, and placing each of these documents in its own sample within the mp4 file. That means that 90% of the content of the mp4 file is redundant: all of the contents of the <head> and most of the other contents of the input file is repeated in each of these documents.
According to ISO 14496-30 this seems to be the right thing to do: 6.6 Sample Format says that each sample should be a (whole) XML document and is as such a sync sample. It seems very wasteful but that's what that standard says at least, and this standard seems to cover all variants of TTML (and WebVTT), so should also apply to EBU-TT-D. For the other part, indeed it seems like it does it wrong and should put each of the <p> into a <div>. I'll report a bug to them.
Each sample should be a whole document, but that doesn't mean that each timed element in the input document should be split out into its own document/sample: You could, for instance, take that whole input document and put it in a single sample in the mp4 file. If you were splitting the input file into fixed duration (e.g., 5s) segments for DASH, you can put all the timed elements occuring in the period 0s-5s into a single EBU-TT-D file, and that file would go into a single sample in the corresponding mp4 segment (and so on for the other segments). In other words, the most efficient method is for each mp4 file in a DASH stream to contain a single sample, which holds a single EBU-TT-D file that contains all the elements that will be displayed in that period of the presentation.
I see, that makes sense and is less weird :)
Attachment 336826 [details] pushed as d82ae69 - ttml: Add plugin that supports TTML subtitles