Bug 685282 – dvdspu: figure out how to make it work with hardware decoders and subpicture overlays

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 685282 - dvdspu: figure out how to make it work with hardware decoders and subpicture overlays


Summary:	dvdspu: figure out how to make it work with hardware decoders and subpicture ...


Status:	RESOLVED OBSOLETE

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-bad
Version:	git master
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	git master
Assigned To:	Jan Schmidt
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:	663750 725900

Reported:	2012-10-02 13:36 UTC by Jan Schmidt
Modified:	2018-11-03 13:13 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
dvdspu: render to ARGB overlay (41.88 KB, patch) 2013-09-02 13:17 UTC, Arnaud Vrac	none	Details \| Review
matroskademux: send custom event with DVD subtitles frame size (1.74 KB, patch) 2013-09-02 13:20 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: handle frame size event from upstream (2.04 KB, patch) 2013-09-02 13:21 UTC, Arnaud Vrac	none	Details \| Review
matroskademux: send custom event with DVD subtitles frame size (1.93 KB, patch) 2015-04-03 14:08 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: pass dvdspu argument to set_caps functions (3.21 KB, patch) 2015-04-03 14:09 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: render to ARGB overlay (43.94 KB, patch) 2015-04-03 14:09 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: allow suffix in dvd event name to allow multiple sticky dvd events (1.00 KB, patch) 2015-04-03 14:11 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: handle frame size event from upstream (2.02 KB, patch) 2015-04-03 14:12 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: cache overlay composition (2.97 KB, patch) 2015-04-03 14:12 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: improve negotiation of overlay composition (15.01 KB, patch) 2015-04-03 14:12 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: pass dvdspu argument to set_caps functions (3.21 KB, patch) 2015-08-20 15:52 UTC, Arnaud Vrac	committed	Details \| Review
dvdspu: render to AYUV overlay (43.30 KB, patch) 2015-08-20 15:55 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: render to ARGB overlay instead of AYUV (6.10 KB, patch) 2015-08-20 15:56 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: allow suffix in dvd event name to allow multiple sticky dvd events (1.00 KB, patch) 2015-08-20 15:56 UTC, Arnaud Vrac	committed	Details \| Review
dvdspu: handle frame size event from upstream (1.99 KB, patch) 2015-08-20 15:57 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: cache overlay composition (3.27 KB, patch) 2015-08-20 15:57 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: improve negotiation of overlay composition (17.67 KB, patch) 2015-08-20 15:57 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: render to AYUV overlay (41.72 KB, patch) 2015-09-28 12:15 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: render to ARGB overlay instead of AYUV (6.15 KB, patch) 2015-09-28 12:15 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: negotiate overlay composition meta with downstream (17.62 KB, patch) 2015-09-28 12:16 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: cache overlay composition (3.34 KB, patch) 2015-09-28 12:17 UTC, Arnaud Vrac	none	Details \| Review
dvdspu: handle frame size event from upstream (1.99 KB, patch) 2015-09-28 12:18 UTC, Arnaud Vrac	none	Details \| Review

Description Jan Schmidt 2012-10-02 13:36:57 UTC

Getting the DVD SPU to paint generically is a requirement for allowing the DVD elements to plug / output hardware decoder caps.

Here's a conversation we had about it on IRC:

Sep 26 01:37:04 *       thaytan wonders how SPU works
Sep 26 01:37:19 <bilboed>       thaytan, with vdpau ?
Sep 26 01:37:24 <thaytan>       *nod*
Sep 26 01:37:35 <thaytan>       how it should work, that is
Sep 26 01:37:36 <bilboed>       they have also VdpBitmapSurface
Sep 26 01:38:01 <bilboed>       so VdpVideoSurface => YUV stuff, VdpBitmapSurface => RGB stuff
Sep 26 01:38:11 <bilboed>       VdpOutputSurface => output of the compositor for display
Sep 26 01:38:40 <bilboed>       you can create VdpBitmapSurface and map(write) stuff on it
Sep 26 01:39:01 <bilboed>       then you give it to the compositor with coordinates (and whatever else) and that's basically it
Sep 26 01:39:09 <thaytan>       hmmm
Sep 26 01:39:26 <bilboed>       so as a result ... I'm also gonna have to figure out how to solve hw-compositing
Sep 26 01:39:28 <thaytan>       not sure I see how that fits into a GStreamer/rsndvdbin context
Sep 26 01:39:31 <bilboed>       in a generic fashion that is
Sep 26 01:39:55 <bilboed>       thaytan,  was thinking you could slap GstOverlayMeta (or sth like that) with attached GstBuffers
Sep 26 01:40:06 <bilboed>       thaytan, maybe videomixer or some generic element could do that
Sep 26 01:40:09 <thaytan>       I guess it's a vdpspu element with video and subpicture inputs as currently with dvdspu
Sep 26 01:40:25 <thaytan>       except the video pad accepts vdp output surface caps
Sep 26 01:40:38 <bilboed>       I'd prefer to avoid creating yet-another-custom-element
Sep 26 01:40:45 <thaytan>       but, I don't know what it outputs
Sep 26 01:41:12 <thaytan>       bilboed: I don't know how to build it generically
Sep 26 01:41:48 <thaytan>       the dvdspu element uses the video stream passing through to a) paint onto b) uses the timestamps to crank the SPU state machine
Sep 26 01:42:44 <bilboed>       what do you need more apart from "put this RGB bitmap at these coordinates for this timestamp and this duration"
Sep 26 01:43:48 <thaytan>       it needs the timestamps and segment info on the video stream so it knows which pixels to generate
Sep 26 01:44:04 <thaytan>       it's more "here's a video frame, what's the overlay?"
Sep 26 01:44:13 <thaytan>       also, dvdspu works in YUV too
Sep 26 01:44:13 <bilboed>       sure, but it doesn't care about the *content* of that frame
Sep 26 01:44:28 <thaytan>       bilboed: not if it's not doing the compositing, no
Sep 26 01:44:34 <bilboed>       right
Sep 26 01:45:11 <bilboed>       so it could see the stream go through, watch/collect segment/timestamps and decide what subpicture to attach to it (without *actually* doing any processing and letting downstream handle that)
Sep 26 01:45:13 <thaytan>       but the model is still to pass the video buffer stream through the spu element so it can see the timing info it needs, right?
Sep 26 01:45:22 <thaytan>       oh, of course
Sep 26 01:45:28 <thaytan>       that's what I was suggesting, I guess I wasn't clear
Sep 26 01:45:35 <__tim> thaytan, dvdspu should support GstVideoOverlayComposition imho
Sep 26 01:45:38 <bilboed>       I'm not *that* familiar with SPU
Sep 26 01:45:42 <bilboed>       also, what __tim said :)
Sep 26 01:45:54 <bilboed>       like that I don't have to solve it in 500 different elements
Sep 26 01:45:56 <thaytan>       ok, I guess I'll have to look at the GstVideoOverlayComposition API
Sep 26 01:46:54 *       bilboed is not looking forward "at all" to fixing this for cluster-gst
Sep 26 01:47:12 <__tim> it's very dumb, you can provide one or more ARGB or AYUV rectangles and either use helper API to put them onto the raw video, or attach them to the buffer; the sink or whatever can then take over the overlaying using that
Sep 26 01:47:40 <__tim> and it will do a bunch of conversions for you and cache them if the sink or whatever does the overlaying doesn't support what you supplied
Sep 26 01:48:54 <thaytan>       well, that sounds feasible - although less efficient than dvdspu painting natively if the composite is in software
Sep 26 01:49:28 <thaytan>       maybe it can be extended to add RLE AYUV rectangles as a format though?
Sep 26 01:49:44 <__tim> thaytan, how sure are you of that? because basically you have to parse the RLE data for every single frame, right? is that really so much faster than blitting some ready-made rectangle using orc?
Sep 26 01:50:04 <thaytan>       dvdspu gets to skip a lot of transparent pixels
Sep 26 01:52:21 <__tim> yeah, but it's if else and loops etc. You might be right, I'm just curious how much difference it actually makes. Also, you don't have to use the API to blit your pixels, you can still do that as you do now and only attach the AYUV rectangle if downstream supports that
Sep 26 01:54:13 <__tim> it's just convenient because you only have one code path
Sep 26 01:55:01 <thaytan>       it sounds like a structural improvement

Comment 1 Arnaud Vrac 2013-09-02 13:17:10 UTC

I've ported dvdspu to use the GstVideoOverlayComposition API, and I'd like some feedback. The patches work well on my side for either PGS or VobSub formats, using a hardware renderer for composition or by blending the overlay directly in the video frames.

Note that I'm using an ARGB overlay. I agree that usually video is YUV so it would be faster to use an AYUV overlay for blending, however in my case I can only use ARGB for hardware composition, and there is currently no way to negotiate the supported overlay formats with the downstream elements.

Right now a single overlay rectangle is allocated, and the SPU objects are rendered inside this rectangle. The size of the rectangle is made minimal on purpose: instead of allocating the full overlay size, only the surface of the cropped objects is allocated to minimize memory usage. So coordinates of the rendered objects are translated to match the rectangle position in the overlay.

The patch also changes how coordinates of the SPU are calculated against the video frame. Usually in BD or DVD rips, black borders are removed and video might be scaled with the same aspect ratio. However, the SPU data is not corrected to account for the black borders removal or the scaling. In this case the SPU data was not rendered correctly (see Bug #663750). This patch tries to fix that by calculating the scaling/translation necessary to render the overlay as intended in the original content. The scaling can be done easily using the overlay composition API. When playing a DVD these heuristics are a noop since the video frame size matches the overlay size.

Interlacing should be handled transparently by the overlay composition API so Bug #667224 should be fixed too.

The performance level and memory usage should not be too far from the previous implementation, since before there were intermediate buffers for A/U/V blending.

Performance could be improved a lot by caching the overlay composition, I'm not sure how to do that yet.

Comment 2 Arnaud Vrac 2013-09-02 13:17:34 UTC

Created attachment 253844 [details] [review]
dvdspu: render to ARGB overlay

Comment 3 Arnaud Vrac 2013-09-02 13:20:43 UTC

Created attachment 253845 [details] [review]
matroskademux: send custom event with DVD subtitles frame size

This patch helps the downstream dvdspu element to render the VobSub overlay at the right position and size, in case the video was scaled or cropped to remove black borders. The VobSub SPU data coordinates are calculated against the size set in the IDX file, which is parsed here.

Comment 4 Arnaud Vrac 2013-09-02 13:21:50 UTC

Created attachment 253846 [details] [review]
dvdspu: handle frame size event from upstream

Use the frame size from the IDX file to make VobSub overlay positioning and scale correct.

Comment 5 Arnaud Vrac 2013-09-02 16:02:24 UTC

I've fixed some issues related to caps negotiation, vobsub artifacts and added overlay cache, please use the patches here instead:

https://github.com/rawoul/gst-plugins-bad/commits/dvdspu

Comment 6 Jan Schmidt 2013-09-02 21:43:11 UTC

"The performance level and memory usage should not be too far from the previous
implementation, since before there were intermediate buffers for A/U/V
blending."

The performance level is something you could actually measure. My recollection is that it made a big difference - it was one of the reasons I wrote it to render incrementally onto the frame instead of into a large caching buffer and blending the whole thing. I don't remember how big a difference - but it was something like the overlay consuming as much CPU as the MPEG decode. The intermediate AYUV buffers in the existing code don't trigger the same level of CPU consumption because a) it only does work for lines with actual pixels b) it only blends the minimum left-right span that contains coloured pixels c) the pixel buffers are small enough to usually still be in cache when they get blended

Also, there was no floating point ops in the previous code.

Comment 7 Jan Schmidt 2013-09-02 22:17:33 UTC

Sorry - that may have sounded overly negative. Thanks for working on this - it's a cool set of changes :)

Comment 8 Arnaud Vrac 2013-09-02 22:20:40 UTC

I agree that your blending code is probably faster when blending directly on an YUV frame and cropping out the transparent areas. The floating point operations can be easily removed by using an AYUV overlay instead of ARGB, which would make blending faster anyway. As I said, I'm using premultiplied ARGB because in my use case I can render the overlay in hardware with this format.

Since the overlay is now cached, only the blending operation is done for each video frame after the overlay is rendered once, I don't think it should be that expensive. I'll try to run some benchmarks and report them.

Your code was probably the best way to do it, unfortunately it only works for YUV planar video formats... And most importantly it doesn't allow hardware rendering.

I'm open to any suggestions to make it faster :)

Comment 9 Arnaud Vrac 2013-09-02 22:21:34 UTC

(In reply to comment #7)
> Sorry - that may have sounded overly negative. Thanks for working on this -
> it's a cool set of changes :)

Thanks for looking into this :)

Comment 10 Sebastian Dröge (slomo) 2013-09-27 12:44:48 UTC

What's the status of this? Sounds very interesting :)

Comment 11 Matej Knopp 2013-11-26 01:03:36 UTC

Just an idea, does anyone see any potential issues in adding multiple GstVideoOverlayRectangles to the composition instead of just one big rectangle? Basically during the RLE process we'd remember where the actual non transparent pixels were, group this into rectangles and then add rectangle for each area? The buffers could share the underlying framebuffer to prevent copying (or maybe not since most of the are would be transparent and unused).

The benefit would be significantly reduced blending (since most transparent pixels wouldn't need to be blended), also downstream could adjust rectangle positions if the dimensions don't match (which is sometimes the case with ripped DVDs).

Comment 12 Holger Kaelberer 2013-11-26 08:16:07 UTC

(In reply to comment #11)
> Just an idea, does anyone see any potential issues in adding multiple
> GstVideoOverlayRectangles to the composition instead of just one big rectangle?

We do this for a while for custom OSD elements in s/w and h/w (vaapi/libva) based pipelines with ARGB overlay data and it works like charm. In h/w accelerated pipelines you don't have a benefit in terms of blending performance because vaapi and vdpau can do this completely in hardware. Though, in the case of vaapi there is a limit of max. 8 (vdpau-driver/libva) or 4 (current intel-driver) subpictures (i.e. GstVideoOverlayRectangles), iirc.

Comment 13 Arnaud Vrac 2013-11-26 08:28:52 UTC

In my case it shouldn't impact performance because I do the blending in the screen framebuffer in hardware too.

Your idea of repositioning the rectangles would have to be done in the dvdspu element though, since the blending with the video frame is done just after creating the rectangles when no downstream element can handle the overlay metadata.

Comment 14 Matej Knopp 2013-11-26 12:25:35 UTC

I don't need to assume that my usecase would be generic enough. I can always have extra element downstream that does the repositioning and composition.

Comment 15 Matthieu Bouron 2014-03-05 20:38:39 UTC

First of all, thanks for this work.
I tested it a bit with one of the sample I have and it works well with cluttersink (+ overlay composition patch).

By the way, what's the current state of the work ?

Comment 16 Edward Hervey 2014-04-01 13:58:03 UTC

This is pretty neat indeed and the improvements are worthwhile.

What exactly is blocking this ?

Comment 17 Arnaud Vrac 2014-04-01 17:45:16 UTC

Nothing much, I think the AYUV-> ARGB conversion code should be removed for this to be merged upstream, the overlay composition api can handle it. The performance loss due when using the blending path have not been measured compared to the current optimized path, which only works on YUV videos.

There's also a small issue with the frame-size and clut events sent from matroskademux. I need to make them sticky, but the event name has to be unique so that both events can be kept on the pad. So we have to change:

application/x-gst-dvd, event = "dvd-set-frame-size"
application/x-gst-dvd, event = "dvd-spu-clut-change"

to something like

application/x-gst-dvd-spu-clut-change
application/x-gst-dvd-set-frame-size

Otherwise the frame size or the CLUT will be lost on track change.

Unfortunately I don't have time to work on this ATM, maybe Matthieu has made some progress on this ?

Comment 18 Arnaud Vrac 2014-08-27 17:21:56 UTC

I've rebased my work on master and included proper caps negotation for the overlay composition meta, like was done in other plugins using overlay by Matthieu Bouron and Nicolas Dufresne.

The patches can be found here:
https://github.com/rawoul/gst-plugins-bad/commits/dvdspu
https://github.com/rawoul/gst-plugins-good/commits/dvdspu

The first iteration of the render patch works with ARGB format, and I've reverted to AYUV in the next patch. I'm keeping both patch in case we want to do some performance tests for the overlay in either ARGB or AYUV formats.

Comment 19 Nicolas Dufresne (ndufresne) 2015-04-02 22:37:32 UTC

Would it be possible to rebase and attach the patch so we can finally review and merge ?

Comment 20 Arnaud Vrac 2015-04-03 14:08:23 UTC

Created attachment 300885 [details] [review]
matroskademux: send custom event with DVD subtitles frame size

Comment 21 Arnaud Vrac 2015-04-03 14:09:25 UTC

Created attachment 300886 [details] [review]
dvdspu: pass dvdspu argument to set_caps functions

Comment 22 Arnaud Vrac 2015-04-03 14:09:53 UTC

Created attachment 300887 [details] [review]
dvdspu: render to ARGB overlay

Comment 23 Arnaud Vrac 2015-04-03 14:11:38 UTC

Created attachment 300888 [details] [review]
dvdspu: allow suffix in dvd event name to allow multiple sticky dvd events

Comment 24 Arnaud Vrac 2015-04-03 14:12:00 UTC

Created attachment 300889 [details] [review]
dvdspu: handle frame size event from upstream

Comment 25 Arnaud Vrac 2015-04-03 14:12:21 UTC

Created attachment 300890 [details] [review]
dvdspu: cache overlay composition

Comment 26 Arnaud Vrac 2015-04-03 14:12:41 UTC

Created attachment 300891 [details] [review]
dvdspu: improve negotiation of overlay composition

Comment 27 Arnaud Vrac 2015-04-03 14:17:53 UTC

I've rebased the github links against master and posted the patches.

Ideally someone with embedded hardware with no overlay could check performance is OK when using those patches. Unfortunately the generic overlay blending code is not very much optimised.

Comment 28 Arnaud Vrac 2015-08-20 15:52:46 UTC

Created attachment 309736 [details] [review]
dvdspu: pass dvdspu argument to set_caps functions

Comment 29 Arnaud Vrac 2015-08-20 15:55:39 UTC

Created attachment 309737 [details] [review]
dvdspu: render to AYUV overlay

Comment 30 Arnaud Vrac 2015-08-20 15:56:12 UTC

Created attachment 309738 [details] [review]
dvdspu: render to ARGB overlay instead of AYUV

Comment 31 Arnaud Vrac 2015-08-20 15:56:44 UTC

Created attachment 309739 [details] [review]
dvdspu: allow suffix in dvd event name to allow multiple sticky dvd events

Comment 32 Arnaud Vrac 2015-08-20 15:57:07 UTC

Created attachment 309740 [details] [review]
dvdspu: handle frame size event from upstream

Comment 33 Arnaud Vrac 2015-08-20 15:57:30 UTC

Created attachment 309741 [details] [review]
dvdspu: cache overlay composition

Comment 34 Arnaud Vrac 2015-08-20 15:57:52 UTC

Created attachment 309742 [details] [review]
dvdspu: improve negotiation of overlay composition

Comment 35 Arnaud Vrac 2015-08-20 16:01:53 UTC

I've updated the patches, and fixed a lot of small issues. I've tested with a few DVDs and also matroska files, which render fine. I've also improved the overlay composition negotiation again, based on the recent textoverlay work.

The vobsub matroska embedded subtitles are now rendered much better than before, the scale / position now matches vlc and mpv for the same files.

You can also find the patches here: https://github.com/rawoul/gst-plugins-bad/commits/dvdspu

I hope this can be reviewed now !

Comment 36 Jan Schmidt 2015-09-26 14:08:15 UTC

I think somewhere down the road, it would be nice to have more generic SPU interface than sending all these events with application/x-gst-dvd prefix, but I'm not sure what that API should be.

So far these patches look good.

I'm still a little worried about the performance impact blending an entire display-area sized region. My memory is that it was a non-trivial impact in 2004-2005 when I wrote the original code. It's something we could optimise later though - perhaps keep an auxilliary array of "height" entries that stores the X position of the first non-transparent pixel for each line so it can skip blending.

Instead of only scaling down to fit, is it worth considering using the set-frame-size event to allocate a larger downstream frame as the target for blending into? I think in any case a mechanism for communicating the intended display size to the sinks would be useful when using GstVideoOverlayComposition. I have a rip here where the video was originally 1080p, and they've trimmed the black-bars to enode 1920x800. When the black bars are added back in for display, the PGS titles should appear there - starting below Y=870 or so.

Between dvbsuboverlay, dvdspu and textoverlay, we probably have enough experience to start factoring out a set of common overlay/SPU behaviours into a base class and have features like negotiation, colour conversion and software blending fallbacks in there.

Besides all that, there is something wrong with these patches, but I don't know what yet. On my Back To The Future DVD, they make it not display certain subpicture entries in the bonus 'Animated Anecdotes' track, but only some.

Comment 37 Arnaud Vrac 2015-09-26 14:20:54 UTC

Hi Jan,

Thanks for taking a look. The patches on the github branch are more recent, mostly fixing issues with PGS rendering introduced by my previous patches. Now everything works fine, and I also handle the cropping window properly. Since 1.6 has been released, I'm now trying to make the individual patches easier to review (separating overlay meta negotiation and rendering in separate patches). I'll try to finish this work tonight.


(In reply to Jan Schmidt from comment #36)
> I think somewhere down the road, it would be nice to have more generic SPU
> interface than sending all these events with application/x-gst-dvd prefix,
> but I'm not sure what that API should be.
> 
> So far these patches look good.
> 
> I'm still a little worried about the performance impact blending an entire
> display-area sized region. My memory is that it was a non-trivial impact in
> 2004-2005 when I wrote the original code. It's something we could optimise
> later though - perhaps keep an auxilliary array of "height" entries that
> stores the X position of the first non-transparent pixel for each line so it
> can skip blending.

I agree, and the blending performance could be better if the overlay format could be dynamic instead of hardcoding ARGB or AYUV. We would use AYUV when blending to avoid conversions, and ARGB in other cases since we have no way yet to negotiate the overlay format with downstream, and most hardware works in ARGB.

The rendering code does skip transparent pixels so I hope the impact is not too big. Clearing the full frame should be pretty fast.

> 
> Instead of only scaling down to fit, is it worth considering using the
> set-frame-size event to allocate a larger downstream frame as the target for
> blending into? I think in any case a mechanism for communicating the
> intended display size to the sinks would be useful when using
> GstVideoOverlayComposition.  I have a rip here where the video was
> originally 1080p, and they've trimmed the black-bars to enode 1920x800. When
> the black bars are added back in for display, the PGS titles should appear
> there - starting below Y=870 or so.

I'm not sure what to do here. Right now I have the exact same result as VLC for vobsub and PGS. mpv renders PGS slightly differently, it doesn't scale them.

> 
> Between dvbsuboverlay, dvdspu and textoverlay, we probably have enough
> experience to start factoring out a set of common overlay/SPU behaviours
> into a base class and have features like negotiation, colour conversion and
> software blending fallbacks in there.

Yes, the overlay meta negotiation is not easy to get right and is a big copy paste in all the subtitle renderers.

> 
> Besides all that, there is something wrong with these patches, but I don't
> know what yet. On my Back To The Future DVD, they make it not display
> certain subpicture entries in the bonus 'Animated Anecdotes' track, but only
> some.

The issue you face might be related to the overlay cache not being cleared in some cases, which I might have overlooked. Can you check if reverting the 'dvdspu: cache overlay composition' helps ?

Comment 38 Jan Schmidt 2015-09-26 15:01:35 UTC

Confirming what I said on IRC: Reverting the cache composition makes the animated subs play again.

It looks like the composition caching only allows for subs to change when a new event or packet arrives, but actually VOB subs are a sequence of commands to be executed in turn at particular timestamps - we need a way for the vobsub implementation to flag that advancing the SPU did or did not change the output and use it to invalidate the cache.

Comment 39 Jan Schmidt 2015-09-27 03:38:49 UTC

(In reply to Arnaud Vrac from comment #37) 
> > 
> > Instead of only scaling down to fit, is it worth considering using the
> > set-frame-size event to allocate a larger downstream frame as the target for
> > blending into? I think in any case a mechanism for communicating the
> > intended display size to the sinks would be useful when using
> > GstVideoOverlayComposition.  I have a rip here where the video was
> > originally 1080p, and they've trimmed the black-bars to enode 1920x800. When
> > the black bars are added back in for display, the PGS titles should appear
> > there - starting below Y=870 or so.
> 
> I'm not sure what to do here. Right now I have the exact same result as VLC
> for vobsub and PGS. mpv renders PGS slightly differently, it doesn't scale
> them.
> 

To take an example sub from the test sample I'm talking about:

Win ID 0 x 404 y 872 w 1111 h 142

So that means on the original disc, this was targeted to display from y=872 to 1014.

With 140 pixel black bars ((1080 - 800) / 2) added to the 1920x800 video, then the video displays from 140 -> 940 on the screen. The sub should cover y = 872 to 1014, so should appear at the bottom of the video, with about 74 pixels sticking off the bottom.

On a real disc, the video would actually be encoded at 1920x1080, so the placement would be a non-issue.

btw, can you please update the remaining patches on the bug?

Comment 40 Arnaud Vrac 2015-09-28 12:15:09 UTC

Created attachment 312279 [details] [review]
dvdspu: render to AYUV overlay

Updated the gstspu_fit_overlay_rectangle function to allow keeping the SPU window aspect ratio as much as possible when required. This is now enabled for PGS subtitles. Since black borders are removed on BD-rips, and subtitles are usually placed on top of the video and not in the black borders, the result is the same as on the original video.

Comment 41 Arnaud Vrac 2015-09-28 12:15:30 UTC

Created attachment 312280 [details] [review]
dvdspu: render to ARGB overlay instead of AYUV

Comment 42 Arnaud Vrac 2015-09-28 12:16:19 UTC

Created attachment 312281 [details] [review]
dvdspu: negotiate overlay composition meta with downstream

Comment 43 Arnaud Vrac 2015-09-28 12:17:26 UTC

Created attachment 312282 [details] [review]
dvdspu: cache overlay composition

I haven't fixed the too aggressive caching yet, so this patch should not be pushed. It's useful for benchmarks.

Comment 44 Arnaud Vrac 2015-09-28 12:18:52 UTC

Created attachment 312283 [details] [review]
dvdspu: handle frame size event from upstream

Comment 45 Arnaud Vrac 2015-09-28 12:30:11 UTC

I've updated the patches, they are rebased on master and are the same than on my github dvdspu branch.

Jan, the PGS subtitles should look better now, can you confirm ?

The cache issue is still here, I haven't had time to fix it yet. I have some numbers I got playing a DVD Rip as fast as possible, unfortunately they are not good when blending:

$ time gst-launch-1.0 file://$PWD/12_MONKEYS.mkv ! matroskademux name=dmx dmx.video_0 ! queue ! avdec_mpeg2video ! d.video dmx.subtitle_0 ! queue ! d.subpicture dvdspu name=d ! fakesink

blended AYUV overlay, with cache: 873.82s user 17.35s system 146% cpu 10:07.05 total
blended AYUV overlay, no cache: 893.13s user 57.03s system 142% cpu 11:05.09 total
blended ARGB overlay, with cache: 1191.11s user 17.00s system 130% cpu 15:23.82 total
blended ARGB overlay, no cache: 1210.60s user 53.36s system 129% cpu 16:19.72 total
no overlay (direct blending): 445.42s user 19.90s system 246% cpu 3:09.12 total

So the new code is much slower when using the generic blending code. Most of the time is spent in gst_video_blend, and also converting ARGB to YUV before blending when the overlay is rendered to ARGB. I don't think we can ever match the original performance, since it is highly optimised for the blending YUV to YUV use case. I see multiple things that could be improved:

- keep the old code path for blending. This could make the code pretty complicated though.
- crop the overlay rectangles to only contain visible pixels. This could be done in two ways:
* with GstCropMeta, unfortunately the overlay code does not account for this meta.
* by tweaking the overlay GstVideoMeta width, height and plane offset. We must make sure the alignment is correct though.

Caching doesn't help much when blending, since rendering the overlay is fast compared to the blending operation.

Comment 46 Jan Schmidt 2018-05-04 09:20:11 UTC

One option to improve the overall performance benefit here would be to support an overlay format that matches the DVD and PGS input format more closely - passing overlay regions in a run-length encoded format instead of as arrays of raw pixels that (for DVD/Bluray at least) are frequently mostly transparent pixels

Comment 47 GStreamer system administrator 2018-11-03 13:13:08 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/78.