Bug 712134 - matroskamux: Text stream generated by appsrc into kateenc+matroskamux loses subtitle encoding
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: 1.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: 1.2.3
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2013-11-12 06:10 UTC by Dylan Broome
Modified: 2014-03-25 20:27 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Pygi script to generate bad subtitle mimetype using appsrc for text input stream (1.28 KB, text/x-python)
2013-11-12 06:10 UTC, Dylan Broome
Simple subtitle testfile for subparse (225 bytes, text/plain)
2014-01-02 04:57 UTC, Dylan Broome

Description Dylan Broome 2013-11-12 06:10:20 UTC
Created attachment 259634
Pygi script to generate bad subtitle mimetype using appsrc for text input stream

Feeding filesrc ! subparse ! kateenc ! matroskamux works: the subtitles are encoded as subtitle/x-kate and can be played back in VLC. If the subtitles are generated on the fly via appsrc, the MIME type for the subtitle stream appears to end up as application/x-subtitle-unknown. Using this pipeline:

matroskamux name=mux ! filesink location=test.mkv name=sink videotestsrc ! queue name=videoqueue ! avenc_mpeg4 ! mux. appsrc name=appsrc caps=text/x-raw,format=utf8 stream-type=0 format=3 is-live=true ! queue name=appqueue ! kateenc category=subtitles name=kateenc ! mux.

Playing this back in VLC, or inspecting the file with mkvinfo, shows the subtitle stream as undef. I have attached a sample Python script that reproduces the problem (as it uses appsrc).
Comment 1 Dylan Broome 2013-11-18 06:29:26 UTC
This looks like an issue with kateenc and streaming.

Using the following works correctly : 

gst-launch-1.0 videotestsrc num-buffers=1000 ! x264enc ! h264parse ! matroskamux name=mux ! filesink location=test.mkv   filesrc location=/some/test/subtitle/file ! subparse ! kateenc category=SUB ! mux.

However, leave out num-buffers=1000 and hit ctrl-c to end the pipeline, and you get the bug described above:

gst-launch-1.0 videotestsrc ! x264enc ! h264parse ! matroskamux name=mux ! filesink location=test.mkv   filesrc location=/some/test/subtitle/file ! subparse ! kateenc category=SUB ! mux.

With a non-streaming pipeline (like the working example above), GST_DEBUG=kateenc:6 shows the following debug output:

0:00:00.696801565 23268  0x8c5b0c0 LOG                  kateenc gstkateenc.c:1349:gst_kate_enc_sink_event:<kateenc0> ensuring all headers are in

This does not occur when streaming and ending the pipeline with ctrl-c.

Note, this also fails if one sets stream=1 on the matroskamux
Comment 2 Tim-Philipp Müller 2013-11-18 14:24:03 UTC
Your python test program just crashes for me, when emitting the push-buffer signal on appsrc, not sure what's going on there.

> However, leave the num-buffers=1000 out and hit ctrl-c to end the pipe and you
> get this above undesired bug
> (snip) 
> When working with a non stream (like the working example above) I see with
> GST_DEBUG=kateenc:6 the following debug :

You can't just hit control-C and expect things to work properly.

Pass -e to gst-launch-1.0, then it will push an eos event through the pipeline and make sure all the headers are written properly. In code you can do that by doing gst_element_send_event (pipeline, gst_event_new_eos()); and then waiting for an EOS message on the pipeline's bus.
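
For illustration, the in-code variant of this advice might look like the following Python (PyGObject) sketch. It assumes PyGObject and GStreamer 1.x are installed and that `pipeline` is an already-running pipeline created elsewhere; the function name `stop_with_eos` and its `timeout_ns` parameter are made up for this example.

```python
# Sketch only: assumes PyGObject and GStreamer 1.x are installed and that
# `pipeline` is an already-running Gst.Pipeline created elsewhere.
try:
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst
except (ImportError, ValueError):
    Gst = None  # bindings unavailable; stop_with_eos() will refuse to run


def stop_with_eos(pipeline, timeout_ns=None):
    """Push EOS through the pipeline and wait for it before tearing down.

    Returns True if an EOS message was seen on the bus, False on error
    or timeout. The function name and timeout parameter are hypothetical.
    """
    if Gst is None:
        raise RuntimeError("PyGObject with GStreamer 1.x is required")
    if timeout_ns is None:
        timeout_ns = Gst.CLOCK_TIME_NONE  # block until EOS or ERROR

    # Python equivalent of
    # gst_element_send_event (pipeline, gst_event_new_eos ());
    pipeline.send_event(Gst.Event.new_eos())

    # Wait for the EOS (or an error) to reach the pipeline's bus.
    bus = pipeline.get_bus()
    msg = bus.timed_pop_filtered(
        timeout_ns, Gst.MessageType.EOS | Gst.MessageType.ERROR
    )

    pipeline.set_state(Gst.State.NULL)
    return msg is not None and msg.type == Gst.MessageType.EOS
```

Calling something like this from a SIGINT or SIGTERM handler gives elements such as kateenc and matroskamux the chance to finish their output. It does not help when the process is killed outright or power is lost.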
Comment 3 Tim-Philipp Müller 2013-11-28 13:05:34 UTC
Dylan, does that fix your issues?
Comment 4 Dylan Broome 2013-12-10 07:21:15 UTC
Unfortunately, Tim, I have not had any luck with these solutions. The biggest problem is that our system needs to record an RTSP stream indefinitely until power fails. There is no real fixed 'end time' for the stream, nor a way for me to ensure it shuts down gracefully when running 'in the wild'. This was working in 0.10. Is there a way for me to send the encoding headers at the start of the stream? This still seems to be a bug, as the kate plugin fails to send the headers during a stream (which by definition has no defined end). The video channel works even with an unexpected shutdown, but the kate/subtitle channel fails.
Comment 5 Tim-Philipp Müller 2013-12-10 10:02:13 UTC
Ok, I don't see any reason why it should behave any different than 0.10 here, but your python test case crashes for me. Could you make a test case in C perhaps?
Comment 6 Dylan Broome 2014-01-02 04:57:11 UTC
Hi Tim-Philipp,

Sorry for the delay in this reply; season's greetings and happy new year.
The behaviour difference between 0.10 and 1.0 is easily demonstrated with the following gst-launch lines. The only difference is ffenc versus avenc; otherwise the pipelines are the same. If they are run with a limited number of frames on the videotestsrc, both work; in the unlimited case below, however, the 1.0 subtitles do not (the subtitle type is unknown under 1.0).

Using 1.0 (not working) :
gst-launch-1.0 videotestsrc ! avenc_mpeg4 ! matroskamux name=mux ! filesink location=test_1.0_nolimit.mkv filesrc location=subtest.txt ! subparse ! kateenc category=SUB ! mux.

Using 0.10 (working) :
gst-launch-0.10 videotestsrc ! ffenc_mpeg4 ! matroskamux name=mux ! filesink location=test_0.10_nolimit.mkv filesrc location=subtest.txt ! subparse ! kateenc category=SUB ! mux.

Please note this is without the -e option; as mentioned before, our pipeline runs until system powerdown from the external supply.
Comment 7 Dylan Broome 2014-01-02 04:57:52 UTC
Created attachment 265121
Simple subtitle testfile for subparse
Comment 8 Vincent Penquerc'h 2014-01-09 18:33:24 UTC
Fixed by writing codec ID and private data at start, when they are known, as is done for audio and video. Not sure just why subtitles were treated differently (can these become known *later* ?), but the code is still there if anything else depends on it.

Tested with the gst-launch line given above.

commit 1c6ee3fba47fb8efb8c312239cc9cb7673371351
Author: Vincent Penquerc'h <vincent.penquerch@collabora.co.uk>
Date:   Thu Jan 9 18:25:04 2014 +0000

    matroskamux: write subtitle codec ID and data at start when known
    
    This avoids issues with writing dummy data first, then having
    to come back and write correct data later. Doing so prevents
    the muxed stream from being actually streamable.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=712134
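
The difference between the two strategies can be shown with a toy model. This is not matroskamux code: the byte layout, the `PLACEHOLDER` value and both function names are invented for illustration, and `S_KATE` stands in for the codec ID of a Kate subtitle track.

```python
import io

# Hypothetical fixed-width slot standing in for the track's codec ID field.
PLACEHOLDER = b"UNKNOWN "


def mux_rewrite_on_eos(codec_id, got_eos):
    """Old strategy: write dummy data first, fix it up when EOS arrives."""
    out = io.BytesIO()
    out.write(PLACEHOLDER)        # dummy codec ID in the header
    out.write(b"<frames>")        # stream data follows
    if got_eos:
        out.seek(0)               # needs a seekable sink, so not streamable
        out.write(codec_id.ljust(len(PLACEHOLDER)))
    return out.getvalue()


def mux_write_at_start(codec_id, got_eos):
    """New strategy: the codec ID is known up front, so write it once."""
    out = io.BytesIO()
    out.write(codec_id.ljust(len(PLACEHOLDER)))
    out.write(b"<frames>")
    return out.getvalue()         # got_eos makes no difference here


# Abrupt stop (ctrl-c without -e, or power loss): no EOS ever arrives.
print(mux_rewrite_on_eos(b"S_KATE", got_eos=False)[:8])
print(mux_write_at_start(b"S_KATE", got_eos=False)[:8])
```

In this model only the second variant keeps the right codec ID after an abrupt stop, and it never has to seek back, which matches the streamability point in the commit message.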
Comment 9 Tim-Philipp Müller 2014-01-09 18:42:44 UTC
> Not sure just why subtitles were treated differently
> (can these become known *later* ?), but the code is still
> there if anything else depends on it.

Probably a left-over from 0.10 days where you could only send caps with a buffer.
Comment 10 Vincent Penquerc'h 2014-01-10 08:58:47 UTC
I added that one too, then.

commit 4a0554331e4b7f115a94c3299f1433f09c5d7755
Author: Vincent Penquerc'h <vincent.penquerch@collabora.co.uk>
Date:   Fri Jan 10 08:52:16 2014 +0000

    matroskamux: remove obsolete write-dummy-and-overwrite-on-eos code
    
    The need for rewriting apparently is obsolete 0.10 leftover.
    We now have caps for subtitles when we create the headers,
    so we always write the correct data in the first place.

Pushed to 1.2 too.
Comment 11 Mark Nauwelaerts 2014-01-19 15:36:06 UTC
Hang on a bit here.

Generally, I would hope for something stronger than 'apparently' in a commit message ...

In this particular case.  Yes, subtitles are treated specially since required codec-private data can come later, see the dvd-clut stuff which is built from a custom event (not caps).  Unless someone can explain or indicate why/how it is guaranteed in 1.x that this event will arrive before the header will be constructed, that rewrite code is not obsolete and quite needed in that case.
Comment 12 Tim-Philipp Müller 2014-01-20 00:20:26 UTC
The clut event should be sent before any buffers, and the header is only written once all pads have a buffer (or are otherwise prerolled).
Comment 13 Mark Nauwelaerts 2014-01-20 19:01:38 UTC
Which buffers do you mean?  The clut-event is sent before any subtitle buffers, but is it sent (and does it also arrive) before any other (a/v) buffers ?
Also note that the header can already be written without any subtitle buffers having arrived.
Comment 14 Mark Nauwelaerts 2014-02-02 11:11:02 UTC
So, I had another look and it seems to work ok.  Not sure what you meant with "otherwise pre-rolled" but the reason it works is that collectpads keeps the subtitle pad in waiting-state (and as such will not trigger header writing) until it receives a subtitle buffer or a subtitle segment (telling it there will not be a buffer for some time).

As such, it will correctly receive the needed info (i.e. clut-event) as long as that event is received before a buffer or a (future) segment event [*].  In particular, the confusing part here is that (afaics) this is not related to any 1.x changes and should also have worked (without header rewriting) in 0.10 provided [*] holding up.  And it should have held up, unless some bug in collectpads or some demuxer (??)

Anyway, so much "for the record".
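
The gating described above can be sketched as a small toy model. It is not the collectpads API: the class, the method names and the event kinds ("codec-data", "segment", "gap") are hypothetical stand-ins, with "codec-data" playing the role of the clut event.

```python
class ToyCollector:
    """Toy model: the header is written once no pad is still waiting."""

    def __init__(self, pad_names):
        self.waiting = set(pad_names)  # pads with no buffer/segment/gap yet
        self.codec_data = {}           # per-pad data needed for the header
        self.header = None             # snapshot taken when header is written

    def on_event(self, pad, kind, payload=None):
        if kind == "codec-data":       # e.g. the clut custom event
            self.codec_data[pad] = payload
        elif kind in ("segment", "gap"):
            self._mark_ready(pad)      # "no buffer for some time" on this pad

    def on_buffer(self, pad):
        self._mark_ready(pad)

    def _mark_ready(self, pad):
        self.waiting.discard(pad)
        if not self.waiting and self.header is None:
            # Header writing is triggered here and only happens once.
            self.header = dict(self.codec_data)


# Condition [*] holds: codec data arrives before the pad stops waiting.
ok = ToyCollector(["video", "subs"])
ok.on_event("subs", "codec-data", b"clut")
ok.on_buffer("video")
ok.on_event("subs", "gap")

# Condition [*] violated: the pad is released before the codec data arrives.
late = ToyCollector(["video", "subs"])
late.on_buffer("video")
late.on_event("subs", "gap")
late.on_event("subs", "codec-data", b"clut")

print(ok.header, late.header)
```

In the first run the header snapshot contains the subtitle codec data. In the second it is written without it, which is the situation the old rewrite code would have papered over.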
Comment 15 Tim-Philipp Müller 2014-02-02 11:26:07 UTC
Thanks for investigating.

By 'otherwise pre-rolled' I was thinking of GAP events for example.

Was this code added for the clut events? I would have thought those worked fine in 0.10 as well, but the problem in 0.10 was maybe that kateenc will not have been able to send headers at the start because the caps could only be passed with buffers.
Comment 16 Mark Nauwelaerts 2014-02-02 12:01:30 UTC
It was originally added when working with clut events (though apparently not needed in hindsight), though it would indeed have been really needed for the caps related data (like kate) in 0.10 (and not so anymore now).
Comment 17 Mark Nauwelaerts 2014-03-25 20:27:55 UTC
So, for the record, it seems that the additional code in matroskamux was needed in the past because matroskademux has always been (a bit) buggy in this regard: it could send a newsegment event (nowadays a gap event), thereby kickstarting a muxer's header writing, before actually sending the clut event (with data needed for this header writing).

The following commit arranges for sending additional "meta-data-events" before stream synchronization in matroskademux:

commit 9a30726226bdff5fbedd509f6462fae1f9526839
Author: Mark Nauwelaerts <mnauw@users.sourceforge.net>
Date:   Sat Mar 22 17:05:17 2014 +0100

    matroskademux: early sending pending codec-data for all streams
    
    ... at least before syncing across all streams might cause some gap
    activity on any of those streams, notably sparse streams.
    
    See also #712134