Bug 723327 - matroskademux: Set buffer offsets for output
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: git master
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: git master
Assigned To: GStreamer Maintainers
Reported: 2014-01-30 23:09 UTC by Brendan Long
Modified: 2018-11-03 14:51 UTC
Attachments
Patch to copy all buffer metadata in webvttenc (968 bytes, patch)
2014-01-31 21:26 UTC, Brendan Long
needs-work

Description Brendan Long 2014-01-30 23:09:03 UTC
I'm looking at using the buffer offset as a unique identifier for caption/subtitle cues. This HTML mailing list thread is relevant:

http://lists.w3.org/Archives/Public/public-html/2014Jan/0083.html

Specifically:

> Is the UA expected to keep a unique internal id for each in-band
> TextTrackCue it creates so that it doesn't reinsert a cue that
> was inserted previously, but was modified?

To which the answer appears to be, "yes".

The worst-case scenario for this seems to be WebVTT, where the *only* way to uniquely identify cues is by where they appear in a file:

    00:00:00 --> 00:00:05
    Lorem ipsum

    00:00:00 --> 00:00:05
    Lorem ipsum

So, I'd like to use GST_BUFFER_OFFSET, but it looks like it never gets set for any caption or subtitle formats. As a proof-of-concept, I'd like to implement this for Matroska/SSA, since it seems to be the best-supported cue format right now. Does that make sense?
Comment 1 Tim-Philipp Müller 2014-01-30 23:33:25 UTC
What would the buffer offset represent?
Comment 2 Brendan Long 2014-01-31 00:09:45 UTC
The number of bytes into the file where the cue started.

So given my example above, the offset of the first cue would be 0 and the second would be 35. Obviously it's a lot more complicated with cues embedded in Matroska files.
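
As a rough illustration of those numbers, here is a minimal Python sketch that computes the byte offset at which each cue block starts. It assumes the example from the description is the entire file (no "WEBVTT" header line), that lines end in a single LF, and that cues are separated by blank lines. The function name cue_byte_offsets is made up for this sketch and is not part of GStreamer.

```python
def cue_byte_offsets(data: bytes) -> list[int]:
    """Return the byte offset at which each blank-line-separated block starts."""
    offsets = []
    pos = 0                  # byte position of the current line
    at_block_start = True    # True until the first non-blank line of a block
    for line in data.splitlines(keepends=True):
        if line.strip() == b"":
            # A blank line ends the current block.
            at_block_start = True
        elif at_block_start:
            offsets.append(pos)
            at_block_start = False
        pos += len(line)
    return offsets


# The two identical cues from the description (hypothetical whole-file content).
sample = (
    b"00:00:00 --> 00:00:05\n"
    b"Lorem ipsum\n"
    b"\n"
    b"00:00:00 --> 00:00:05\n"
    b"Lorem ipsum\n"
)

print(cue_byte_offsets(sample))  # [0, 35]
```

The first cue occupies 22 + 12 bytes and the blank separator adds 1, which is where the 35 comes from. With CRLF line endings or a header line the numbers would differ.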
Comment 3 Tim-Philipp Müller 2014-01-31 00:26:29 UTC
I'm not convinced this is viable. GST_BUFFER_OFFSET is usually context and media type specific, e.g. for audio it means sample frames, for video it's the frame number. For subtitles I'm not sure what it should be, but passing through things like the byte offset from the original container just doesn't sound right.
Comment 4 Brendan Long 2014-01-31 00:27:55 UTC
It could be the cue number (so the first one would be 1, the second would be 2, etc.), but I'm not sure how we would reliably generate that when seeking into the middle of a file.
Comment 5 Sebastian Dröge (slomo) 2014-01-31 07:06:18 UTC
That all doesn't seem a good idea... if there's a need for identifying cues, I think the cue identifier should instead be used.
Comment 6 Brendan Long 2014-01-31 16:00:37 UTC
What cue identifier?
Comment 7 Brendan Long 2014-01-31 21:26:12 UTC
Created attachment 267757
Patch to copy all buffer metadata in webvttenc

I ran into an issue while creating a proof-of-concept for this. webvttenc throws away all of the buffer metadata except timestamp and duration. Even if we decide not to include offsets in subtitle buffers, webvttenc shouldn't be throwing it away, along with the flags, GstMeta, and anything else we might attach to buffers in the future.
Comment 8 Sebastian Dröge (slomo) 2014-02-03 17:28:09 UTC
(In reply to comment #6)
> What cue identifier?

http://dev.w3.org/html5/webvtt/#dfn-webvtt-cue-identifier
Comment 9 Brendan Long 2014-02-04 23:55:58 UTC
(In reply to comment #8)
> (In reply to comment #6)
> > What cue identifier?
> 
> http://dev.w3.org/html5/webvtt/#dfn-webvtt-cue-identifier

Cues aren't required to have an ID though (like they are in SRT). Most WebVTT files I've seen don't include IDs at all.

This isn't something we're trying to expose to JavaScript; we just need *any* way to determine if two cues are the same cue (not just identical content). I haven't been able to think of anything except the byte offset in the file though. Keeping track of the number of cues we've seen, like the frame number for video, would be nice, but I'm not sure if there's any way to do that if we jump to the middle of a file.
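
To make the "same cue vs. identical content" distinction concrete, here is a small Python sketch using the two-cue example from the description. The Cue type and parse_cues helper are hypothetical and only handle that simplified layout (LF line endings, blank-line separators, no header or cue identifiers); they are not a real WebVTT parser.

```python
from typing import NamedTuple


class Cue(NamedTuple):
    start: str
    end: str
    text: str


def parse_cues(data: str) -> list[Cue]:
    """Parse blank-line-separated 'start --> end' blocks into Cue tuples."""
    cues = []
    for block in data.strip().split("\n\n"):
        timing, _, text = block.partition("\n")
        start, _, end = timing.partition(" --> ")
        cues.append(Cue(start, end, text))
    return cues


sample = (
    "00:00:00 --> 00:00:05\n"
    "Lorem ipsum\n"
    "\n"
    "00:00:00 --> 00:00:05\n"
    "Lorem ipsum\n"
)

cues = parse_cues(sample)
print(len(cues))       # number of cues in the file
print(len(set(cues)))  # number of distinct (start, end, text) keys
```

For this input there are two cues but only one distinct (start, end, text) key, so a key built from timestamps and content alone cannot tell them apart; some positional information would be needed.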
Comment 10 Sebastian Dröge (slomo) 2014-02-06 21:52:45 UTC
And the content plus timestamp are not enough for that? I don't see a reliable way to assign unique IDs to them
Comment 11 Sebastian Dröge (slomo) 2014-02-06 21:54:14 UTC
Comment on attachment 267757
Patch to copy all buffer metadata in webvttenc

This needs to iterate over all metas and decide one by one whether each can be copied or needs to be transformed or dropped, based on the tags of the meta.

Look at basetransform for an example
Comment 12 GStreamer system administrator 2018-11-03 14:51:13 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/104.