GNOME Bugzilla – Bug 723327
matroskademux: Set buffer offsets for output
Last modified: 2018-11-03 14:51:13 UTC
I'm looking at using the buffer offset as a unique identifier for caption/subtitle cues. This HTML mailing list thread is relevant:

http://lists.w3.org/Archives/Public/public-html/2014Jan/0083.html

Specifically:

> Is the UA expected to keep a unique internal id for each in-band
> TextTrackCue it creates so that it doesn't reinsert a cue that
> was inserted previously, but was modified?

To which the answer appears to be "yes".

The worst-case scenario for this seems to be WebVTT, where the *only* way to uniquely identify cues is by where they appear in a file:

00:00:00 --> 00:00:05
Lorem ipsum

00:00:00 --> 00:00:05
Lorem ipsum

So, I'd like to use GST_BUFFER_OFFSET, but it looks like it never gets set for any caption or subtitle formats. As a proof-of-concept, I'd like to implement this for Matroska/SSA, since it seems to be the best-supported cue format right now. Does that make sense?
What would the buffer offset represent?
The number of bytes into the file where the cue started. So given my example above, the offset of the first cue would be 0 and the second would be 35. Obviously it's a lot more complicated with cues embedded in Matroska files.
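To make the arithmetic concrete, here is a minimal sketch that reproduces the 0 and 35 figures for the two-cue example. It is not GStreamer code: `cue_byte_offsets` is a hypothetical helper, and it assumes cues are separated by blank lines and that lines end in a single `\n`. A real WebVTT file would also have a `WEBVTT` header block and possibly `NOTE` blocks, which this sketch would count as blocks too.

```python
def cue_byte_offsets(data: bytes) -> list[int]:
    """Return the byte offset at which each blank-line-separated block starts."""
    offsets = []
    pos = 0            # byte position of the current line's first byte
    in_block = False   # True while we are inside a run of non-blank lines
    for line in data.splitlines(keepends=True):
        if line.strip():
            if not in_block:
                offsets.append(pos)  # first line of a new block
                in_block = True
        else:
            in_block = False         # blank line ends the current block
        pos += len(line)
    return offsets


# The example from the report, with "\n" line endings (an assumption).
sample = (
    b"00:00:00 --> 00:00:05\n"
    b"Lorem ipsum\n"
    b"\n"
    b"00:00:00 --> 00:00:05\n"
    b"Lorem ipsum\n"
)
print(cue_byte_offsets(sample))  # [0, 35]
```

With `\r\n` line endings the second offset would differ, which is one reason a byte offset is tied to the exact bytes of the original file.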
I'm not convinced this is viable. GST_BUFFER_OFFSET is usually context and media type specific, e.g. for audio it means sample frames, for video it's the frame number. For subtitles I'm not sure what it should be, but passing through things like the byte offset from the original container just doesn't sound right.
It could be the cue number (so the first one would be 1, the second would be 2, etc.), but I'm not sure how we would reliably generate that when seeking into the middle of a file.
That all doesn't seem a good idea... if there's a need for identifying cues, I think the cue identifier should instead be used.
What cue identifier?
Created attachment 267757
Patch to copy all buffer metadata in webvttenc

I ran into an issue while creating a proof-of-concept for this. webvttenc throws away all of the buffer metadata except timestamp and duration. Even if we decide not to include offsets in subtitle buffers, webvttenc shouldn't be throwing it away, along with the flags, GstMeta, and anything else we might attach to buffers in the future.
(In reply to comment #6)
> What cue identifier?

http://dev.w3.org/html5/webvtt/#dfn-webvtt-cue-identifier
(In reply to comment #8)
> (In reply to comment #6)
> > What cue identifier?
>
> http://dev.w3.org/html5/webvtt/#dfn-webvtt-cue-identifier

Cues aren't required to have an ID though (like they are in SRT). Most WebVTT files I've seen don't include IDs at all. This isn't something we're trying to expose to JavaScript; we just need *any* way to determine if two cues are the same cue (not just identical content). I haven't been able to think of anything except the byte offset in the file though. Keeping track of the number of cues we've seen, like the frame number for video, would be nice, but I'm not sure if there's any way to do that if we jump to the middle of a file.
And the content plus timestamp are not enough for that? I don't see a reliable way to assign unique IDs to them.
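For the two-cue WebVTT example from the original report, a small sketch suggests why content plus timestamps may not be enough on their own. The `Cue` type and its `offset` field are hypothetical, used only for illustration; they are not GStreamer structures.

```python
from typing import NamedTuple


class Cue(NamedTuple):
    offset: int  # byte offset where the cue starts (hypothetical field)
    start: str
    end: str
    text: str


# The two cues from the report: same timestamps, same text.
cues = [
    Cue(0, "00:00:00", "00:00:05", "Lorem ipsum"),
    Cue(35, "00:00:00", "00:00:05", "Lorem ipsum"),
]

# Keyed on content plus timestamps, both cues collapse into one entry.
by_content = {(c.start, c.end, c.text) for c in cues}
# Keyed on byte offset, they stay distinct.
by_offset = {c.offset for c in cues}

print(len(by_content), len(by_offset))  # 1 2
```

Whether such duplicate cues matter in practice is a separate question; the sketch only shows that the content-based key cannot tell them apart.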
Comment on attachment 267757
Patch to copy all buffer metadata in webvttenc

This needs to iterate over all metas and decide one-by-one whether each can be copied, needs to be transformed, or must be dropped, based on the tags of the meta. Look at basetransform for an example.
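The per-meta decision described in this review comment could be modelled roughly as follows. This is a plain Python model, not GStreamer API: `Meta`, `decide`, the tag names, and the policy itself (untagged metas are copied, metas whose tags the element understands are transformed, everything else is dropped) are all assumptions made for illustration. The actual rules should be taken from basetransform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meta:
    """Hypothetical stand-in for a buffer meta: a name plus its tags."""
    name: str
    tags: frozenset = frozenset()


def decide(meta, understood_tags):
    """Return 'copy', 'transform' or 'drop' for one meta (assumed policy)."""
    if not meta.tags:
        return "copy"        # nothing ties the meta to the buffer contents
    if meta.tags <= understood_tags:
        return "transform"   # the element knows how to adapt every tag
    return "drop"            # at least one tag the element cannot handle


# Made-up metas; a text encoder is assumed to understand only the "text" tag.
metas = [
    Meta("timecode"),
    Meta("caption-region", frozenset({"text"})),
    Meta("video-crop", frozenset({"video", "size"})),
]
decisions = {m.name: decide(m, frozenset({"text"})) for m in metas}
print(decisions)
```

The point of the sketch is only the shape of the loop: each meta is inspected individually rather than the whole set being copied in one call.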
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/104.