Bug 616936 - [matroskademux] Incorrect display of subtitles with markup
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: 0.10.31
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Duplicates: 651739 654596
Depends on:
Blocks:
 
 
Reported: 2010-04-27 11:39 UTC by Nicolò Chieffo
Modified: 2012-02-18 15:27 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
the avi file (461.32 KB, video/x-msvideo)
2010-04-27 12:14 UTC, Nicolò Chieffo
the srt file (225 bytes, application/octet-stream)
2010-04-27 12:14 UTC, Nicolò Chieffo
the mkv file with subtitles (454.98 KB, video/x-matroska)
2010-04-27 12:15 UTC, Nicolò Chieffo
matroskademux: UTF-8 subtitles may have markup (1.69 KB, patch)
2010-06-23 09:16 UTC, Mark Nauwelaerts
Patch status: none
matroskademux: UTF-8 subtitles may have markup (4.48 KB, patch)
2010-07-09 15:03 UTC, Mark Nauwelaerts
Patch status: committed

Description Nicolò Chieffo 2010-04-27 11:39:36 UTC
When playing an AVI file together with an SRT text file, Totem handles the tags in the subtitles and renders bold, italic and underlined text.
If the video is instead an MKV with embedded subtitles, the tags themselves are displayed as literal text.

I also tested this by extracting the subtitles and converting the video from MKV
to AVI. The result confirms the bug.
Comment 1 Bastien Nocera 2010-04-27 11:40:49 UTC
Please upload a test file.
Comment 2 Nicolò Chieffo 2010-04-27 12:14:24 UTC
Created attachment 159684
the avi file
Comment 3 Nicolò Chieffo 2010-04-27 12:14:44 UTC
Created attachment 159685
the srt file
Comment 4 Nicolò Chieffo 2010-04-27 12:15:09 UTC
Created attachment 159686
the mkv file with subtitles
Comment 5 Nicolò Chieffo 2010-04-27 12:16:29 UTC
Here are all the files.
If you open the MKV file directly, you'll see the tags.

If you download the AVI and SRT files, you won't see the tags; the text is shown in italics instead.
Comment 6 Bastien Nocera 2010-04-27 14:18:59 UTC
Reproduced with gst-launch, so GStreamer bug.

My guess is that the subtitle stream is wrongly tagged, or detected.
Comment 7 Tim-Philipp Müller 2010-04-27 14:33:26 UTC
I think it's just that subtitles in matroska were assumed to be without markup, so the caps end up as text/plain rather than text/x-pango-markup.
Comment 8 Mark Nauwelaerts 2010-06-23 09:16:20 UTC
Created attachment 164376
matroskademux: UTF-8 subtitles may have markup

On the one hand, the Matroska specs are not very specific about what "plain" UTF-8 subtitles entail.

On the other hand, replacing text/plain with text/x-pango-markup (which is what the patch does) Just Works, and AFAIK it should not break other plain text cases (those being markup with empty markup)?
Comment 9 Tim-Philipp Müller 2010-06-23 09:30:10 UTC
If we re-label plain text as markup, then we need to make sure & < > etc. are escaped properly (g_markup_*() utility functions). So I guess we'd need to check if there are tags in the text; if yes, then we can just assume everything is fine, if not we have to check if there are chars that need to be escaped that aren't escaped yet, or something.
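
To make the escaping step concrete, here is a minimal, self-contained C sketch of what such a utility does. It is only an illustration and not GLib code: in matroskademux one would presumably call g_markup_escape_text() instead. The names escape_markup_text and entity_for are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: maps a character to its markup entity,
 * or returns NULL if the character needs no escaping. */
static const char *
entity_for (char c)
{
  switch (c) {
    case '&':  return "&amp;";
    case '<':  return "&lt;";
    case '>':  return "&gt;";
    case '"':  return "&quot;";
    case '\'': return "&apos;";
    default:   return NULL;
  }
}

/* Hypothetical stand-in for a markup escaping utility.
 * Returns a newly allocated, escaped copy of txt (free it with free()),
 * or NULL if the allocation fails. */
char *
escape_markup_text (const char *txt)
{
  size_t len = 0;
  const char *p;
  char *out, *q;

  assert (txt != NULL);

  /* First pass: compute the length of the escaped string. */
  for (p = txt; *p != '\0'; p++) {
    const char *rep = entity_for (*p);
    len += (rep != NULL) ? strlen (rep) : 1;
  }

  out = malloc (len + 1);
  if (out == NULL)
    return NULL;

  /* Second pass: copy the text, substituting entities where needed. */
  q = out;
  for (p = txt; *p != '\0'; p++) {
    const char *rep = entity_for (*p);
    if (rep != NULL) {
      size_t n = strlen (rep);
      memcpy (q, rep, n);
      q += n;
    } else {
      *q++ = *p;
    }
  }
  *q = '\0';

  return out;
}
```

Note that running this over text that already carries markup would turn the tags themselves into entities, which is why the text has to be checked for tags first.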
Comment 10 Mark Nauwelaerts 2010-06-23 09:44:38 UTC
Ah.  That's a bit of a snag.  Sounds like some not well-defined heuristics are in order (somewhere) ...
Comment 11 Tim-Philipp Müller 2010-06-23 10:22:02 UTC
Maybe something like this would be enough:

 init stream->seen_tag to FALSE and then:

 stream->seen_tag = stream->seen_tag || check_if_subtitle_chunk_has_tag (txt);

 if (!stream->seen_tag)
   xyz = g_markup_escape_text (txt); 

?
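
Spelled out as compilable C, the idea above could look roughly like the following sketch. The SubtitleStream struct, the function subtitle_chunk_needs_escaping and the very naive tag detector are assumptions made for illustration; they are not taken from matroskademux or from the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-stream state; initialise seen_tag to false. */
typedef struct {
  bool seen_tag;
} SubtitleStream;

/* Placeholder detector (an assumption): a chunk "has a tag" if it
 * contains one of the literal strings <b>, <i> or <u>. */
static bool
check_if_subtitle_chunk_has_tag (const char *txt)
{
  return strstr (txt, "<b>") != NULL
      || strstr (txt, "<i>") != NULL
      || strstr (txt, "<u>") != NULL;
}

/* Returns true if this chunk should be escaped before it is pushed
 * downstream as markup. The flag is sticky: once any chunk of the
 * stream has carried a tag, the whole stream is treated as markup
 * and no later chunk is escaped. */
bool
subtitle_chunk_needs_escaping (SubtitleStream *stream, const char *txt)
{
  assert (stream != NULL && txt != NULL);

  stream->seen_tag = stream->seen_tag
      || check_if_subtitle_chunk_has_tag (txt);

  return !stream->seen_tag;
}
```

With this, a chunk such as "Rock & roll" arriving before any tag has been seen gets escaped, whereas every chunk from the first tagged one onwards is passed through unchanged.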
Comment 12 Sebastian Dröge (slomo) 2010-06-23 19:39:30 UTC
And what about the rare case where plaintext subtitles contain something that looks like a tag?
Comment 13 Tim-Philipp Müller 2010-06-23 21:09:11 UTC
> and what about the rare case where plaintext subtitles contains something that
> looks like a tag?

That's just tough luck then. We could whitelist a number of common/acceptable tags to check for.

I think the chances that someone puts '<b>' or '<i>' in a subtitle chunk and actually wants it displayed like that are close to 0.
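
A whitelist check along those lines might be sketched as follows. The tag list (b, i, u) and the function name chunk_has_whitelisted_tag are assumptions for illustration only; matching is case-sensitive and tags with attributes are deliberately not recognised.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed whitelist of tag names treated as real markup. */
static const char *const allowed_tags[] = { "b", "i", "u" };

/* Returns true if txt contains an opening or closing tag whose name is
 * on the whitelist, e.g. "<i>" or "</b>". Anything else that merely
 * contains '<' (such as "a < b" or "<3") is not counted as a tag. */
bool
chunk_has_whitelisted_tag (const char *txt)
{
  const char *p;
  size_t i;

  assert (txt != NULL);

  for (p = strchr (txt, '<'); p != NULL; p = strchr (p + 1, '<')) {
    const char *name = p + 1;

    if (*name == '/')           /* accept closing tags too */
      name++;

    for (i = 0; i < sizeof allowed_tags / sizeof allowed_tags[0]; i++) {
      size_t n = strlen (allowed_tags[i]);

      /* The name must match exactly and be followed directly by '>'. */
      if (strncmp (name, allowed_tags[i], n) == 0 && name[n] == '>')
        return true;
    }
  }

  return false;
}
```

The trade-off is the one discussed above: a plain-text subtitle that literally contains "<b>" is still misdetected, but stray '<' and '>' characters no longer are.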
Comment 14 Mark Nauwelaerts 2010-07-09 15:03:16 UTC
Created attachment 165560
matroskademux: UTF-8 subtitles may have markup

As before, but adds some (simple) heuristics as proposed to determine whether or not to escape subtitle text.
Comment 15 Sebastian Dröge (slomo) 2011-05-26 10:13:24 UTC
commit 74e0c05ff7d2270494d616ea2d86811bae5a3d53
Author: Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk>
Date:   Wed Jun 23 11:12:00 2010 +0200

    matroskademux: UTF-8 subtitles may have markup
    
    Fixes #616936.
Comment 16 David Schleef 2011-06-05 02:29:35 UTC
*** Bug 651739 has been marked as a duplicate of this bug. ***
Comment 17 Tim-Philipp Müller 2012-02-18 15:27:16 UTC
*** Bug 654596 has been marked as a duplicate of this bug. ***