After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 752421 - matroskademux: SRT subtitles with markup are displayed with a trailing asterisk
matroskademux: SRT subtitles with markup are displayed with a trailing asterisk
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
git master
Other Linux
: Normal blocker
: 1.5.90
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-07-15 13:16 UTC by Dimitrios Christidis
Modified: 2015-08-16 13:38 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
testcase (must be merged into a Matroska file) (121 bytes, application/x-subrip)
2015-07-15 13:16 UTC, Dimitrios Christidis
  Details
possible fix based on subparse.c (1.22 KB, patch)
2015-07-15 13:17 UTC, Dimitrios Christidis
committed Details | Review
always copy buffer and add terminator (2.51 KB, patch)
2015-07-16 16:02 UTC, Dimitrios Christidis
none Details | Review
trim buffer only if it contains NUL terminator (1.95 KB, patch)
2015-07-16 16:04 UTC, Dimitrios Christidis
none Details | Review
trim buffer only if it contains NUL terminator (1.70 KB, patch)
2015-07-18 07:14 UTC, Dimitrios Christidis
committed Details | Review

Description Dimitrios Christidis 2015-07-15 13:16:07 UTC
Created attachment 307477 [details]
testcase (must be merged into a Matroska file)

The issue can be seen with the attached example, when merged into a Matroska file and played with totem or gst-launch.

The asterisk is added by the pango element because a call to g_utf8_validate() fails. This happens because the subtitle buffer size also counts the null terminator.

I have prepared a patch that seems to address the issue based on the subparse element.
Comment 1 Dimitrios Christidis 2015-07-15 13:17:08 UTC
Created attachment 307478 [details] [review]
possible fix based on subparse.c
Comment 2 Tim-Philipp Müller 2015-07-16 12:24:24 UTC
Thanks, this looks right to me. The buffers we push out should not contain a trailing NUL character in the (visible) buffer data, it's just added for safety.

commit 45892ec8becdc6b5d1d72a1b90b42f90b699cf7e
Author: Dimitrios Christidis <dchristidis@mykolab.com>
Date:   Wed Jul 15 13:44:52 2015 +0300

    matroskademux: fix trailing '*' displayed with some text subtitles
    
    The subtitle buffer we push out should not include a NUL terminator
    as part of the data, we just add such a terminator for safety, but
    it should not be included in the buffer size.
    
    A NUL terminator is not valid UTF-8, so checks will fail if it's
    included in the size, and the NUL will be replaced by the fallback
    character specified when converting, i.e. '*'.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=752421
Comment 3 Tim-Philipp Müller 2015-07-16 12:57:52 UTC
I think this needs more fixing, for the case where there is a 0-terminator in the data already (since we now adjusted it from map.size-1 to map.size everywhere in the conversion/check functions).
Comment 4 Dimitrios Christidis 2015-07-16 16:02:57 UTC
Created attachment 307571 [details] [review]
always copy buffer and add terminator
Comment 5 Dimitrios Christidis 2015-07-16 16:04:04 UTC
Created attachment 307572 [details] [review]
trim buffer only if it contains NUL terminator
Comment 6 Dimitrios Christidis 2015-07-16 16:04:34 UTC
Indeed, the first patch causes a regression (see bug #707933). There are two ways to address this.

With the second patch, a copy is always made to ensure a NUL terminator is added but is not part of the data.

With the third patch, the buffer is trimmed if it contains a terminator. Otherwise, no action is taken.
Comment 7 Nicolas Dufresne (ndufresne) 2015-07-17 22:59:55 UTC
What's left here, do we still need these patch now that bug 707933 is fixed ?
Comment 8 Dimitrios Christidis 2015-07-18 07:14:07 UTC
Created attachment 307651 [details] [review]
trim buffer only if it contains NUL terminator

Removed modification of line 1.
Comment 9 Dimitrios Christidis 2015-07-18 07:59:54 UTC
With the first patch merged, the function will not work correctly on subtitle buffers with NUL terminators as part of the data. I do not have a test case for this, but according to bug #707933 it could happen. Thus, that bug would be reopened.

I think the third patch is the better approach. Per our discussion on IRC, when shrinking a buffer, one should not rely on the trimmed data being there. Furthermore, no part of the codebase that handles subtitle buffers should depend on the existence of a NUL terminator. Finally, I think it looks cleaner, but this is subjective.

The commit message certainly needs some work from a developer with more experience.
Comment 10 Tim-Philipp Müller 2015-07-21 13:29:20 UTC
Third time lucky, hopefully:

commit c1382e97fa48f299895dd641df75b4e2ca588f31
Author: Tim-Philipp Müller <tim@centricular.com>
Date:   Thu Jul 16 18:09:30 2015 +0100

    tests: add minmal matroskademux test for subtitle output
    
    Some of the subtitle chunks will have embedded
    NUL-terminators (last three), some don't (first three),
    some will have markup, some won't, some will be valid
    UTF-8 (all but last), some won't (last stanza).
    
    https://bugzilla.gnome.org/show_bug.cgi?id=752421

commit 744167056cea37fd6b225416624ff84bbb130b8f
Author: Dimitrios Christidis <dchristidis@mykolab.com>
Date:   Thu Jul 16 18:49:26 2015 +0300

    matroskademux: fix for subtitle buffers with NUL terminators
    
    Commit 45892ec8 created a regression where g_utf8_validate() would fail
    if the subtitle buffer had a NUL terminator as part of the data.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=752421



I agree that jumping through hoops to NUL-terminate the data just in case some other element is buggy is a bit over the top, so let's not do that and just trim off an existing terminator.