After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 657872 - [subparse] Doesn't detect some SRT subtitle files
[subparse] Doesn't detect some SRT subtitle files
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
git master
Other Linux
: Normal normal
: 0.10.36
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2011-08-31 21:04 UTC by Nicolas Dufresne (ndufresne)
Modified: 2011-09-08 12:58 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
SRT file being rejected (144 bytes, text/plain)
2011-08-31 23:43 UTC, Nicolas Dufresne (ndufresne)
  Details
subparse: Improve subrip type check regex (1.55 KB, patch)
2011-09-01 01:02 UTC, Nicolas Dufresne (ndufresne)
committed Details | Review

Description Nicolas Dufresne (ndufresne) 2011-08-31 21:04:48 UTC
subparse element expect to see application/x-subtitle for SRT files while XDG mime type finder will tag them as x-subrip. This prevent SRT subtitle to work with plaubin2.

Looking into libav they seem to assume x-subrip too, but libav is about mime, not custom media types.
Comment 1 Tim-Philipp Müller 2011-08-31 22:21:39 UTC
Why does it matter what the XDG mime type finder returns? Our own typefinder should trump that every time. If it doesn't, it needs to be fixed.
Comment 2 Nicolas Dufresne (ndufresne) 2011-08-31 22:35:22 UTC
Well you just said it, it needs fix. Now, the first thing I see is that the so called type finder that should override XDG is MARGINAL, iirc that means it will be ignored. So as it's not used XDG finder is the only one to give a reply. Then in the subparse element (the one that contains the finder), there is a regex that does not seems to work matching the type. I'm using simple file found on wikipedia, http://en.wikipedia.org/wiki/SubRip.

Questions:
Why do we have an XDG type finder if we make sure that our custom type are not valid mime type from XDG when such a type exist ?

Rational:
Having XDG makes video player like Totem report that a missing plugin for mime type application/x-subrip does not exist, but in fact GStreamer have defined this as application/x-subtitle, so neither user or CODEC finder can ever find a suitable plugin.
Comment 3 Tim-Philipp Müller 2011-08-31 22:50:30 UTC
> Now, the first thing I see is that the so
> called type finder that should override XDG is MARGINAL, iirc that means it
> will be ignored. 

No, that's not what it means, and I don't think that's what's happening (in absence of logs showing the opposite).

> So as it's not used XDG finder is the only one to give a
> reply. Then in the subparse element (the one that contains the finder), there
> is a regex that does not seems to work matching the type. I'm using simple file
> found on wikipedia, http://en.wikipedia.org/wiki/SubRip.

Ok, so fix it please.


> Questions:
> Why do we have an XDG type finder if we make sure that our custom type are not
> valid mime type from XDG when such a type exist ?

To prevent false positives on non-media files (this is a real problem).

Sadly we can't blacklist subtitle types in general from the xdg typefinder. If we could, we would blacklist them. You can blacklist this type if you like.

Ultimately it's a bug in our srt typefinder, so let's just fix that.
 
> Rational:
> Having XDG makes video player like Totem report that a missing plugin for mime
> type application/x-subrip does not exist, but in fact GStreamer have defined
> this as application/x-subtitle, so neither user or CODEC finder can ever find a
> suitable plugin.

Yes, it's a bug. It needs fixing. Fix the srt typefinder.
Comment 4 Nicolas Dufresne (ndufresne) 2011-08-31 23:13:56 UTC
(In reply to comment #3)
> Yes, it's a bug. It needs fixing. Fix the srt typefinder.

I'm currently digging, if I don't find the bug, I'll report.

But I'm trying to make sure people realize how badly XDG can affect user interaction. Fixing this will work for this case, but this is a situation that will come back for other type until all the media type that exist in the world are supported by gst-plugins-bad (simply thinking what about until then).

And I think there exist solutions to that, as an example, when we speak about file-type we could respect the mime-type instead of making sure to use something different. That may hide finder bugs, but won't mess with users. Or, find a way to make sure that if the result if from XDG (I mean a mime database, XDG being one of) we make sure that it's kept internal, which avoid sending wrong information to the plugin finder, or worst to the user.

(btw, not sending traces because I know it's easy to reproduce for devs interested to help, you need a linux system with GIO, mime-info and a SRT file I mentioned before)
Comment 5 Nicolas Dufresne (ndufresne) 2011-08-31 23:43:52 UTC
Created attachment 195358 [details]
SRT file being rejected

I found that there was a tiny difference between my SRT and the one on Wikipedia, and this is actually causing the typefinder to reject the file. You'll find that I omited a 0 in the seconds element. I was validating my file with mplayer which tolerate this tiny error. If our parser support it, I suggest making that extra 0 optional in the regex.

1
00:00:5,440 --> 00:00:10,375
Comment 6 Nicolas Dufresne (ndufresne) 2011-09-01 01:02:12 UTC
Created attachment 195359 [details] [review]
subparse: Improve subrip type check regex

This patch prevents timestamp like "1 1:00:00", which would have been seen
as hour 101 by our parser, and allow single digit hour, minute and seconds
as it's already supported by the parser, and also by other implementation
like in mplayer.

Note that this does not fix the problem with XDG reporting bad types. To follow current code, we could ignore known subtitle types from xdg reply. Here's the list from freedesktop database. I guess they should be removed if we support them, some of them are being match using pattern (e.g. microdvd, mpsub, subviewer etc. shares extension .sub).

application/x-subrip (.srt)
application/x-sami (.smi .sami)
text/x-microdvd (.sub)
text/x-mpsub (.sub)
text/x-ssa (.ssa .ass)
text/x-subviewer (.sub)
Comment 7 Nicolas Dufresne (ndufresne) 2011-09-08 01:36:09 UTC
So, anyone would like to review ?
Comment 8 Sebastian Dröge (slomo) 2011-09-08 12:58:34 UTC
commit 25939e02183e9a9420d02c3f2c3ddcd3fe83cd65
Author: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Date:   Wed Aug 31 20:51:17 2011 -0400

    subparse: Improve subrip type check regex
    
    This patch prevents timestamp like "1 1:00:00", which would have been seen
    as hour 101 by our parser, and allow single digit hour, minute and seconds
    as it's already supported by the parser, and also by other implementation
    like in mplayer. This fixes bug 657872.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=657872