Bug 729113 - id3demux: Produce text/html output caps and fails
Status: RESOLVED NOTABUG
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: NONE
Assigned To: GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2014-04-28 14:15 UTC by Nicolas Dufresne (ndufresne)
Modified: 2014-04-29 13:49 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Nicolas Dufresne (ndufresne) 2014-04-28 14:15:27 UTC
I have an mp3 that doesn't play in GStreamer. It starts with an ID3 tag, but id3demux sets its output caps to text/html, and then the pipeline fails.
Comment 1 Jan Schmidt 2014-04-28 15:01:58 UTC
Are you just bragging, or are you going to show us the file? ;)
Comment 2 Nicolas Dufresne (ndufresne) 2014-04-28 15:05:06 UTC
Argh, the file was too big and was silently ignored.
Comment 3 Nicolas Dufresne (ndufresne) 2014-04-28 15:06:41 UTC
Note, I'll remove it as soon as we have a fix, since I have no idea where that file is from, what its copyright status is, and so on.

http://people.collabora.com/~nicolas/a.mp3
Comment 4 Tim-Philipp Müller 2014-04-28 15:20:07 UTC
$ gst-launch-1.0 filesrc location= /home/tpm/samples/misc/729113-typefind-html.mp3 ! id3demux ! fakesink dump=true | grep ^0 | head -n4
00000000 (0x7f53bc0061a0): 3c 48 54 4d 4c 3e 0d 0a 3c 48 45 41 44 3e 3c 54  <HTML>..<HEAD><T
00000010 (0x7f53bc0061b0): 49 54 4c 45 3e 50 61 67 65 20 45 78 70 69 72 65  ITLE>Page Expire
00000020 (0x7f53bc0061c0): 64 3c 2f 54 49 54 4c 45 3e 0d 0a 3c 2f 48 45 41  d</TITLE>..</HEA
00000030 (0x7f53bc0061d0): 44 3e 0d 0a 3c 42 4f 44 59 20 42 47 43 4f 4c 4f  D>..<BODY BGCOLO

Just saying..
Comment 5 Jan Schmidt 2014-04-28 15:32:15 UTC
Yeah, not sure how that should be handled - there's id3, then some random html content, then eventually the mp3 packets.
Comment 6 Nicolas Dufresne (ndufresne) 2014-04-28 15:36:33 UTC
According to the provider of this file, there is HTML inside the ID3 tag, but
it is clearly detected as coming after it, and running typefind on that obviously
gives text/html. So I'd guess a parser bug; otherwise we really have HTML garbage
between the ID3 tag and the mp3 data. Most mp3 players can skip garbage without issue, though ...
Comment 7 Nicolas Dufresne (ndufresne) 2014-04-28 15:37:54 UTC
Even GST does, actually :-P
gst-launch-1.0 filesrc location=a.mp3 ! mad ! pulsesink
Comment 8 Jan Schmidt 2014-04-28 15:40:39 UTC
The HTML isn't in the ID3 tag. It starts immediately after - the ID3 tag is declared as 54210 bytes long.
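
The declared tag length can be checked by hand. Below is a minimal Python sketch, assuming the standard ID3v2 header layout (a 10-byte header whose last 4 bytes hold a syncsafe size). The function names are hypothetical and are not part of GStreamer; the sketch only shows how to find the first bytes that follow the tag.

```python
def id3v2_tag_length(header: bytes) -> int:
    """Return the total ID3v2 tag length in bytes (header included), or 0 if no tag."""
    if len(header) < 10 or header[:3] != b"ID3":
        return 0
    flags = header[5]
    size_bytes = header[6:10]
    # Syncsafe integer: 4 bytes carrying 7 bits each; the high bit must be 0.
    if any(b & 0x80 for b in size_bytes):
        return 0
    size = 0
    for b in size_bytes:
        size = (size << 7) | b
    total = 10 + size
    if flags & 0x10:  # footer flag (ID3v2.4) adds another 10 bytes
        total += 10
    return total


def bytes_after_tag(data: bytes, count: int = 16) -> bytes:
    """Return the first `count` bytes that follow the ID3v2 tag."""
    offset = id3v2_tag_length(data[:10])
    return data[offset:offset + count]
```

Calling `bytes_after_tag(open("a.mp3", "rb").read())` on a file like the one discussed here would be expected to show the start of the HTML page rather than an MPEG audio frame header.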
Comment 9 Nicolas Dufresne (ndufresne) 2014-04-28 17:38:09 UTC
I confirm this. It's really the worst case for us. With the demuxer not knowing what the valid types are, I'm not sure how we could handle this.
Comment 10 Tim-Philipp Müller 2014-04-28 17:53:32 UTC
I'm tempted to just WONTFIX this. It's clearly a broken file from a buggy server script and probably rather unique. Not sure it's worth jumping through hoops to "fix" this.
Comment 11 Nicolas Dufresne (ndufresne) 2014-04-28 18:12:18 UTC
Yes and no, e.g. this kind of corruption needs to be supported for HW mp3 players to get the branding. But I suspect our model is not exactly prepared for that. E.g. if it were a jpeg instead of html, we would render the jpeg, but the jpeg decoder would fail later when receiving the mp3 data.

Though, we could also state that if you were to implement a commercial mp3 player, you would use a static pipeline, with a custom decoder that also handles id3. I would find GStreamer a bit useless in this scenario, to be honest.
Comment 12 Tim-Philipp Müller 2014-04-28 18:19:16 UTC
What "branding" is this a test file for?
Comment 13 Nicolas Dufresne (ndufresne) 2014-04-28 18:28:49 UTC
(In reply to comment #12)
> What "branding" is this a test file for?

I'm not talking about that file. To be allowed to use MP3 branding on your device, I know that you have to go through certification and run a few tests. And I know these tests include having corruption between mpeg frames. I guessed this most likely includes having corruption between the id3 and the first frame, which is exactly the scenario we are facing.
Comment 14 Tim-Philipp Müller 2014-04-28 18:41:45 UTC
No, it's not the same scenario at all. We should be recognising files with some junk at the beginning or between some frames just fine.
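
For illustration, skipping junk of this kind usually means scanning forward for a plausible MPEG audio frame header. The Python sketch below is a simplified assumption about how such a scan can work, not the code of any GStreamer element; `find_frame_sync` is a hypothetical helper, and a real parser would also confirm that a second header follows at the computed frame length.

```python
def find_frame_sync(data: bytes, start: int = 0) -> int:
    """Return the offset of the first plausible MPEG audio frame header, or -1."""
    for i in range(start, len(data) - 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        # 11 sync bits: all of byte 0 plus the top 3 bits of byte 1.
        if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
            continue
        version = (b1 >> 3) & 0x03      # 0b01 is reserved
        layer = (b1 >> 1) & 0x03        # 0b00 is reserved
        bitrate_index = b2 >> 4         # 0b1111 is invalid
        samplerate_index = (b2 >> 2) & 0x03  # 0b11 is reserved
        if version == 1 or layer == 0:
            continue
        if bitrate_index == 0x0F or samplerate_index == 0x03:
            continue
        return i
    return -1
```

A scan like this steps over arbitrary bytes, including an HTML page, because none of them form a valid header. It does not help when an earlier stage has already labelled the stream as something other than MPEG audio.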
Comment 15 Nicolas Dufresne (ndufresne) 2014-04-28 19:23:47 UTC
I've crafted two files with 6 bytes of corruption at a similar location.

http://people.collabora.com/~nicolas/test.mp3
http://people.collabora.com/~nicolas/test2.mp3

test2.mp3 has random values as the 6 bytes, and this is recognized and skipped; test.mp3 has specially crafted junk (a partial png header) and fails.

So that matches what has just been said. It will fail if it's not junk (html file, png, jpeg, etc.), or if you are unlucky enough that the junk triggers a type in the typefinder.
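
The difference between the two crafted files can be illustrated with a toy signature check. This Python sketch is only an assumption-laden model: the signature table and `toy_typefind` are hypothetical and far simpler than GStreamer's real typefind functions, which use their own probabilities and longer probes. It shows why random bytes match nothing while junk that resembles a known header gets claimed as that type.

```python
# Toy signature table (magic prefix, caps name, case-insensitive match).
# Hypothetical and deliberately short; NOT GStreamer's typefind logic.
SIGNATURES = [
    (b"\x89PNG", "image/png", False),
    (b"\xff\xd8\xff", "image/jpeg", False),
    (b"<html", "text/html", True),
]


def toy_typefind(data: bytes):
    """Return a caps name if the data starts with a known signature, else None."""
    for magic, caps, ignore_case in SIGNATURES:
        head = data[:len(magic)]
        if ignore_case:
            head = head.lower()
        if head == magic:
            return caps
    return None
```

With this model, six arbitrary bytes in front of an mp3 frame yield `None`, so a later stage is free to skip them, whereas six bytes taken from a PNG header yield `image/png` and the stream is misrouted.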
Comment 16 Tim-Philipp Müller 2014-04-29 13:49:05 UTC
Let's close this, I think this is not really a bug, but expected behaviour, and reasonable behaviour.

There are probably non-hackish ways to make this work, but that would require some more thought about typefinding design issues, and would be an enhancement and a bit more work. I don't know if it's worth it to be honest. If you think it is, then please clone an enhancement bug from this bug.