GNOME Bugzilla – Bug 681368
typefinding: CSV file is detected as MP3
Last modified: 2013-08-29 08:44:05 UTC
Created attachment 220542 [details] Testcase 1 - CSV

Hi,

I use the default install of GStreamer 0.10.36 from Ubuntu 12.04 (32-bit). Please take a look at the attached CSV files. They are interpreted as MP3.

Test commands I used:

gst-launch -v filesrc location=fakemp3-1.csv ! decodebin2 ! fakesink
gst-launch -v filesrc location=fakemp3-2.csv ! decodebin2 ! fakesink
Created attachment 220543 [details] Testcase 2 - CSV
Created attachment 220544 [details] Testcase 1 - gst-launch output
Created attachment 220545 [details] Testcase 2 - gst-launch output
What exactly do you mean by "interpreted as MP3"?
> What exactly do you mean by "interpreted as MP3"?

$ gst-typefind-0.10 ~/samples/misc/681368-typefinding-is-not-mp3-*
/home/tpm/samples/misc/681368-typefinding-is-not-mp3-1.csv - audio/mpeg, mpegversion=(int)1, layer=(int)1
/home/tpm/samples/misc/681368-typefinding-is-not-mp3-2.csv - audio/mpeg, mpegversion=(int)1, layer=(int)1
Exactly. Ah, my fault, it's not layer 3, but typefind still interprets the files as audio/mpeg.
1.0.6 is still affected.
Right, the problem is that this is UTF-16 LE with a BOM (FF FE) at the beginning, which the mp3 typefinder misinterprets as a possible mp3 frame sync. I'll see what we can do.
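To illustrate why FF FE can pass for a frame sync, here is a minimal Python sketch that decodes the first two bytes the way an MPEG audio frame header is laid out (11 sync bits, 2 version bits, 2 layer bits). The function name `parse_sync` and the sample CSV content are made up for this example; this is not GStreamer's typefind code.

```python
def parse_sync(data: bytes):
    """Return (mpeg_version, layer) if data starts with an MPEG audio
    frame sync, else None. Hypothetical helper for illustration only."""
    if len(data) < 2:
        return None
    b0, b1 = data[0], data[1]
    # The sync word is 11 set bits: all of byte 0 plus the top 3 bits of byte 1.
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return None
    version_bits = (b1 >> 3) & 0x03
    layer_bits = (b1 >> 1) & 0x03
    versions = {0b11: "1", 0b10: "2", 0b00: "2.5"}  # 0b01 is reserved
    layers = {0b11: 1, 0b10: 2, 0b01: 3}            # 0b00 is reserved
    if version_bits not in versions or layer_bits not in layers:
        return None
    return versions[version_bits], layers[layer_bits]


# Made-up CSV content, encoded as UTF-16 LE with a leading BOM (FF FE).
csv_with_bom = b"\xff\xfe" + "a,b\n".encode("utf-16-le")
print(parse_sync(csv_with_bom))
```

With the BOM in front, the two bytes decode as MPEG version 1, layer 1, which matches the caps reported by gst-typefind earlier in this bug. The same text without a BOM yields None.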
commit ef5c6d351f04ff5d711912971d77c4be75963800
Author: Tim-Philipp Müller <tim.muller@collabora.co.uk>
Date:   Thu Jul 25 11:56:07 2013 +0100

    typefinding: don't detect mp3 based on just a few bits

    Remove dodgy code that detects mp3 with as little as a valid frame sync at the beginning. This was only used in some unit tests in -good where there were only a few bytes after the id3 tag. We now require at least two frame headers.

    Fixes mis-detection of text files with UTF-16 LE BOM as mp3.

    https://bugzilla.gnome.org/show_bug.cgi?id=681368

$ gst-typefind-1.0 ~/samples/misc/681368-typefinding-is-not-mp3-*
/home/tpm/samples/misc/681368-typefinding-is-not-mp3-1.csv - text/utf-16, endianness=(int)1234
/home/tpm/samples/misc/681368-typefinding-is-not-mp3-2.csv - text/utf-16, endianness=(int)1234
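The "at least two frame headers" rule from the commit message can be sketched roughly as follows. This is an illustration in Python, not the actual GStreamer code: it assumes MPEG-1 only, ignores free-format streams and ID3 tags, and the names `frame_length` and `looks_like_mp3` are invented for the example.

```python
# MPEG-1 bitrates in kbit/s per layer, for bitrate index 1..14.
BITRATES_KBPS = {
    1: [32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448],
    2: [32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384],
    3: [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320],
}
SAMPLE_RATES_HZ = [44100, 48000, 32000]  # MPEG-1 sample rate index 0..2
LAYER_FROM_BITS = {0b11: 1, 0b10: 2, 0b01: 3}


def frame_length(data, pos):
    """Return the byte length of the MPEG-1 audio frame at pos, or None."""
    if pos + 4 > len(data):
        return None  # not enough data for a 4-byte header
    b0, b1, b2 = data[pos], data[pos + 1], data[pos + 2]
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return None  # no 11-bit sync word
    if (b1 >> 3) & 0x03 != 0b11:
        return None  # assumption: this sketch handles MPEG-1 only
    layer = LAYER_FROM_BITS.get((b1 >> 1) & 0x03)
    bitrate_index = b2 >> 4
    rate_index = (b2 >> 2) & 0x03
    if layer is None or not 1 <= bitrate_index <= 14 or rate_index == 3:
        return None  # reserved, free-format or invalid field values
    bitrate = BITRATES_KBPS[layer][bitrate_index - 1] * 1000
    sample_rate = SAMPLE_RATES_HZ[rate_index]
    padding = (b2 >> 1) & 0x01
    if layer == 1:
        return (12 * bitrate // sample_rate + padding) * 4
    return 144 * bitrate // sample_rate + padding


def looks_like_mp3(data, min_frames=2):
    """True only if min_frames consecutive valid frame headers are found."""
    pos = 0
    for _ in range(min_frames):
        length = frame_length(data, pos)
        if length is None:
            return False
        pos += length
    return True


# Made-up CSV content with a UTF-16 LE BOM, and two synthetic Layer III frames.
csv_with_bom = b"\xff\xfe" + "a,b\n".encode("utf-16-le")
one_frame = b"\xff\xfb\x90\x00" + bytes(413)  # 128 kbit/s, 44.1 kHz, 417 bytes
print(looks_like_mp3(csv_with_bom), looks_like_mp3(one_frame * 2))
```

In this sketch the BOM plus the first character still parses as one plausible header, but no second header follows at the computed offset, so the file is rejected. The synthetic two-frame buffer passes.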
I can observe the same problem again in 1.0.8. Please see bug 707016 for more information.
The files attached here are still detected as plain text with 1.1.4. Are you saying that the files attached here are again detected as MP3 for you, or do you mean other files? Also, can you retest with 1.0.10?
Never mind, that fix was never backported to the 1.0 branch, so it was only fixed in 1.1.3 and later. The fix will now be included in 1.0.11.