After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 508291 - [id3demux] must check if language code in id3v2 COMM frames is UTF-8
[id3demux] must check if language code in id3v2 COMM frames is UTF-8
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
git master
Other Linux
: Normal normal
: 0.10.7
Assigned To: Tim-Philipp Müller
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2008-01-09 13:26 UTC by Tommi Myöhänen
Modified: 2008-01-09 15:59 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
kpa.mp3 (192.00 KB, application/octet-stream)
2008-01-09 13:27 UTC, Tommi Myöhänen
  Details
id3v2frames.c.lang_utf8_patch (687 bytes, patch)
2008-01-09 13:28 UTC, Tommi Myöhänen
reviewed Details | Review
gststructure.c.utf8_patch (566 bytes, patch)
2008-01-09 13:29 UTC, Tommi Myöhänen
reviewed Details | Review

Description Tommi Myöhänen 2008-01-09 13:26:47 UTC
The following warning is displayed with attached clip:
$ gst-launch playbin uri=file:///tmp/kpa.mp3 
Setting pipeline to PAUSED ...

(gst-launch-0.10:1341): GStreamer-WARNING **: Trying to set string field 'extended-comment' on structure, but string is not valid UTF-8. Please file a bug.
Pipeline is PREROLLING ...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstAudioSinkClock

The warning is caused by strange language code field in id3v2 comment tag of the file, so maybe id3demux should check this field too for non-utf8 strings.

Also, the warning comes from gst_structure_id_set_value() where non-utf8 strings are skipped, but only if gst-debug is enabled. To avoid confusion I think enabling gst-debug should only add the debug prints and not change the behaviour of the code.
Comment 1 Tommi Myöhänen 2008-01-09 13:27:49 UTC
Created attachment 102459 [details]
kpa.mp3
Comment 2 Tommi Myöhänen 2008-01-09 13:28:43 UTC
Created attachment 102460 [details] [review]
id3v2frames.c.lang_utf8_patch
Comment 3 Tommi Myöhänen 2008-01-09 13:29:10 UTC
Created attachment 102461 [details] [review]
gststructure.c.utf8_patch
Comment 4 Tim-Philipp Müller 2008-01-09 15:20:58 UTC
Nice catch! Indeed, we should check that. I've solved it a bit differently though:

  2008-01-09  Tim-Philipp Müller  <tim at centricular dot net>

        Based on patch by: Tommi Myöhänen <ext-tommi.myohanen nokia com>

        * gst/id3demux/id3v2frames.c: (parse_comment_frame):
          Make sure the ISO 639-X language code in ID3v2 COMM frames
          is actually valid UTF-8 (or rather: ASCII), so we don't end
          up with non-UTF8 strings in tags if there's garbage in the
          language field. Also make sure the language code is always
          lower case. Fixes: #508291.



> Also, the warning comes from gst_structure_id_set_value() where non-utf8
> strings are skipped, but only if gst-debug is enabled. To avoid confusion I
> think enabling gst-debug should only add the debug prints and not change the
> behaviour of the code.

I agree it shouldn't change behaviour of the code, but allowing non-UTF8 strings to end up in structures/taglists is just a very bad idea, since it can easily lead to random crashes or invalid memory accesses when apps pass those strings to g_markup_escape() or g_markup_printf_escaped(), like they would if they wanted to construct fancy pango markup.

Therefore, I've changed gststructure.c to always check strings for UTF-8 conformance and to always return if the string is non-UTF-8. The performance impact should be negligible, and it might reveal bugs in proprietary plugins that are only tested in environments where gst-debug is disabled:

  2008-01-09  Tim-Philipp Müller  <tim at centricular dot net>

        * gst/gststructure.c: (gst_structure_id_set_value):
          Always check UTF-8 conformance of structure strings and not only
          if the debugging system is enabled; reasoning: the behaviour of
          the actual code shouldn't really change depending on whether the
          debugging system is enabled or not (#508291).
Comment 5 Tommi Myöhänen 2008-01-09 15:59:09 UTC
Thanks, your modifications look good.