GNOME Bugzilla – Bug 654673
non-UTF-8 filename causes warning in URI query
Last modified: 2011-09-13 21:01:29 UTC
When the message below appears, playbin2 may crash:

(<unknown>:3889): GStreamer-WARNING **: Trying to set string on structure field 'uri', but string is not valid UTF-8. Please file a bug.
(This shouldn't cause a crash, unless you set your environment or tell glib to abort on warnings.) But anyway, could you attach the file please? The first few kB will probably do, if the tag is at the beginning:

  head --bytes=990k foo.mp3 > head.mp3
Created attachment 192022 [details] mp3 with GB2312 song name
It may not be related to the ID3 tag; it may be caused by the URL being in GB2312 encoding (Chinese characters). The song is on an SD card and has a Chinese name in GB2312.

The behavior is strange: I wrapped a playbin2 in a program that runs as a daemon and communicates with a main Qt application through a TCP socket. When the program is started separately from a Linux shell, it works even though the warning message appears. But when it is run through a QProcess inside the Qt application, using "&>" to redirect all messages to a file, it terminates suddenly (without creating the redirected file) just before the warning message would appear. I don't know what is happening.

Attached are the first 1M bytes of the song. It has a Chinese name.
I have tried to find out where the warning message is emitted. After I removed the g_warning() statement, I could start the daemon through QProcess inside the Qt application. So, in the meantime, how can I mask out all warning messages from GStreamer?

Colman
Oh I see, it's not from the tag at all. Could you run this in gdb (and ideally install debugging symbols for gstreamer + plugins):

  $ G_DEBUG=fatal_warnings gdb --args /usr/bin/gst-launch-0.10 playbin2 uri=file:///path/to/filename.mp3
  (gdb) run
  ... wait for abort ..
  (gdb) bt
  ... paste output from here ...
You can play around with the fatal mask stuff from GLib (to override your environment variable settings?): http://developer.gnome.org/glib/2.29/glib-Message-Logging.html
It is an embedded Linux system, and I think it does not have enough resources to run gdb; it only has 64 MB of RAM. By the way, how can I make it mask out all console messages when it is run as a daemon?

Colman
Have you tried reproducing the problem on a desktop system? I'm sure it'll happen here as well. Without gdb or a way to get a stack trace, this is going to be quite painful to debug. What's the source element being used here? filesrc? Can you do this:

  $ ls -1 /path/to/file.mp3 > filename.dump

and attach filename.dump (or the output of hexdump -C filename.dump)?
Created attachment 192026 [details]
The song name

Attached please find the song name. I believe that the problem is related to the g_warning() in gst_structure_set_field() in gststructure.c, shown below:

  else if (G_UNLIKELY (s != NULL && !g_utf8_validate (s, -1, NULL))) {
    g_warning ("Trying to set string on %s field '%s', but string is not "
        "valid UTF-8. Please file a bug.",
        IS_TAGLIST (structure) ? "taglist" : "structure",
        g_quark_to_string (field->name));
    g_value_unset (&field->value);
    return;
  }

I am looking at the g_log_set_fatal_mask(const gchar *log_domain, GLogLevelFlags fatal_mask) function. What should I pass as "log_domain" in order to mask out the "GStreamer-WARNING ..." message?

Colman
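As a rough illustration of why the check quoted above fires for this file name: GB2312 encodes each Chinese character as two bytes that both have the high bit set, and such pairs often do not form a well-formed UTF-8 sequence. The sketch below is plain C and is not GLib's implementation of g_utf8_validate(); sketch_is_valid_utf8() is a hypothetical helper name introduced only for this example.

```c
#include <stddef.h>

/* Hypothetical helper (not a GLib API): returns 1 if the NUL-terminated
 * string s is well-formed UTF-8, 0 otherwise. Simplified sketch: it checks
 * lead and continuation bytes, overlong forms, surrogates and the U+10FFFF
 * upper limit. */
int
sketch_is_valid_utf8 (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;

  while (*p != 0) {
    unsigned int cp;
    int n, i;                   /* n = number of continuation bytes */

    if (*p < 0x80) {            /* plain ASCII byte */
      p++;
      continue;
    } else if ((*p & 0xE0) == 0xC0) {
      cp = *p & 0x1F;
      n = 1;
    } else if ((*p & 0xF0) == 0xE0) {
      cp = *p & 0x0F;
      n = 2;
    } else if ((*p & 0xF8) == 0xF0) {
      cp = *p & 0x07;
      n = 3;
    } else {
      return 0;                 /* stray continuation or invalid lead byte */
    }
    p++;

    for (i = 0; i < n; i++, p++) {
      /* must look like 10xxxxxx; this also stops at a premature NUL */
      if ((*p & 0xC0) != 0x80)
        return 0;
      cp = (cp << 6) | (*p & 0x3F);
    }

    /* reject overlong encodings, surrogates and out-of-range values */
    if ((n == 1 && cp < 0x80) || (n == 2 && cp < 0x800) ||
        (n == 3 && cp < 0x10000) || cp > 0x10FFFF ||
        (cp >= 0xD800 && cp <= 0xDFFF))
      return 0;
  }
  return 1;
}
```

For example, the GB2312 bytes D6 D0 CE C4 are rejected because D6 looks like a two-byte UTF-8 lead byte but D0 is not a valid continuation byte, whereas the percent-escaped ASCII form "%D6%D0%CE%C4" passes.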
> I believe that the problem is related to the g_warning() in
> gst_structure_set_field() in gststructure.c, shown below:

Well yes, of course it is. That much we knew, but what we need to know is what element / piece of code created the broken URI in the first place (a URI should always be ASCII, with non-ASCII characters escaped).

> I am looking at the g_log_set_fatal_mask(const gchar *log_domain,
> GLogLevelFlags fatal_mask) function. What should I pass as
> "log_domain" in order to mask out the "GStreamer-WARNING ..." message?

Masking the message isn't going to help. The bug needs to be found and fixed. Other things won't work properly as a result of this issue (whether the warning is displayed or not).
Actually, it is not a broken URI; it is a file in local storage, and the OS allows non-ASCII file names, such as BIG5 or GB2312 encoded ones. GStreamer can find and play the file without problem; it only gives a warning because the name is not a UTF-8 string. So, do you mean I have to convert the non-ASCII characters to escape sequences?

Colman
I have solved this problem by using the QProcess::startDetached() function to run it as a standalone process; the warning message is then printed on the console. So, I think GStreamer should convert non-ASCII characters to escape sequences if the string is not UTF-8 encoded.

Colman
Yes, when creating a URI from a file name (which can be in any encoding), GStreamer needs to escape non-ASCII characters. I have added a unit test for that, and it looks like it does that, at least in git.

So, some more questions:

 - what version of GStreamer are you using? (gst-inspect-0.10 filesrc | grep Version)

 - what source element is being used here? filesrc?
Oh, and one more question:

 - how do you set the filename/URI? You set a URI on playbin2, right? How do you create that URI exactly, given a filename? What is the output of e.g.

     g_print ("uri: %s\n", uri);

   just before you set it on playbin2?
1. The version is 0.10.35.

2. I don't understand what "filesrc" means. Is it the same as "coreelements"?

3. The URI is a file name taken directly from the SD card (FAT32 formatted) using Qt's QFileSystemModel class. The codepage is cp936 (GB2312, Chinese simplified character set). When the name is returned by QFileSystemModel, it is prefixed with "file://" and then set on playbin2 using g_object_set(). Since my console cannot display Chinese (but my Qt application running on an ARM platform can), I cannot get the correct name shown on the console, but the name is correctly sent to playbin2, which can play it without problem.

Colman
> 2. I don't understand what "filesrc" means. Is it the same as
> "coreelements"?

Never mind this one, it's a core element, yes :)

> 3. The URI is a file name taken directly from the SD card (FAT32 formatted)
> using Qt's QFileSystemModel class. The codepage is cp936 (GB2312, Chinese
> simplified character set). When the name is returned by QFileSystemModel, it
> is prefixed with "file://" and then set on playbin2 using g_object_set().

Ah ok, this is not really correct then (though we do accept it). Try using gst_filename_to_uri() or g_filename_to_uri() to create a URI from a filename (these functions will escape things properly).
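To make the escaping step concrete, here is a minimal plain-C sketch of the kind of percent-escaping that is meant when a file name is turned into a URI. It is not GLib's or GStreamer's code: sketch_filename_to_uri() is a hypothetical helper, it handles only absolute POSIX paths, and its set of unescaped characters is an assumption kept deliberately small, so it may differ from what g_filename_to_uri() leaves unescaped. In real code, calling g_filename_to_uri() or gst_filename_to_uri() remains the better choice.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a GLib or GStreamer API): builds a file:// URI
 * from an absolute POSIX path by percent-escaping every byte outside a
 * small safe set. The bytes are escaped as they are, whatever encoding the
 * file name uses. Returns a malloc()ed string the caller must free(), or
 * NULL on error. */
char *
sketch_filename_to_uri (const char *path)
{
  static const char hex[] = "0123456789ABCDEF";
  static const char prefix[] = "file://";
  size_t len, i;
  char *uri, *out;

  if (path == NULL || path[0] != '/')
    return NULL;                /* only absolute paths are handled */

  len = strlen (path);

  /* worst case: every byte becomes "%XX"; sizeof prefix counts the NUL */
  uri = malloc (sizeof prefix + 3 * len);
  if (uri == NULL)
    return NULL;

  memcpy (uri, prefix, sizeof prefix - 1);
  out = uri + (sizeof prefix - 1);

  for (i = 0; i < len; i++) {
    unsigned char c = (unsigned char) path[i];

    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
        (c >= '0' && c <= '9') || c == '/' || c == '-' || c == '.' ||
        c == '_' || c == '~') {
      *out++ = (char) c;        /* safe ASCII byte, copied unchanged */
    } else {
      *out++ = '%';             /* everything else, including all bytes */
      *out++ = hex[c >> 4];     /* >= 0x80, becomes %XX */
      *out++ = hex[c & 0x0F];
    }
  }
  *out = '\0';

  return uri;
}
```

With a made-up path consisting of /mnt/sd/ followed by the GB2312 bytes D6 D0 CE C4 and .mp3, the helper yields file:///mnt/sd/%D6%D0%CE%C4.mp3, which is plain ASCII and so no longer trips the UTF-8 check when set as the "uri" property.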
Thank you, will try later.

Colman
Colman, do you have any update on this bug?
No. Using the g_filename_to_uri() function solves the problem.

Colman
> No. Using the g_filename_to_uri() function solves the problem.

Ok, thanks, will close as invalid then.

Also added some more tests, but didn't find any problems. It is possible that there's still a bug in some source element of course, but we don't know which and it doesn't look like it's filesrc.

commit 1051eddd4ca117fd8bc4bf013503fb6fb3075fd4
Author: Tim-Philipp Müller <tim.muller@collabora.co.uk>
Date:   Tue Sep 13 21:58:21 2011 +0100

    tests: make sure filesrc returns escaped URIs even if the input was unescaped

    https://bugzilla.gnome.org/show_bug.cgi?id=654673

commit 14a79628a732d040402c09990e43ea80c14a681c
Author: Tim-Philipp Müller <tim.muller@collabora.co.uk>
Date:   Mon Sep 12 15:10:37 2011 +0100

    playbin2: try to catch malformed URIs

    Only log in debug log for now, since the check is a bit
    half-hearted, its purpose is mostly to make sure people
    use gst_filename_to_uri() or g_filename_to_uri().

    https://bugzilla.gnome.org/show_bug.cgi?id=654673