Bug 654673 - non-UTF-8 filename causes warning in URI query
Status: RESOLVED INVALID
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
Version: 0.10.35
Hardware: Other Linux
Importance: Normal normal
Target Milestone: NONE
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2011-07-15 10:10 UTC by pklai
Modified: 2011-09-13 21:01 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
mp3 with GB2312 song name (1000.00 KB, audio/mpeg)
2011-07-15 11:19 UTC, pklai
The song name (101 bytes, application/octet-stream)
2011-07-15 12:42 UTC, pklai

Description pklai 2011-07-15 10:10:41 UTC
When the message below appears, playbin2 may crash:
(<unknown>:3889): GStreamer-WARNING **: Trying to set string on structure field 'uri', but string is not valid UTF-8. Please file a bug.
Comment 1 Tim-Philipp Müller 2011-07-15 10:31:33 UTC
(This shouldn't cause a crash, unless you set your environment or tell glib to abort on warnings.)

But anyway, could you attach the file please? (The first few kB will probably do, if the tag is at the beginning: head --bytes=990k foo.mp3 > head.mp3)
Comment 2 pklai 2011-07-15 11:19:53 UTC
Created attachment 192022
mp3 with GB2312 song name
Comment 3 pklai 2011-07-15 11:21:00 UTC
It may not be related to the ID3 tag. It may be caused by the URL being in GB2312 encoding (Chinese characters). The song is on an SD card that has a Chinese name in GB2312. The behavior is strange:
I wrapped a playbin2 in a program that runs as a daemon and communicates with the main Qt application through a TCP socket. When the program is started separately from a Linux shell, it works even though the warning message appears. But when it is run from inside the Qt application through a QProcess, using "&>" to redirect all messages to a file, it terminates suddenly (without creating the redirected file) just before the warning message would appear. I don't know what is happening.

Attached are the first 1M bytes of the song. The file has a Chinese name.
Comment 4 pklai 2011-07-15 11:33:15 UTC
I have tried to find out where the warning message is emitted. After I removed the g_warning() statement, I could start the daemon through QProcess inside the Qt application. So, in the meantime, how can I mask out all the warning messages from GStreamer?

Colman
Comment 5 Tim-Philipp Müller 2011-07-15 11:36:45 UTC
Oh I see, it's not from the tag at all.

Could you run this in gdb (and ideally install debugging symbols for gstreamer + plugins):

 $ G_DEBUG=fatal_warnings gdb --args /usr/bin/gst-launch-0.10 playbin2 uri=file:///path/to/filename.mp3
 (gdb) run
 ... wait for abort ..
 (gdb) bt
  ... paste output from here ...
Comment 6 Tim-Philipp Müller 2011-07-15 11:39:01 UTC
You can play around with the fatal mask stuff from GLib (to override your environment variable settings?): http://developer.gnome.org/glib/2.29/glib-Message-Logging.html
Comment 7 pklai 2011-07-15 11:46:56 UTC
It is an embedded Linux system. I don't think it has enough resources to run gdb; it only has 64 MB of RAM.

By the way, how can I mask out all console messages when it is run as a daemon?

Colman
Comment 8 Tim-Philipp Müller 2011-07-15 11:58:00 UTC
Have you tried reproducing the problem on a desktop system? I'm sure it'll happen here as well.

Without gdb or a way to get a stack trace, this is going to be quite painful to debug.

What's the source element being used here? filesrc?

Can you do this:

 $ ls -1 /path/to/file.mp3 > filename.dump

and attach filename.dump (or the output of hexdump -C filename.dump)?
Comment 9 pklai 2011-07-15 12:42:17 UTC
Created attachment 192026
The song name

Please find the song name attached.
I believe the problem is related to the g_warning() in gst_structure_set_field() in gststructure.c, shown below:

else if (G_UNLIKELY (s != NULL && !g_utf8_validate (s, -1, NULL))) {
  g_warning ("Trying to set string on %s field '%s', but string is not "
             "valid UTF-8. Please file a bug.",
             IS_TAGLIST (structure) ? "taglist" : "structure",
             g_quark_to_string (field->name));
  g_value_unset (&field->value);
  return;
}
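
For illustration, here is a simplified sketch of why a GB2312 file name trips this check. It is not GLib's g_utf8_validate() implementation: the helper name looks_like_utf8() is hypothetical, and it only checks the lead-byte/continuation-byte structure (no overlong-form or range checks). The sample bytes in the usage note are the GB2312 and UTF-8 encodings of the two characters "中文", chosen as an example; they are not taken from the attachment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified structural UTF-8 check (hypothetical helper, not GLib code).
 * It verifies lead bytes and continuation bytes only; unlike
 * g_utf8_validate() it does not reject overlong forms, surrogates,
 * or code points above U+10FFFF. */
static bool looks_like_utf8(const unsigned char *s)
{
    assert(s != NULL);

    while (*s != '\0') {
        size_t extra;   /* number of continuation bytes expected */

        if (*s < 0x80)
            extra = 0;                  /* 0xxxxxxx: ASCII */
        else if ((*s & 0xE0) == 0xC0)
            extra = 1;                  /* 110xxxxx */
        else if ((*s & 0xF0) == 0xE0)
            extra = 2;                  /* 1110xxxx */
        else if ((*s & 0xF8) == 0xF0)
            extra = 3;                  /* 11110xxx */
        else
            return false;               /* stray continuation or invalid lead byte */

        s++;
        for (; extra > 0; extra--, s++) {
            /* Every continuation byte must be 10xxxxxx. This test also
             * fails on the terminating NUL, so we never read past it. */
            if ((*s & 0xC0) != 0x80)
                return false;
        }
    }
    return true;
}
```

With this helper, the UTF-8 bytes "\xE4\xB8\xAD\xE6\x96\x87" pass, while the GB2312 bytes "\xD6\xD0\xCE\xC4" fail: 0xD6 looks like the lead byte of a two-byte sequence, but the following 0xD0 is not a continuation byte.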

I am looking at the g_log_set_fatal_mask(const gchar *log_domain, GLogLevelFlags fatal_mask) function. What should I pass as
"log_domain" in order to mask out the "GStreamer-WARNING ..." message?

Colman
Comment 10 Tim-Philipp Müller 2011-07-15 12:50:40 UTC
> I believe the problem is related to the g_warning() in
> gst_structure_set_field() in gststructure.c, shown below:

Well yes, of course it is. That much we knew, but what we need to know is what element / piece of code created the broken URI in the first place (a URI should always be ASCII, with non-ASCII characters escaped).
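
To illustrate what "escaped" means here, a minimal sketch in plain C of percent-escaping a file name byte by byte. This is not GStreamer's or GLib's implementation; the function name path_to_file_uri() is hypothetical, and the sketch assumes an absolute Unix-style path and keeps only the RFC 3986 unreserved characters plus '/' unescaped.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build a file:// URI from an absolute path, percent-escaping every byte
 * that is not an RFC 3986 unreserved character or '/'. It works on raw
 * bytes, so the file name encoding (GB2312, Big5, UTF-8, ...) does not
 * matter. Returns a malloc()ed string, or NULL on allocation failure. */
static char *path_to_file_uri(const char *path)
{
    static const char hex[] = "0123456789ABCDEF";
    static const char prefix[] = "file://";

    assert(path != NULL && path[0] == '/');

    /* Worst case: every input byte becomes "%XX". */
    char *uri = malloc(strlen(prefix) + 3 * strlen(path) + 1);
    if (uri == NULL)
        return NULL;

    memcpy(uri, prefix, strlen(prefix));
    char *out = uri + strlen(prefix);

    for (const unsigned char *p = (const unsigned char *)path; *p; p++) {
        unsigned char c = *p;
        int keep = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                   (c >= '0' && c <= '9') ||
                   c == '-' || c == '.' || c == '_' || c == '~' || c == '/';
        if (keep) {
            *out++ = (char)c;
        } else {
            *out++ = '%';
            *out++ = hex[c >> 4];
            *out++ = hex[c & 0x0F];
        }
    }
    *out = '\0';
    return uri;
}
```

The result contains only ASCII, so it is trivially valid UTF-8 regardless of the encoding of the original file name. The caller releases it with free().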


> I am looking at the g_log_set_fatal_mask(const gchar *log_domain,
> GLogLevelFlags fatal_mask) function. What should I pass as
> "log_domain" in order to mask out the "GStreamer-WARNING ..." message?

Masking the message isn't going to help. The bug needs to be found and fixed. Other things won't work properly as a result of this issue (whether the warning is displayed or not).
Comment 11 pklai 2011-07-16 02:09:18 UTC
Actually, it is not a broken URI; it is a file in local storage. The OS allows non-ASCII file names, such as Big5- or GB2312-encoded ones. GStreamer can find and play the file without problems; it only emits a warning because the name is not a UTF-8 string. So, do you mean I have to convert the non-ASCII characters to escape sequences?

Colman
Comment 12 pklai 2011-07-16 02:48:26 UTC
I have solved this problem by using the QProcess::startDetached() function to run it as a standalone process; the warning message is then printed on the console.

So, I think GStreamer should convert non-ASCII characters to escape sequences if the string is not UTF-8 encoded.
 
Colman
Comment 13 Tim-Philipp Müller 2011-07-16 11:26:08 UTC
Yes, when creating a URI from a file name (which can be in any encoding), GStreamer needs to escape non-ASCII characters. I have added a unit test for that, and it looks like it does that, at least in git.

So, some more questions:

 - what version of GStreamer are you using?
   (gst-inspect-0.10 filesrc | grep Version)

 - what source element is being used here?
   filesrc?
Comment 14 Tim-Philipp Müller 2011-07-16 11:30:43 UTC
Oh, and one more question:

 - how do you set the filename/URI? You set a
   URI on playbin2, right? How do you create that
   URI exactly, given a filename? What is the output
   of e.g.

   g_print ("uri: %s\n", uri);

   just before you set it on playbin2?
Comment 15 pklai 2011-07-18 01:52:23 UTC
1. The version is 0.10.35
2. I don't understand what "filesrc" means. Is it the same as "coreelements"?
3. The URI is a file name obtained directly from the SD card (FAT32 formatted) using Qt's QFileSystemModel class. The codepage is cp936 (GB2312, the simplified Chinese character set). When the name is returned by QFileSystemModel, a "file://" prefix is added and the result is set on playbin2 with g_object_set(). Since my console cannot display Chinese (but my Qt application running on an ARM platform can), I cannot get the correct name shown on the console, but the name is correctly sent to playbin2, which plays it without problems.

Colman
Comment 16 Tim-Philipp Müller 2011-07-18 09:29:30 UTC
> 2. I don't understand what "filesrc" means. Is it the same as
> "coreelements"?

Never mind this one, it's a core element, yes :)

> 3. The URI is a file name obtained directly from the SD card (FAT32
> formatted) using Qt's QFileSystemModel class. The codepage is cp936 (GB2312,
> the simplified Chinese character set). When the name is returned by
> QFileSystemModel, a "file://" prefix is added and the result is set on
> playbin2 with g_object_set().

Ah ok, this is not really correct then (though we do accept it). Try using gst_filename_to_uri() or g_filename_to_uri() to create a URI from a filename (these functions will escape things properly).
Comment 17 pklai 2011-07-18 09:44:41 UTC
Thank you,

Will try later.

Colman
Comment 18 Fabio Durán Verdugo 2011-09-04 22:32:34 UTC
Colman, do you have any update on this bug?
Comment 19 pklai 2011-09-05 01:26:44 UTC
No. Using the g_filename_to_uri() function solves the problem.

Colman
Comment 20 Tim-Philipp Müller 2011-09-13 21:01:29 UTC
> No. Using the g_filename_to_uri() function solves the problem.

Ok, thanks, will close as invalid then.

Also added some more tests, but didn't find any problems. It is possible that there's still a bug in some source element of course, but we don't know which and it doesn't look like it's filesrc.

commit 1051eddd4ca117fd8bc4bf013503fb6fb3075fd4
Author: Tim-Philipp Müller <tim.muller@collabora.co.uk>
Date:   Tue Sep 13 21:58:21 2011 +0100

    tests: make sure filesrc returns escaped URIs even if the input was unescaped
    
    https://bugzilla.gnome.org/show_bug.cgi?id=654673


commit 14a79628a732d040402c09990e43ea80c14a681c
Author: Tim-Philipp Müller <tim.muller@collabora.co.uk>
Date:   Mon Sep 12 15:10:37 2011 +0100

    playbin2: try to catch malformed URIs
    
    Only log in debug log for now, since the check is a bit
    half-hearted, its purpose is mostly to make sure people
    use gst_filename_to_uri() or g_filename_to_uri().
    
    https://bugzilla.gnome.org/show_bug.cgi?id=654673