GNOME Bugzilla – Bug 521294
Impossible to catch ConvertError exception
Last modified: 2008-03-11 15:44:36 UTC
When using Glib::Markup::Parser to parse a GMarkup document, if it encounters invalid UTF-8, it throws an exception that seemingly can't be caught, and I don't understand why. Test case attached.
Created attachment 106878 [details]
Test case illustrating the uncatchable exception.

When I run the attached program, I get the following output, even though I'm attempting to catch instances of Glib::MarkupError, Glib::ConvertError, and Glib::Error:

    Glib::MarkupError: terminate called after throwing an instance of 'Glib::ConvertError'
    Aborted (core dumped)
Yes, I get that too, and I can't understand why. Here is a slightly changed test case to make it even more obvious, and here is the gdb backtrace:

    murrayc@murrayc-laptop:~$ gdb ./a.out
    GNU gdb 6.6-debian
    Copyright (C) 2006 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i486-linux-gnu"...
    Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
    (gdb) run
    Starting program: /home/murrayc/a.out
    debug: before parsing
    Glib::MarkupError: terminate called after throwing an instance of 'Glib::ConvertError'

    Program received signal SIGABRT, Aborted.
    0xffffe410 in __kernel_vsyscall ()
    (gdb) bt
Trace 191861
Created attachment 106900 [details] markup_test.cc
Actually, it's operator<< that is throwing the exception. I wonder if that's happening when writing out the exception's what() message.
Indeed, you're correct. I guess I hadn't looked closely enough. I can avoid the abort by using the following line instead:

    std::cerr << "Glib::MarkupError: " << exception.what().raw() << std::endl;

(i.e. using the raw() string, so that operator<< does no charset conversion). I wonder if it's really a good idea to put invalid UTF-8 into a UTF-8 message string...
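For anyone reading this later, here is a minimal sketch of why the original catch clauses never saw the ConvertError: an exception thrown from inside a catch handler is not offered to the sibling handlers of the same try block. It uses only the standard library. MarkupErrorStandIn, ConvertErrorStandIn, ConvertingString, and report() are hypothetical stand-ins, not glibmm types, and the "any byte >= 0x80 fails" rule is a deliberate simplification of a real charset conversion.

```cpp
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for the two exception types (NOT glibmm classes).
struct ConvertErrorStandIn : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct MarkupErrorStandIn : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical string wrapper whose stream insertion "converts" the text
// and may throw, while raw() hands back the bytes untouched.
class ConvertingString {
public:
    explicit ConvertingString(std::string bytes) : bytes_(std::move(bytes)) {}
    const std::string& raw() const { return bytes_; }
private:
    std::string bytes_;
};

// Simplification: treat every byte >= 0x80 as unconvertible.
std::ostream& operator<<(std::ostream& os, const ConvertingString& s) {
    for (unsigned char c : s.raw()) {
        if (c >= 0x80) throw ConvertErrorStandIn("invalid byte sequence");
    }
    return os << s.raw();
}

// Builds the text that the error handler would print.
std::string report(bool use_raw) {
    std::ostringstream out;
    try {
        try {
            throw MarkupErrorStandIn("Invalid UTF-8 encoded text: '\xff\xfe'");
        } catch (const MarkupErrorStandIn& e) {
            ConvertingString msg(e.what());
            if (use_raw) {
                out << "MarkupError: " << msg.raw();  // no conversion, no throw
            } else {
                out << "MarkupError: " << msg;        // throws inside the handler
            }
        } catch (const ConvertErrorStandIn&) {
            // Not entered for the exception thrown in the handler above:
            // sibling handlers only cover the try block itself.
            out << "sibling handler";
        }
    } catch (const ConvertErrorStandIn&) {
        // Only an enclosing try block can catch it. The test case had none,
        // which is why std::terminate was called.
        out << " [escaped to outer handler]";
    }
    return out.str();
}
```

Calling report(false) leaves the "MarkupError: " prefix in the stream before the conversion throws, which mirrors the truncated "Glib::MarkupError: " line in the output above. Calling report(true) corresponds to the raw() workaround.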
Just for reference, this is the output I get with the raw() call above (assuming Bugzilla doesn't do something weird to it):

    debug: before parsing
    Glib::MarkupError: Error on line 1 char 6: Invalid UTF-8 encoded text - not valid '�`���اH���'
I consider this a glib bug: GError messages are meant to be displayed, so they should be valid UTF-8. I'll make a C test case and file it eventually if you don't first.
I fixed this in glib in bug 521591.