GNOME Bugzilla – Bug 79897
UTF-8 validate all external strings
Last modified: 2004-12-22 21:47:04 UTC
GIMP runs well and everything is OK until I save a file; on saving, it crashes. Console output:

** (process:18152): WARNING (recursed) **: Invalid UTF8 string passed to pango_layout_set_text()
aborting...
gimp-1.3: terminated: Cancelled
It is difficult to know what is wrong because you did not give enough information in your bug report:

- When does the crash occur? Before or after the File Save dialog? Before or after the save options dialog (for the file formats that have additional options, such as JPEG, GIF, PNG, ...)?
- What type of file are you trying to save? Does it occur for all file formats, or only for some specific types?
- What locale are you using (language settings)? This might be a translation error.

By the way, the developers' version of the GIMP (from CVS) is not in a stable state for the moment. Since new bugs can come and go on a daily basis, it is better (for now) to discuss this on the developers' mailing list instead of submitting a bug report, because the code changes rather fast.

For the moment, it is better to use Bugzilla for bugs related to the stable version (1.2.x) or for enhancements for the 1.3.x version or future versions. Once the 1.3.x version becomes a bit more stable (when we are closer to the 1.4 release), then Bugzilla will be the best channel for all bug reports.
I can't reproduce this here. It very much depends on the locale settings and probably the filename you tried to use. If you are using ISO-8859-2 filenames you absolutely need to set LC_CTYPE (or LC_ALL) to something like pl_PL.ISO-8859-2, otherwise g_locale_to_utf8() and g_locale_from_utf8() will not be able to work correctly on the filename you entered. Of course we need to make the code more robust so it doesn't crash if the conversion fails.
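To illustrate why the locale matters here, the following is a minimal Python sketch (not GIMP or GLib code) showing that the bytes of an ISO-8859-2 filename are not valid UTF-8 and only decode correctly when the right charset is known. The filename and the helper `is_valid_utf8` are made up for the example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical Polish filename as it would be stored by an ISO-8859-2 locale.
raw_name = "zdjęcie.xcf".encode("iso-8859-2")

# Handing these bytes to a UTF-8-only API is what triggers the warning.
print(is_valid_utf8(raw_name))         # False

# Decoding with the charset the name was written in recovers the text.
print(raw_name.decode("iso-8859-2"))   # zdjęcie.xcf
```

With the wrong LC_CTYPE, the conversion functions have no way to know that the second decode is the one to use, which matches the failure described above.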
If your filename encoding is not UTF-8, you also need to set the environment variable G_BROKEN_FILENAMES in order to make the g_filename_[to|from]_utf8() functions GIMP uses work. We also need to add generic UTF-8 validation for every string that is passed into the GIMP core via the PDB using pdbgen.
Made the XCF loader handle invalid UTF-8:

2002-05-30  Michael Natterer  <mitch@gimp.org>

    * app/xcf/xcf-read.c (xcf_read_string): UTF-8 validate all strings
    and try g_locale_to_utf8() as fallback if it fails.
As Mitch outlined, this is fixed for the particular case that has been reported, but there are still other possible ways strings could sneak into The GIMP without being validated.
What's the fix to this? For every dialog with a text box, UTF-8 validate the input? Isn't this done automatically in gtk+? In any case, this seems like a bug that needs to be addressed before the stable release, but isn't a blocker for a pre-release. Setting milestone to 2.0. Dave.
Preferably, the string param handling in pdbgen would allow specifying whether a string param needs to be UTF-8 validated, and would generate a PDB_CALLING_ERROR if validation fails, as it does for out-of-range enum values.
Wouldn't be hard to do. I'll try to get to it in the next couple days.
2003-07-29  Manish Singh  <yosh@gimp.org>

    * tools/pdbgen/app.pl: added a utf8 option for string input
    parameters, and validate them.

    * tools/pdbgen/pdb/text_tool.pdb: make the text parameter use it.
    Partially addresses #79897. Also remove references to XLFD in the
    doc text.

Someone should go through the rest of the pdb stuff and see what other string parameters should be marked that way. Maybe if a lot of them do, utf8 should be the default unless otherwise specified.
The parasite PDB API is UTF-8 safe now:

2003-08-17  Michael Natterer  <mitch@gimp.org>

    Fixed bug #79897 for all parasite procedures:

    * tools/pdbgen/app.pl: UTF-8 validate parasite->name.

    * tools/pdbgen/pdb/parasite.pdb: UTF-8 validate parasite names
    which are passed separately from the parasite struct.

    * app/pdb/parasite_cmds.c: regenerated.
Fixed the whole PDB except the strings that may be NULL:

2003-08-18  Michael Natterer  <mitch@gimp.org>

    * tools/pdbgen/pdb/brush_select.pdb
    * tools/pdbgen/pdb/brushes.pdb
    * tools/pdbgen/pdb/channel.pdb
    * tools/pdbgen/pdb/convert.pdb
    * tools/pdbgen/pdb/drawable.pdb
    * tools/pdbgen/pdb/fileops.pdb
    * tools/pdbgen/pdb/font_select.pdb
    * tools/pdbgen/pdb/gimprc.pdb
    * tools/pdbgen/pdb/gradient_select.pdb
    * tools/pdbgen/pdb/gradients.pdb
    * tools/pdbgen/pdb/layer.pdb
    * tools/pdbgen/pdb/message.pdb
    * tools/pdbgen/pdb/palette.pdb
    * tools/pdbgen/pdb/palette_select.pdb
    * tools/pdbgen/pdb/palettes.pdb
    * tools/pdbgen/pdb/paths.pdb
    * tools/pdbgen/pdb/pattern_select.pdb
    * tools/pdbgen/pdb/patterns.pdb
    * tools/pdbgen/pdb/plug_in.pdb
    * tools/pdbgen/pdb/procedural_db.pdb
    * tools/pdbgen/pdb/text_tool.pdb
    * tools/pdbgen/pdb/unit.pdb: UTF-8 validate all strings except
    filenames. Does not work yet for string params which may be NULL.
    They currently don't get checked because I still don't understand
    pdbgen enough :)

    * app/pdb/brush_select_cmds.c
    * app/pdb/brushes_cmds.c
    * app/pdb/channel_cmds.c
    * app/pdb/convert_cmds.c
    * app/pdb/drawable_cmds.c
    * app/pdb/fileops_cmds.c
    * app/pdb/font_select_cmds.c
    * app/pdb/gimprc_cmds.c
    * app/pdb/gradient_select_cmds.c
    * app/pdb/gradients_cmds.c
    * app/pdb/message_cmds.c
    * app/pdb/palette_select_cmds.c
    * app/pdb/palettes_cmds.c
    * app/pdb/paths_cmds.c
    * app/pdb/pattern_select_cmds.c
    * app/pdb/patterns_cmds.c
    * app/pdb/plug_in_cmds.c
    * app/pdb/procedural_db_cmds.c
    * app/pdb/text_tool_cmds.c
    * app/pdb/unit_cmds.c: regenerated.
2003-08-19  Manish Singh  <yosh@gimp.org>

    * tools/pdbgen/app.pl: Default all strings to validate UTF-8, use
    no_validate to disable. Also added a null_ok parameter which does
    validate UTF-8, but allows NULL.

    * tools/pdbgen/pdb/brush_select.pdb
    * tools/pdbgen/pdb/brushes.pdb
    * tools/pdbgen/pdb/channel.pdb
    * tools/pdbgen/pdb/convert.pdb
    * tools/pdbgen/pdb/fileops.pdb
    * tools/pdbgen/pdb/font_select.pdb
    * tools/pdbgen/pdb/gimprc.pdb
    * tools/pdbgen/pdb/gradient_select.pdb
    * tools/pdbgen/pdb/gradients.pdb
    * tools/pdbgen/pdb/help.pdb
    * tools/pdbgen/pdb/image.pdb
    * tools/pdbgen/pdb/layer.pdb
    * tools/pdbgen/pdb/message.pdb
    * tools/pdbgen/pdb/palette_select.pdb
    * tools/pdbgen/pdb/palettes.pdb
    * tools/pdbgen/pdb/parasite.pdb
    * tools/pdbgen/pdb/paths.pdb
    * tools/pdbgen/pdb/pattern_select.pdb
    * tools/pdbgen/pdb/patterns.pdb
    * tools/pdbgen/pdb/plug_in.pdb
    * tools/pdbgen/pdb/procedural_db.pdb
    * tools/pdbgen/pdb/text_tool.pdb
    * tools/pdbgen/pdb/unit.pdb: removed utf8, added no_validate and
    null_ok where appropriate.

    * app/pdb/brush_select_cmds.c
    * app/pdb/font_select_cmds.c
    * app/pdb/gradient_select_cmds.c
    * app/pdb/layer_cmds.c
    * app/pdb/palette_select_cmds.c
    * app/pdb/pattern_select_cmds.c
    * app/pdb/plug_in_cmds.c: regenerated.
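The parameter rules described in this entry can be sketched roughly as follows. This is a Python illustration of the logic only, not the Perl in app.pl or the C that pdbgen generates. The names `Status` and `check_string_param` are hypothetical, and how a NULL string is treated when only no_validate is set is an assumption, since the entry does not say.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    SUCCESS = "success"
    CALLING_ERROR = "calling-error"  # stands in for PDB_CALLING_ERROR


def check_string_param(value: Optional[bytes],
                       no_validate: bool = False,
                       null_ok: bool = False) -> Status:
    """Check one incoming string argument before the procedure body runs."""
    if value is None:
        # NULL is only accepted when the parameter is marked null_ok.
        # Assumption: no_validate alone does not make NULL acceptable.
        return Status.SUCCESS if null_ok else Status.CALLING_ERROR
    if no_validate:
        # Opt-out, e.g. for filenames, which may be in the locale encoding.
        return Status.SUCCESS
    try:
        value.decode("utf-8")  # default: every string must be valid UTF-8
    except UnicodeDecodeError:
        return Status.CALLING_ERROR
    return Status.SUCCESS


print(check_string_param(b"layer name"))                 # Status.SUCCESS
print(check_string_param(b"\xea\x63"))                   # Status.CALLING_ERROR
print(check_string_param(None, null_ok=True))            # Status.SUCCESS
print(check_string_param(b"\xea\x63", no_validate=True)) # Status.SUCCESS
```

Making validation the default and requiring an explicit opt-out means a newly added string parameter is checked unless someone decides otherwise.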
With that change, is there anything left to be done before closing this?
Hi, Doesn't look like it to me. Resolving as FIXED until someone proves otherwise. Dave.
Gah! Changing status for real this time. Dave.
The various data file loading functions need to get UTF-8 validation in a way that tries the current locale as a fallback if validation fails. This is needed to support brushes etc. with non-UTF-8 names saved by older GIMPs.
Fixed in CVS:

2003-10-16  Michael Natterer  <mitch@gimp.org>

    * libgimpbase/gimputils.[ch]: new function which takes any string
    and returns UTF-8 (it returns "(invalid UTF-8 string)" if all
    conversion attempts fail).

    * app/core/gimpbrush.c
    * app/core/gimpgradient.c
    * app/core/gimppalette.c
    * app/core/gimppattern.c
    * app/xcf/xcf-read.c: use it. Fixes bug #79897.
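The behaviour described for the new helper (accept any string, return UTF-8, fall back to a placeholder) can be sketched as below. This is a Python approximation, not the C code in libgimpbase/gimputils.c. The name `any_to_utf8` and the `locale_encoding` parameter are assumptions made for the example, and the real function may try other conversions than the single locale fallback shown here.

```python
import locale
from typing import Optional

PLACEHOLDER = "(invalid UTF-8 string)"


def any_to_utf8(data: bytes, locale_encoding: Optional[str] = None) -> str:
    """Return text for data, trying UTF-8 first and the locale charset second."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        pass  # not UTF-8, perhaps written by an older GIMP in a legacy charset

    # locale_encoding is a hypothetical override so the example is deterministic.
    encoding = locale_encoding or locale.getpreferredencoding(False)
    try:
        return data.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Every conversion attempt failed: return a safe, displayable string.
        return PLACEHOLDER


# A brush name saved in ISO-8859-2 is recovered through the fallback.
old_name = "Pędzel".encode("iso-8859-2")
print(any_to_utf8(old_name, "iso-8859-2"))   # Pędzel

# Bytes that fit neither encoding yield the placeholder instead of a crash.
print(any_to_utf8(b"\xff\xfe", "ascii"))     # (invalid UTF-8 string)
```

Returning a placeholder instead of failing keeps the loaders usable on old data files while ensuring nothing invalid is passed on to Pango.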