GNOME Bugzilla – Bug 620579
Accept unicode objects in addition to strings
Last modified: 2010-12-17 17:25:18 UTC
Right now, if a unicode object is passed in as a value, it is rejected instead of being encoded.
Confirmed with current pygobject 2.27.0, still requires an explicit .encode('UTF-8') everywhere.
On Python 2? Can you provide an example? Python 3 handles unicode much better.
Yes, it happens with Python 2.6 and 2.7:
$ python -c 'from gi.repository import Gtk; Gtk.require_version("3.0"); l=Gtk.Label(); l.set_label(u"foo")'
(-c:22055): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
Traceback (most recent call last):
+ Trace 224723
return info.invoke(*args)
Created attachment 174712: when converting to UTF-8 accept Python Unicode objects as input (Python 2)
Does this work for you?
It does accept unicode objects now, and as long as the strings contain only ASCII characters, it works. However, it now crashes on non-ASCII characters:
$ python -c 'import locale; locale.setlocale(locale.LC_ALL, ""); print locale.getlocale(); from gi.repository import Gtk; Gtk.require_version("3.0"); l=Gtk.Label(); l.set_label(u"ä")'
('de_DE', 'UTF8')
Traceback (most recent call last):
+ Trace 224742
Explicitly calling Gtk.set_locale() doesn't help either (although that shouldn't be required in the first place). Thanks!
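For anyone following along, here is a Gtk-free sketch of the likely failure mode. It assumes (this thread does not confirm it) that the conversion fell back to an ASCII encode rather than UTF-8, which would explain why u"foo" works while u"ä" does not. The helper name try_encode is made up for this illustration.

```python
# -*- coding: utf-8 -*-
# Assumption: the crash comes from an implicit ASCII encode.
# No Gtk is needed to see the difference between the two codecs.


def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


ascii_only = u"foo"
non_ascii = u"\u00e4"  # the "ä" from the reproducer above

print(try_encode(ascii_only, "ascii") is not None)   # True
print(try_encode(non_ascii, "ascii") is not None)    # False
print(try_encode(non_ascii, "utf-8") == b"\xc3\xa4")  # True
```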
Martin, is this a regression or has it never worked with PyGObject/PyGtk?
It is a regression from pygtk. It has always "just worked" with pygtk to supply unicode objects to pygtk objects, and strings were displayed normally. I am currently porting a pygtk (2.0) project of mine to pygi Gtk 3.0, and that's how I noticed that in the first place. My current workaround is to add .encode('UTF-8') everywhere. I don't know whether it has ever worked with pygobject and GI, since I really just started using it a few days ago.
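Until a fix lands, the .encode('UTF-8') workaround can be kept in one place instead of being repeated at every call site. A minimal sketch; to_utf8 is a hypothetical helper name and is not part of pygobject or Gtk.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper, not part of pygobject or Gtk.

text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


def to_utf8(value):
    """Encode text objects as UTF-8 bytes; pass everything else through."""
    if isinstance(value, text_type):
        return value.encode("utf-8")
    return value


print(repr(to_utf8(u"\u00e4")))
```

On Python 2, a call site would then read l.set_label(to_utf8(u"ä")). Values that are already byte strings pass through unchanged.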
Created attachment 174795: when converting to UTF-8 accept Python Unicode objects as input (Python 2)
Ha, git-bz doesn't handle unicode either, so I had to attach this patch manually. This should fix your issue.
I should note that Python 3 makes these issues go away (i.e. it handles Unicode properly).
Awesome, that does it. Thank you!
With Python 2 and this pygobject master (i.e. with this patch), I'm getting TypeErrors in Totem with the patch in bug #619039 applied:
Traceback (most recent call last):
+ Trace 224746
self.os_append_menu()
tooltip=_(u"Download movie subtitles from OpenSubtitles"))
The file has "# -*- coding: utf-8 -*-" specified at the top. The code in question:
self.action = Gtk.Action(name='opensubtitles',
                         label=_(u'_Download Movie Subtitles…'),
                         tooltip=_(u"Download movie subtitles from OpenSubtitles"))
(Note the Unicode ellipsis in the "label" property value.)
Is this a problem at my end?
No, we handle properties a bit differently. To be clear, the patch didn't break you, it just didn't fix your issue - correct?
(In reply to comment #13)
> No, we handle properties a bit differently. To be clear, the patch didn't break
> you, it just didn't fix your issue - correct?
Correct.
In case it's helpful for testing, here is a small standalone reproducer:
python -c 'from gi.repository import Gtk; Gtk.require_version("3.0"); Gtk.MessageDialog(message_format=u"hello ♥").run()'
It does accept the unicode argument just fine (u"hello" works). Explicitly setting the locale doesn't help. With old pygtk2 this just worked:
python -c 'import gtk; gtk.MessageDialog(message_format="hello ♥").run()'
For the old static bindings, pango (which is imported by gtk) calls PyUnicode_SetDefaultEncoding("utf-8"). However, the oldest open PyGTK bug asks to avoid calling that, see https://bugzilla.gnome.org/show_bug.cgi?id=132040
I guess that should be done at least for Python 2.x, for compatibility. For Python 3.x there's a chance to avoid doing that. All strings in GLib/Gtk are UTF-8, and relevant Linux distributions also use UTF-8 everywhere; see http://fedoraproject.org/wiki/Features/PythonEncodingUsesSystemLocale for instance.
Either way, pygobject should do the right thing by default; it should not be necessary to set the locale manually.
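To check what the interpreter falls back on when no codec is named, the default encoding can be inspected from Python itself. A small sketch using only the standard library; the printed values depend on the Python version and the environment, so none are shown here.

```python
import locale
import sys

# The codec used for implicit str/unicode conversions:
# "ascii" on a stock Python 2, "utf-8" on Python 3.
default_encoding = sys.getdefaultencoding()

# The encoding suggested by the user's locale settings (for example "UTF-8").
preferred_encoding = locale.getpreferredencoding()

print("default:   %s" % default_encoding)
print("preferred: %s" % preferred_encoding)
```

On a stock Python 2 the two values usually differ, which is consistent with explicitly setting the locale not helping in the reproducers above.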
John, any news?
Sorry, no. I haven't looked at it. I'll look a bit more tomorrow.
Created attachment 176550: Properly handle unicode object in properties
There are still some caveats in Python 2:
- properties are returned as String objects with the unicode code points
- you must add # coding=utf-8 to the top of your Python file, or Python will error out if it sees embedded Unicode characters (such as when supporting Python 3 and Python 2 from the same source)
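A rough sketch of how calling code might cope with both caveats, with no Gtk involved. It assumes that a property read back on Python 2 arrives as a UTF-8 encoded byte string; the `returned` value below only simulates that, and as_text is a hypothetical helper name.

```python
# coding=utf-8
# The declaration above has to be on the first or second line of a Python 2
# file that contains non-ASCII literals such as the heart below.

text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3

label = u"hello ♥"

# Stand-in for a property read back on Python 2 (assumption: UTF-8 bytes).
returned = label.encode("utf-8")


def as_text(value):
    """Decode UTF-8 bytes to a text object; leave text objects unchanged."""
    if isinstance(value, text_type):
        return value
    return value.decode("utf-8")


print(as_text(returned) == label)  # True
```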
This passes Martin's example in comment 15, except that when it is run from the command line my terminal converts the heart to its Unicode code points. When run from a file with # coding=utf8 at the top, this works flawlessly. Note that Python 2 still has some drawbacks when working with UTF-8 that aren't going to be fixed, but at least now we accept unicode objects.
Please test the patch and let me know if it fixes your issue, so I can commit it.
It fixes the issue for Totem, thanks!