Bug 683667 – gnome-clock crashes when an alarm name contains non-ascii characters.


Summary:	gnome-clock crashes when an alarm name contains non-ascii characters.


Status:	RESOLVED FIXED

Product:	gnome-clocks
Classification:	Applications
Component:	general
Version:	0.1.x
Hardware:	Other Linux

Importance:	Normal critical
Target Milestone:	---
Assigned To:	Clocks maintainer(s)
QA Contact:	Clocks maintainer(s)

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2012-09-09 13:08 UTC by Jiro Matsuzawa
Modified:	2012-09-10 17:16 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
the sample alarm data (81 bytes, application/octet-stream) 2012-09-09 13:08 UTC, Jiro Matsuzawa
patch for this bug (1.24 KB, patch) 2012-09-09 13:14 UTC, Jiro Matsuzawa

Description Jiro Matsuzawa 2012-09-09 13:08:57 UTC

Created attachment 223847
the sample alarm data

gnome-clocks crashes when an alarm name contains non-ASCII characters.

Steps to reproduce:
1. Run gnome-clocks.
2. Click the Alarm button.
3. Click the New button to add a new alarm.
4. Enter any non-ASCII characters (e.g. Japanese "あ", whose code point is U+3042) in the Name entry.
5. Click the Save button to save the alarm.
6. Click the Quit button to exit gnome-clocks.
7. Run gnome-clocks again. It crashes on startup.


Traceback:
```
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/gnomeclocks/app.py", line 397, in do_activate
    self.win = Window(self)
  File "/usr/lib/python2.7/dist-packages/gnomeclocks/app.py", line 65, in __init__
    self.alarm = Alarm()
  File "/usr/lib/python2.7/dist-packages/gnomeclocks/alarm.py", line 321, in __init__
    self.load_alarms()
  File "/usr/lib/python2.7/dist-packages/gnomeclocks/alarm.py", line 369, in load_alarms
    self.add_alarm_widget(alarm)
  File "/usr/lib/python2.7/dist-packages/gnomeclocks/alarm.py", line 381, in add_alarm_widget
    label = GLib.markup_escape_text(alarm.name)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u3042' in position 0: ordinal not in range(128)
```



Sample alarm data (I've attached):
$ cat ~/.local/share/gnome-clocks/alarms.json 
[{"minute": "56", "name": "\u3042", "hour": "21", "days": [0, 1, 2, 3, 4, 5, 6]}]
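
The failure can be reproduced without GTK. The sketch below loads the sample data quoted above and encodes the name as ASCII, which is an assumption about what happens implicitly under Python 2 when the unicode object is handed to GLib.markup_escape_text() (the 'ascii' codec in the traceback suggests this). The helper name ascii_error is hypothetical and only serves the illustration.

```python
import json

# Content of the attached alarms.json, as quoted in this report.
RAW = '[{"minute": "56", "name": "\\u3042", "hour": "21", "days": [0, 1, 2, 3, 4, 5, 6]}]'

# json.loads() turns the \u3042 escape into a text (unicode) object.
name = json.loads(RAW)[0]["name"]


def ascii_error(text):
    """Return the error message raised by encoding text as ASCII, or None."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


# Encoding as ASCII fails for U+3042; encoding as UTF-8 succeeds.
message = ascii_error(name)
utf8_name = name.encode("utf-8")
```

Here message holds the same kind of "ordinal not in range(128)" error as the traceback, while utf8_name is the three-byte UTF-8 form of the character.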

Comment 1 Jiro Matsuzawa 2012-09-09 13:14:34 UTC

Created attachment 223848
patch for this bug

Handle unicode encoding/decoding.

Please review it.

Comment 2 Paolo Borelli 2012-09-09 13:20:04 UTC

The text retrieved from a GTK entry is always encoded as UTF-8, so there should be no need to decode it. How did you trigger the bug? Is your system not using UTF-8 by default? Maybe the problem is the JSON serialization and we must force it to save as UTF-8.

Comment 3 Jiro Matsuzawa 2012-09-09 13:43:21 UTC

(In reply to comment #2)
> The text retrieved from a gtk entry is always encoded as utf8, there should be
> no need to decode it... how did you trigger the bug? Is your system not using
> utf8 by default? Maybe the problem is the json serialization and we must force
> it to save as utf8

I've already described how to reproduce the bug. Please see the above steps to reproduce.

If you enter a non-ASCII string as the alarm name, a decoded unicode string (e.g. "name": "\u3042") is dumped to alarms.json. You must encode the unicode string from alarms.json as UTF-8 because GLib.markup_escape_text() requires a UTF-8 string. That is why my patch encodes the string that is passed to GLib.markup_escape_text(). But as you said, a GTK entry returns an encoded UTF-8 string, which is passed to GLib.markup_escape_text() at the same location, too. So you must decode the UTF-8 string that the GTK entry returned before you pass it to GLib.markup_escape_text().
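
Since the same call site can receive either a unicode object (from alarms.json) or UTF-8 bytes (from the entry), one option is to normalize both forms in one place. This is a minimal sketch, not the code from the attached patch: to_text and to_utf8 are hypothetical helpers, and the sketch assumes that any byte string it sees is UTF-8.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) object, decoding bytes if needed."""
    if isinstance(value, bytes):  # under Python 2, bytes is an alias of str
        return value.decode(encoding)
    return value


def to_utf8(value):
    """Return value as UTF-8 encoded bytes, whichever form it arrived in."""
    return to_text(value).encode("utf-8")


from_json = u"\u3042"          # what loading alarms.json yields
from_entry = b"\xe3\x81\x82"   # UTF-8 bytes, as a GTK entry would return

# Both inputs end up as the same UTF-8 byte string.
normalized = [to_utf8(from_json), to_utf8(from_entry)]
```

With such a helper, the value handed to the escaping function has the same type regardless of where the name came from.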

Comment 4 Paolo Borelli 2012-09-09 15:03:06 UTC

Review of attachment 223848:

Sorry, but I still do not understand how this happens (I cannot test right now). That's why I was asking if your system is using a different encoding.

If we need to do encoding conversion, that should happen in the AlarmStorage class when loading/saving the json file, but according to http://docs.python.org/library/json.html they should already be utf8 by default...
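
For reference, the on-disk form can be checked with the standard json module alone. The sketch below (the alarm dictionary is illustrative, modeled on the sample data) shows that json.dumps() escapes non-ASCII characters by default, that ensure_ascii=False keeps the character itself, and that json.loads() gives back the same text object in both cases.

```python
import json

alarm = {"name": u"\u3042", "hour": "21", "minute": "56"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(alarm)

# With ensure_ascii=False the character itself is kept in the output.
literal = json.dumps(alarm, ensure_ascii=False)

# Either serialization loads back to the same text (unicode) object.
name_from_escaped = json.loads(escaped)["name"]
name_from_literal = json.loads(literal)["name"]
```

So the choice of serialization does not change what the loader returns: the name comes back as a text object either way.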

Comment 5 Jiro Matsuzawa 2012-09-10 11:31:41 UTC

(In reply to comment #4)
> Review of attachment 223848:
> 
> Sorry, but I still do not understand how this happens (I cannot test right
> now). That's why I was asking if your system is using a different encoding.
> 

I'm sorry for the late reply.
My system is Ubuntu 12.10 (Quantal Quetzal), which uses UTF-8 as the default encoding.
Here is information about my system:
----------------
$ uname -a
Linux nurigabe 3.5.0-14-generic #15-Ubuntu SMP Thu Sep 6 22:57:58 UTC 2012 i686 i686 i686 GNU/Linux
$ cat /etc/issue
Ubuntu quantal (development branch) \n \l

$ locale
LANG=ja_JP.UTF-8
LANGUAGE=ja:en
LC_CTYPE="ja_JP.UTF-8"
LC_NUMERIC="ja_JP.UTF-8"
LC_TIME="ja_JP.UTF-8"
LC_COLLATE="ja_JP.UTF-8"
LC_MONETARY="ja_JP.UTF-8"
LC_MESSAGES="ja_JP.UTF-8"
LC_PAPER="ja_JP.UTF-8"
LC_NAME="ja_JP.UTF-8"
LC_ADDRESS="ja_JP.UTF-8"
LC_TELEPHONE="ja_JP.UTF-8"
LC_MEASUREMENT="ja_JP.UTF-8"
LC_IDENTIFICATION="ja_JP.UTF-8"
LC_ALL=
$ python --version
Python 2.7.3
----------------


> If we need to do encoding conversion, that should happen in the AlarmStorage
> class when loading/saving the json file, but according to
> http://docs.python.org/library/json.html they should already be utf8 by
> default...

It doesn't matter what encoding json uses. And I don't convert between encodings. I just encode to and decode from UTF-8.

As the traceback above shows, the direct cause of the crash is that a unicode string, which is NOT encoded as a UTF-8 string, is passed to GLib.markup_escape_text(). GLib.markup_escape_text() requires UTF-8 strings (NOT unicode strings), but json.load() returns unicode strings. The json reference says that json.load() returns strings "simply decoded to a unicode object" [1]. That is why we must encode the unicode string that json.load() returns as UTF-8 before passing it to GLib.markup_escape_text(). (The source of caribou may be helpful. [2])


[1] http://docs.python.org/library/json.html#json.load
[2] http://git.gnome.org/browse/caribou/tree/caribou/antler/keyboard_view.py#n74

Comment 6 Paolo Borelli 2012-09-10 17:16:32 UTC

Thank you for the clarification!

I committed the following patch, which should fix the problem:

http://git.gnome.org/browse/gnome-clocks/commit/?id=fd612ef60b528c875693c7aac906df082eaa7f4a