GNOME Bugzilla – Bug 144322
Search doesn't handle non-ascii chars correctly
Last modified: 2009-02-14 13:43:51 UTC
From the Debian BTS, a bug that probably has a big impact on the program's usability: http://bugs.debian.org/254339

Gtranslator's search function doesn't work well with non-ASCII text. When non-ASCII letters are present in a message in which the search pattern is found, gtranslator highlights the wrong passage of text.

For example, when searching for "des", it would highlight "es " in the following message:

    Ein paar mögliche Befehle (...) des Bildschirms
                                     ^^^

In this message, when searching for "naho", it highlights "oře.":

    Zobrazit vybrané příkazy nahoře...
                                ^^^^

In short, the selection shifts to the right by one position for each non-ASCII letter that precedes the match in the message. Looks like a classical bytes vs. characters problem to me.
Yes, the regexp code in find.c is not yet UTF-8-aware. It still thinks the world is flat, and that chars == bytes! I'm keeping an eye on the following bug in anticipation of a decent UTF-8-aware regexp function to use: http://bugzilla.gnome.org/show_bug.cgi?id=50075
While having a UTF-8 regexp engine would be nice, something else is likely also wrong here. Search should work just fine as long as the pattern being searched for is ASCII. I.e., the bytes vs. chars problem is to be fixed independently of the regexp problem. If you need a UTF-8 regexp engine now, grab Gnumeric's.
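To illustrate the bytes vs. chars point, here is a minimal sketch in plain C. It is not gtranslator's actual code: the names utf8_char_offset and find_offsets are made up for this example, and it assumes the message is valid UTF-8 and that the highlighting code expects character offsets. A byte-oriented search (strstr here, standing in for the regexp match) returns a byte offset, which is then converted to a character offset by skipping UTF-8 continuation bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the UTF-8 characters contained in the first
 * `byte_offset` bytes of `text`. Continuation bytes have the bit pattern
 * 10xxxxxx and do not start a new character, so they are not counted. */
static size_t utf8_char_offset(const char *text, size_t byte_offset)
{
    size_t chars = 0;

    for (size_t i = 0; i < byte_offset && text[i] != '\0'; i++) {
        if (((unsigned char)text[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}

/* Hypothetical helper: find `needle` in `haystack` with a byte-oriented
 * search and report both the byte offset and the character offset of the
 * match. Returns 1 if the needle was found, 0 otherwise. */
static int find_offsets(const char *haystack, const char *needle,
                        size_t *byte_off, size_t *char_off)
{
    const char *hit = strstr(haystack, needle);

    if (hit == NULL)
        return 0;

    *byte_off = (size_t)(hit - haystack);
    *char_off = utf8_char_offset(haystack, *byte_off);
    return 1;
}
```

For the German message in the report, "des" sits at byte offset 33 but character offset 32, because "ö" takes two bytes in UTF-8. Using the byte offset as a character offset gives exactly the one-position shift described. In a GLib-based program, g_utf8_pointer_to_offset() should do the same conversion.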
Already fixed? I don't see any problem searching with non-ASCII (Turkish) characters.
I think this didn't work in the 1.1.x series because gettext was not being used. It is working fine in the trunk version.