GNOME Bugzilla – Bug 328162
Diacritics in topic keywords
Last modified: 2008-08-14 22:56:08 UTC
The problem here is with French, but I suppose other languages are affected too. My use case: I defined a bookmark topic called "Vidéo" (I think you can guess the translation). When I want to access this topic in the location bar, I type "vidéo" or "Vidéo" and I get the list of related bookmarks, fine. But I'm sometimes lazy and type "video" without the accent, and then the bookmarks no longer appear. The comparison should be done without taking diacritics into account. After some testing I found that the problem is even more subtle. If I type "vid", my bookmarks are listed. If the next letter I type is "é", the bookmarks stay; if I type an "a", for example, they disappear (no problem); but if I type an "e" they stay too (great). The problem is what happens when I finish the word: with "éo" the bookmarks are still listed, and with "eo" they disappear. This is not consistent. Maybe a weird effect of UTF-8 encoding "é" on two bytes...
Yes, this is a side effect of the UTF-8 representation. We should definitely do a more sophisticated search.
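To make the encoding remark concrete, here is a small Python sketch (not Epiphany's code) showing that a precomposed "é" (U+00E9) takes two bytes in UTF-8 while "e" takes one, so character offsets and byte offsets diverge after the accented letter. Whether this mismatch is exactly what went wrong in the completion code is an assumption; the thread does not show the implementation.

```python
# Illustration only: character counts vs. UTF-8 byte counts.
topic = "vid\u00e9o"   # "vidéo" with a precomposed é (U+00E9)
query = "video"

topic_bytes = topic.encode("utf-8")
query_bytes = query.encode("utf-8")

# Same number of characters, different number of bytes.
char_lengths = (len(topic), len(query))
byte_lengths = (len(topic_bytes), len(query_bytes))

# é alone is encoded as the two bytes 0xC3 0xA9.
e_acute_bytes = "\u00e9".encode("utf-8")

print(char_lengths, byte_lengths, e_acute_bytes)
```

Any comparison that mixes a length measured in characters with one measured in bytes can therefore agree up to the accented letter and disagree one character later.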
I can't reproduce this anymore, and there's even a bug (#343906) complaining about the solution to this bug (making no distinction between, for example, é and e). I'm closing this.
Bug 343906 is only about the location entry though, while this also applies to the completion in the bookmark properties dialogue.
The bug still exists for me in 2.16.1 from Ubuntu.
Does this bug need to be reopened?
Oops, I forgot to respond to this one. Yes, it's still present, and in my opinion it should be reopened.
Created attachment 96265: Works for me. This works for me: if the topic is named "eeeé" and I search for "eeee", my bookmark is shown.
Is that é just a U+00E9 character, or the sequence U+0065 U+0301?
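The distinction matters because the two forms render identically but compare as different strings. A short Python sketch using the standard `unicodedata` module (the variable names are illustrative only):

```python
import unicodedata

precomposed = "\u00e9"    # é as the single code point U+00E9
decomposed = "e\u0301"    # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

# A plain string comparison treats the two spellings as different.
equal_raw = precomposed == decomposed

# After normalizing both to the same form, they compare equal.
equal_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
equal_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

print(equal_raw, equal_nfc, equal_nfd)
```

So a test that passes with one spelling of "é" may say nothing about the other unless both sides are normalized first.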
I don't have an SVN build to check whether the bug is still present, but it was still there in 2.18.1. Your use case always worked: "video" matched "vidéo" until the "o" was typed. The correct use case would be to check whether the search "eeeee" matches the topic "eeeée".
I do not think that current Western keyboard layouts produce a base character followed by a combining diacritic; they produce precomposed characters instead. I think the big question is whether to decompose the names and tags of bookmarks, as well as the search terms, before applying a search. This is a general issue, and it is possible that it has already been addressed elsewhere, such as in Tracker or Beagle. It would be good to aim for a consistent solution.
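As a rough sketch of the decompose-before-search idea, the following Python normalizes both the topic and the query to NFD, drops combining marks, and case-folds before doing a prefix match. This is not the fix that was eventually applied; the function names `strip_diacritics`, `fold` and `topic_matches` are hypothetical, and prefix matching is an assumption about how completion is meant to behave. (In C, GLib's `g_utf8_normalize()` and `g_utf8_casefold()` cover the normalization and case-folding steps.)

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose to NFD and drop combining marks (accents and the like)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold(text: str) -> str:
    """Produce a comparison key that ignores diacritics and letter case."""
    return strip_diacritics(text).casefold()


def topic_matches(query: str, topic: str) -> bool:
    """Hypothetical completion check: does the topic start with the query?"""
    return fold(topic).startswith(fold(query))


# The cases discussed in this report.
for query, topic in [("vide", "Vidéo"), ("video", "Vidéo"),
                     ("vida", "Vidéo"), ("eeeee", "eeeée")]:
    print(query, topic, topic_matches(query, topic))
```

Because both sides go through the same folding, it no longer matters whether the "é" was typed as a precomposed character or as a base letter plus a combining accent. The cost is that "é" and "e" become indistinguishable, which is the behaviour bug 343906 objects to for the location entry.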
Fixed, part of #517960.