GNOME Bugzilla – Bug 206386
Searches containing 8-bit characters broken
Last modified: 2013-09-10 13:54:57 UTC
If I search for a string containing 8-bit characters in the search bar above the message list, it looks like it disregards the 8-bit chars. Searching on "él" turns up messages from Ismael, so it looks like it treats é as e.
This is intentional. We currently decompose (is that the right word?) 8-bit chars so that the match is a little more flexible. I guess if this isn't the desired effect, we could change it. Now, if you search for "él" and messages that it should match don't come up, then that would be a bug. I'm not gonna close this bug, though, since it's unclear whether we should decompose text: I feel it's a nice thing, but if it's not the expected/desired effect, perhaps we shouldn't? I dunno. dawn? ettore? What do you guys think?
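For illustration, the decomposing behaviour described above can be sketched roughly as follows. This is not Evolution's actual code; the function names are hypothetical, and it assumes "decompose" means Unicode canonical decomposition (NFD) followed by dropping the combining marks.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def fuzzy_contains(haystack: str, needle: str) -> bool:
    """Substring match after stripping diacritics from both sides."""
    return strip_diacritics(needle) in strip_diacritics(haystack)

# The reported behaviour: "él" is reduced to "el", which occurs in "Ismael".
print(fuzzy_contains("Ismael", "él"))
# An exact comparison does not match, which is what the reporter expected.
print("él" in "Ismael")
```

With this approach a search for "öö" would also match "oo", which is the trade-off under discussion.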
This was done intentionally, as you said. If we were going to change it, I'd suggest asking on the list (and maybe gnome-i18n-list) for opinions.
I agree with Kjartan, this is not the desired effect. For example, if I search for öö I don't want to find oo. Most non-English languages have diacritic characters and we don't want to search them without diacritics.
I'm not sure if this is a 1.0 bug. Searches are approximate at best anyway, with most people's spelling abilities.
Yeah. Blame it on the users, that works :)
Naah, I'm just saying that you need to fuzzify any search string you use when searching. It also reduces the size of the indexes we generate for body-indexed folders. Having said all that, not decomposing is an easy fix, but I don't think we have time to decide what's best for 1.0, so I'm marking this 1.1 (if someone strongly disagrees, of course, that could change).
Because of the decision to remap 1.1->1.2 and 1.2->1.4, I'm going to be moving a large number of bugs around in the bugzilla. You can just search on 'body contains' 'Because of the decision to remap' and mark all as read. Please direct all questions about this change to evolution@ximian.com, not the bug. Luis
making it part of indexing 'rewrite'
This should be fixed, but I'm not sure. I don't think it'll handle case properly, for example ... Can anyone confirm? You need HEAD.
It seems to work for me. I searched for "påske" which got me two matches - namely "påsken" and "Påskeferie". On the other hand, if I search for "PÅSKE" it finds no matches.
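One possible reading of the "PÅSKE" result above is that case folding is only applied to ASCII letters; the thread does not confirm this. As a sketch of what case-insensitive matching that still respects diacritics could look like (hypothetical function names, not Evolution's code):

```python
import unicodedata

def fold(text: str) -> str:
    """Hypothetical helper: Unicode case folding, then NFC so that
    precomposed and decomposed forms of the same letter compare equal."""
    return unicodedata.normalize("NFC", text.casefold())

def contains_ignoring_case(haystack: str, needle: str) -> bool:
    """Substring match that ignores case but keeps diacritics significant."""
    return fold(needle) in fold(haystack)

# Upper-case query against the two subjects mentioned in the comment above.
print(contains_ignoring_case("påsken", "PÅSKE"))
print(contains_ignoring_case("Påskeferie", "PÅSKE"))
# Diacritics still matter: "paske" is not treated as "påske".
print(contains_ignoring_case("paske", "PÅSKE"))
```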
Cool. Can we close this?
I'd say it's fixed ...