Bug 423274 - find/search doesn't normalize
Status: RESOLVED WONTFIX
Product: gtranslator
Classification: Other
Component: general
Version: HEAD
Hardware: Other Linux
Importance: Normal normal
Target Milestone: 1.1.8
Assigned To: Pablo Sanxiao
QA Contact: Ross Golder
Depends on:
Blocks: 423036
 
 
Reported: 2007-03-27 08:56 UTC by Denis Jacquerye
Modified: 2009-02-14 13:41 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
fr.po from gtk+, all strings are NFD (114.81 KB, text/plain)
2007-05-04 20:20 UTC, Denis Jacquerye
This patch normalize the find function (1.35 KB, text/x-patch)
2007-05-28 16:49 UTC, Pablo Sanxiao
find and replace normalize (2.20 KB, patch)
2007-06-05 09:19 UTC, Pablo Sanxiao
This is a bit patch to trunk so that find/search is normalized. (422 bytes, patch)
2007-08-24 10:07 UTC, Pablo Sanxiao

Description Denis Jacquerye 2007-03-27 08:56:57 UTC
When searching for text in a file, precomposed characters are not
matched by their canonically equivalent decomposed sequences in Unicode.

Example: 
- string has "école" that's with <U+00E9 LATIN SMALL LETTER E WITH ACUTE>.
- search for "école" with <U+0065 LATIN SMALL LETTER E;U+0301 COMBINING ACUTE ACCENT>

The first string does not match the search but it should.

g_utf8_normalize() should be used before comparing strings.
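
The idea can be illustrated with a minimal sketch. It uses Python's standard `unicodedata.normalize()` as a stand-in for GLib's `g_utf8_normalize()`; the helper name `normalized_contains` is hypothetical and is not taken from gtranslator.

```python
import unicodedata

def normalized_contains(haystack: str, needle: str, form: str = "NFC") -> bool:
    """Return True if needle occurs in haystack once both are normalized.

    Hypothetical helper for illustration; 'form' is any form accepted by
    unicodedata.normalize ("NFC", "NFD", "NFKC", "NFKD").
    """
    return unicodedata.normalize(form, needle) in unicodedata.normalize(form, haystack)

precomposed = "\u00e9cole"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301cole"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

# A raw comparison treats the two spellings as different strings.
print(decomposed in precomposed)

# Normalizing both sides first makes the search succeed in either direction.
print(normalized_contains(precomposed, decomposed))
print(normalized_contains(decomposed, precomposed))
```

Either normalization form works for the comparison, as long as the same form is applied to both the search term and the text being searched.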
Comment 1 Denis Jacquerye 2007-03-27 08:57:17 UTC
blocks meta bug 423036
Comment 2 Pablo Sanxiao 2007-05-04 16:01:48 UTC
Hi, I can try to fix this bug, but I don't really know how to test it.
Could you give an example or a small po file to test with?

Comment 3 Denis Jacquerye 2007-05-04 20:20:41 UTC
Created attachment 87570
fr.po from gtk+, all strings are NFD 

Here's a copy of fr.po from gtk+, all strings are NFD.

A simple example is a search for "é" <U+00E9 LATIN SMALL LETTER E WITH ACUTE>, which is not present in the file since its NFD form is "é" <U+0065 LATIN SMALL LETTER E;U+0301 COMBINING ACUTE ACCENT>.

Searching for one should match the other, both ways.

Also, you'll notice there's a bug in the highlighting if you search for "é" <U+0065 LATIN SMALL LETTER E;U+0301 COMBINING ACUTE ACCENT>. For some reason the combining characters are incorrectly counted and highlighted matches are off. Should that be another bug report?

Something else to consider: should all input be converted to NFC, i.e. should all files written by gtranslator be in NFC (the normalization form recommended by the W3C)?
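
The thread does not identify the cause of the highlighting problem. One way offsets can go wrong is when a match is located in a normalized copy of the text and the resulting positions are applied to the original, whose length may differ. The sketch below is an assumption-laden illustration in Python, not gtranslator's code: the hypothetical `find_span` compares cluster by cluster (a starter plus its combining marks) and returns code-point offsets into the original text. It assumes per-cluster NFC is sufficient, which does not hold for every script (Hangul jamo, for example).

```python
import unicodedata

def find_span(text: str, needle: str):
    """Find needle in text under canonical equivalence.

    Returns (start, end) code-point offsets into the ORIGINAL text,
    or None if there is no match. Hypothetical helper for illustration.
    """
    # Split the text into clusters: a starter followed by its combining marks.
    clusters = []  # list of (start_index, cluster_string)
    for i, ch in enumerate(text):
        if clusters and unicodedata.combining(ch):
            start, s = clusters[-1]
            clusters[-1] = (start, s + ch)
        else:
            clusters.append((i, ch))

    norm = [unicodedata.normalize("NFC", s) for _, s in clusters]
    target = unicodedata.normalize("NFC", needle)

    # Try each cluster boundary as a possible match start.
    for a in range(len(norm)):
        acc = ""
        for b in range(a, len(norm)):
            acc += norm[b]
            if acc == target:
                start = clusters[a][0]
                end = clusters[b][0] + len(clusters[b][1])
                return (start, end)
            if not target.startswith(acc):
                break
    return None

text = "une e\u0301cole"                 # decomposed é in the file
span = find_span(text, "\u00e9cole")     # precomposed é in the search term
print(span)                              # (4, 10)
print(text[span[0]:span[1]] == "e\u0301cole")
```

The returned span covers six code points in the original text even though the search term has five, which is the kind of difference a highlighter has to account for.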
Comment 4 Pablo Sanxiao 2007-05-11 17:12:38 UTC
I agree with you, the highlighting doesn't work correctly.
If you want, you can report it as a new bug and I will try to fix both.
Comment 5 Pablo Sanxiao 2007-05-28 16:49:04 UTC
Created attachment 88948
This patch normalize the find function

I think the find is now normalized, but the highlighting is not fixed yet. Could you test this patch? If it works well, I will then try to repair the highlighting.
Comment 6 Denis Jacquerye 2007-05-28 17:46:40 UTC
(In reply to comment #5)
> Created an attachment (id=88948)
> This patch normalize the find function
> 
> I think the find is now normalized, but the highlighting is not fixed yet.
> Could you test this patch? If it works well, I will then try to repair the
> highlighting.
> 
I'm not able to apply the patch to my copy of SVN.
Is parse.c different?
Comment 7 Pablo Sanxiao 2007-05-29 07:30:00 UTC
(In reply to comment #6)

> I'm not able to apply the patch to my copy of SVN.
> Is parse.c different?
> 

This patch was generated against version 1_1_7 in SVN.
I only modified the lines that appear in the patch.
Comment 8 Denis Jacquerye 2007-05-29 08:03:00 UTC
OK. The search works for equivalent strings with the patch. But the patch uses G_NORMALIZE_DEFAULT which is NFD, so everything is decomposed. This also affects the saved files.
This means all the current po files with precomposed characters will be saved with decomposed characters after the patch.

I'd suggest using G_NORMALIZE_DEFAULT_COMPOSE (NFC) instead. This avoids a major change to existing files and lets legacy software that doesn't support Unicode equivalence, like the current version, keep working with the saved files.
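
The difference between the two modes can be seen in a short sketch. It uses Python's `unicodedata` with the form names "NFD" and "NFC", which correspond to GLib's G_NORMALIZE_DEFAULT and G_NORMALIZE_DEFAULT_COMPOSE respectively; the sample string is only an illustration.

```python
import unicodedata

original = "\u00e9cole"  # precomposed é, as found in many existing po files

nfd = unicodedata.normalize("NFD", original)  # decomposes é into e + U+0301
nfc = unicodedata.normalize("NFC", nfd)       # recomposes it into U+00E9

print(len(original), len(nfd), len(nfc))      # 5 6 5
print(nfd == original)                        # False: NFD changes the stored text
print(nfc == original)                        # True: NFC leaves precomposed text as-is
```

If the normalized string is what ends up being written to disk, NFD rewrites every precomposed character in the file, while NFC leaves already-precomposed text unchanged.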
Comment 9 Pablo Sanxiao 2007-06-05 09:19:32 UTC
Created attachment 89396
find and replace normalize

I changed the normalization as Denis Jacquerye suggested, and I added normalization to the replace function.
Comment 10 Denis Jacquerye 2007-06-05 13:21:18 UTC
Thank you Pablo.

This works fine.
The only remaining bug is wrong match highlighting.
Comment 11 Ignacio Casal Quinteiro (nacho) 2007-07-01 20:18:17 UTC
Applied the patch to dialogs.c in SVN, but I can't apply it to parse.c because there is no append_line function.
Comment 12 Pablo Sanxiao 2007-07-02 08:30:36 UTC
(In reply to comment #11)
> Applied the patch to dialogs.c in SVN, but I can't apply it to parse.c
> because there is no append_line function.
> 

This patch was made against the stable version (1.1.7).
Comment 13 Pablo Sanxiao 2007-08-24 10:07:23 UTC
Created attachment 94241
This is a bit patch to trunk so that find/search is normalized.

I changed the normalization mode passed to g_utf8_normalize() relative to the stable version, because gettext's functions use G_NORMALIZE_DEFAULT instead of G_NORMALIZE_DEFAULT_COMPOSE.
Comment 14 Ignacio Casal Quinteiro (nacho) 2007-08-25 18:50:50 UTC
Applied.
Comment 15 Pablo Sanxiao 2007-11-15 20:46:18 UTC
Applied patch in branch gtranslator_1_1_8.
I'm keeping this bug open because the highlighting doesn't work yet.
Comment 16 Pablo Sanxiao 2009-02-14 13:41:21 UTC
Version 1.1.8 is no longer maintained. This is fixed in trunk.