Bug 618971 - Search for similar (compatible) characters of unicode symbols
Status: RESOLVED DUPLICATE of bug 522782
Product: evince
Classification: Core
Component: general
Version: 2.30.x
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Evince Maintainers
QA Contact: Evince Maintainers
Duplicates: 681315
Depends on:
Blocks:
 
 
Reported: 2010-05-18 11:25 UTC by Arian@sanusi.de
Modified: 2013-06-15 06:02 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Arian@sanusi.de 2010-05-18 11:25:42 UTC
Evince and other PDF readers are unable to find words containing the "combined characters" that TeX uses to generate umlauts like ä, for example in this document: http://www.math.ethz.ch/education/bachelor/lectures/fs2010/math/nm/skript.pdf

I don't know the details, but maybe it is possible to detect these characters during a search and let them match their base characters, e.g. so that "Ausgleichslösung" on p. 149 can be found by searching for "Ausgleichslosung".

Of course, this would only be a workaround for people who understand the background. But maybe it is possible to transform the search term to base characters whenever the search encounters a combined character (i.e. when searching for "Ausgleichslösung", also search for "Ausgleichslosung" once a combined character is seen). This would lead to false positives in rare circumstances, which is imo acceptable.
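To illustrate the idea, here is a minimal Python sketch, not Evince code. It assumes the text extracted from the PDF represents the umlaut as a base letter followed by a Unicode combining mark, which may not be what a TeX-generated PDF actually contains. The function names and the sample sentence are hypothetical.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Reduce each character to its base form by dropping combining marks."""
    # NFD splits precomposed letters such as "ö" into "o" + U+0308.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns 0 for base characters and non-zero for marks.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", base)


def contains_loose(haystack: str, needle: str) -> bool:
    """Compare both sides on base characters only."""
    return strip_marks(needle) in strip_marks(haystack)


# Hypothetical extracted text: "o" followed by a combining diaeresis.
page = "Die Ausgleichslo\u0308sung minimiert den Fehler."

print(contains_loose(page, "Ausgleichslosung"))
print(contains_loose(page, "Ausgleichsl\u00f6sung"))
```

Because both the page text and the search term are reduced the same way, either spelling of the term matches, at the cost of the occasional false positive mentioned above.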

Sorry if the description is somewhat imprecise; I am not a developer and do not know how these characters are implemented.
Comment 1 Germán Poo-Caamaño 2013-02-28 02:53:31 UTC
*** Bug 681315 has been marked as a duplicate of this bug. ***
Comment 2 Germán Poo-Caamaño 2013-02-28 02:54:13 UTC
From the report in https://bugzilla.gnome.org/show_bug.cgi?id=681315:

"When a PDF document contains long s (U+017f), long s-t (U+FB05) or german
double s (U+00DF), searching for "s", "st" or "ss" respectively does not match.
Please can you add these to the possible matches?"
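A rough Python sketch of how these three cases could be folded before comparison, not a description of what Evince or Poppler does. It assumes that Unicode compatibility decomposition (NFKD) followed by full case folding is an acceptable matching rule; `fold_for_search` is a hypothetical name.

```python
import unicodedata


def fold_for_search(text: str) -> str:
    """Map compatibility characters and case variants to a plain form."""
    # NFKD replaces long s (U+017F) with "s" and the long s-t ligature
    # (U+FB05) with "st"; it leaves U+00DF untouched.
    compat = unicodedata.normalize("NFKD", text)
    # Full case folding maps U+00DF to "ss" (and lowercases everything).
    return compat.casefold()


def matches(haystack: str, needle: str) -> bool:
    """Substring test on the folded forms of both strings."""
    return fold_for_search(needle) in fold_for_search(haystack)


print(fold_for_search("Wa\u017f\u017fer"))
print(fold_for_search("Stra\u00dfe"))
print(matches("Fen\ufb05er", "st"))
```

Two side effects are worth noting: the comparison becomes case-insensitive, and the folded string can be longer than the original, so match offsets would have to be mapped back to the original text before highlighting.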
Comment 3 Germán Poo-Caamaño 2013-02-28 02:57:48 UTC
I am not sure whether there is an option in Pango for this kind of search. Otherwise, it might be worth taking a look at:

http://www.unicode.org/reports/tr36/confusables.txt
Comment 4 Germán Poo-Caamaño 2013-06-15 06:02:18 UTC

*** This bug has been marked as a duplicate of bug 522782 ***