GNOME Bugzilla – Bug 618971
Search for similar (compatible) characters of unicode symbols
Last modified: 2013-06-15 06:02:18 UTC
Evince and other PDF readers are unable to search for the "combined characters" that TeX uses to generate umlauts such as ä, for example in this document: http://www.math.ethz.ch/education/bachelor/lectures/fs2010/math/nm/skript.pdf I don't know the details, but maybe it is possible to detect these characters during searching and allow them to be found by their base characters, e.g. allowing "Ausgleichslösung" on p. 149 to be found by searching for "Ausgleichslosung". Of course, this would only be a workaround for people who understand the background. But maybe it is possible to transform the search term to base characters when the search stumbles upon a combined character (when searching for "Ausgleichslösung", also search for "Ausgleichslosung" whenever a combined character is encountered). This would lead to false positives in rare circumstances, which is IMO acceptable. Sorry if the description is somewhat imprecise; I am not a developer and do not know how these characters are implemented.
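To illustrate the idea of matching by base characters, here is a minimal Python sketch. It is not how Evince or Poppler search; it only assumes that the text extracted from the PDF contains either precomposed letters (ä, ö, ü) or a base letter followed by a Unicode combining mark. The helper names `strip_marks` and `loose_contains` are hypothetical. If the PDF instead yields a standalone spacing diaeresis next to the letter, this approach alone would not be enough.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Return text with combining marks removed (hypothetical helper).

    NFD splits a precomposed letter such as 'ö' into 'o' + U+0308,
    after which every combining character is dropped.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def loose_contains(needle: str, haystack: str) -> bool:
    """Report whether needle occurs in haystack when marks are ignored."""
    return strip_marks(needle) in strip_marks(haystack)
```

With this, searching for "Ausgleichslosung" or "Ausgleichslösung" would both match a page containing "Ausgleichslösung". As the report already notes, the price is occasional false positives, since words that differ only by an umlaut become indistinguishable.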
*** Bug 681315 has been marked as a duplicate of this bug. ***
From the report in https://bugzilla.gnome.org/show_bug.cgi?id=681315: "When a PDF document contains long s (U+017f), long s-t (U+FB05) or german double s (U+00DF), searching for "s", "st" or "ss" respectively does not match. Please can you add these to the possible matches?"
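For the three characters named in that report, a possible approach is sketched below in Python. This is an assumption about how such matching could be done, not a description of any existing Evince or Pango behaviour. It combines Unicode compatibility normalization (NFKC), which maps the long s and the long s-t ligature to plain letters, with full case folding, which maps ß to "ss". The helper name `fold_compat` is hypothetical.

```python
import unicodedata


def fold_compat(text: str) -> str:
    """Normalize text for loose matching (hypothetical helper).

    NFKC replaces compatibility characters (long s, ligatures) with
    their plain equivalents; casefold() then applies full Unicode case
    folding, which lowercases the text and expands 'ß' to 'ss'.
    """
    return unicodedata.normalize("NFKC", text).casefold()


def compat_contains(needle: str, haystack: str) -> bool:
    """Report whether needle occurs in haystack after folding both."""
    return fold_compat(needle) in fold_compat(haystack)
```

Note that case folding also makes the comparison case-insensitive, which may or may not be wanted depending on the search options the user selected.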
I am not sure whether there is an option in Pango for this kind of search. If not, it might be worth taking a look at: http://www.unicode.org/reports/tr36/confusables.txt
*** This bug has been marked as a duplicate of bug 522782 ***