GNOME Bugzilla – Bug 696030
Handle spaces and hyphenation when search pdf
Last modified: 2018-05-22 15:00:51 UTC
Hi, This is similar to Bug 598759, but I would like to ask for something more complex. First of all, I have no idea what happens in the background. For example, I have no idea how Evince determines that there is some sort of space between two letters or words. I could imagine that PDF files don't contain any information like "there's a space here" and that some (bad) heuristic is at work. Anyhow:

1) Searching for a single word: When searching for a single word, Evince often fails to find it. I don't know why, but copying and pasting the word from the PDF revealed that Evince thinks there are spaces between some of the letters. Searching for the same word with spaces added here and there makes Evince find it. But you can imagine that I don't want to guess the locations of spaces in order to find a word.

2) Searching across lines: Evince could ignore a hyphen at the end of a line if a word has been hyphenated. Of course, it could be a compound word like "in-between" that has been split into "in-" and "between", so just stripping all hyphens at the end of a line won't do.

3) Searching for multiple words: When I enter "the king is dead", I guess Evince searches for exactly that string in the PDF. If the phrase is spread across multiple lines, Evince won't find it. If the PDF reports two spaces between "the" and "king", Evince won't find it either.

Adobe Reader implements all of the above and probably much more.
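The three requests above can be sketched as one tolerant matcher. This is only an illustration in Python, not Evince's or Poppler's actual search code: `build_pattern`, `find_all`, and the `tolerate_inner_spaces` flag are hypothetical names, and the page text is assumed to be available as a plain string with `\n` at line ends. The query is turned into a regular expression that allows an optional "hyphen plus line break" between any two letters of a word (request 2), accepts any run of whitespace, including line breaks, between words (request 3), and can optionally tolerate stray spaces inside a word (request 1).

```python
import re

# Optional hyphen followed by a line break: a word split across two lines.
LINE_BREAK_GAP = r"(?:-?[ \t]*\n[ \t]*)?"
# Looser gap (assumption): also tolerates stray spaces inside a word.
LOOSE_GAP = r"(?:-?[ \t]*\n[ \t]*|[ \t]*)"


def build_pattern(query, tolerate_inner_spaces=False):
    """Compile a case-insensitive pattern for `query`, or None if it is empty."""
    words = query.split()
    if not words:
        return None
    gap = LOOSE_GAP if tolerate_inner_spaces else LINE_BREAK_GAP
    # Inside a word, allow the gap between every pair of characters.
    word_patterns = [gap.join(re.escape(ch) for ch in word) for word in words]
    # Between words, accept any whitespace run (spaces, tabs, line breaks).
    return re.compile(r"\s+".join(word_patterns), re.IGNORECASE)


def find_all(page_text, query, tolerate_inner_spaces=False):
    """Return (start, end) offsets of every match of `query` in `page_text`."""
    pattern = build_pattern(query, tolerate_inner_spaces)
    if pattern is None:
        return []
    return [m.span() for m in pattern.finditer(page_text)]
```

Because the hyphen in the gap is optional and a literal hyphen in the query is still matched literally, both "inbetween" and "in-between" find text that was broken as "in-" / "between". The loose mode is off by default since it can produce false positives (for example, matching across two unrelated short words).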
Hi, 1) is certainly a clear bug that we should fix. Please attach a PDF and steps to reproduce. As for 2) and 3), they seem to me to be the same issue, no? If so, there are other bugs about it, so we should discuss it in those bugs.
*** Bug 686045 has been marked as a duplicate of this bug. ***
Evince actually knows enough to collapse the hyphenation, at least in some parts of the display. I'm looking at http://www.andrew.cmu.edu/user/danupam/dtd-pets15.pdf in Evince 3.16.1. If I click the search (magnifying glass) button and type "big" in the search box, I see three hits in the left-hand pane:

  of bigotry. Given the pervasive…       14
  President, "Big data: Seizing o…       16
  docs/big_data_privacy_report…          16

But if I add an "o" to the end of my search term (making it "bigo"), Evince removes all hits and says "Not found". It's weird to see the word shown correctly in the left-hand pane and then watch it disappear as I make the filter more specific. This would be a useful improvement!
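One possible explanation for the behaviour above, sketched in Python under stated assumptions: the matcher runs on the raw page text, where the word is split at a line end, while the result snippet is built from de-hyphenated text. The sample string `raw` and the helpers `dehyphenate` and `naive_hit` are hypothetical; nothing here is taken from Evince's source.

```python
import re

# Hypothetical raw page text: "bigotry" hyphenated across a line break.
raw = "a history of big-\notry. Given the pervasive"


def dehyphenate(text):
    """Join a word split as 'xxx-' + newline + 'yyy' (naive: always drops the hyphen)."""
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


def naive_hit(text, query):
    """Plain case-insensitive substring search, with no line-break handling."""
    return query.lower() in text.lower()


snippet_text = dehyphenate(raw)       # what a result pane might display
found_big = naive_hit(raw, "big")     # "big" sits entirely before the hyphen
found_bigo = naive_hit(raw, "bigo")   # "bigo" would have to span the line break
```

Under these assumptions, "big" matches the raw text while "bigo" does not, even though the de-hyphenated snippet contains "bigotry"; searching the de-hyphenated text instead would find both.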
*** Bug 750579 has been marked as a duplicate of this bug. ***
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/evince/issues/333.