Bug 710518 - Search does not find all results in specific PDF
Status: RESOLVED NOTGNOME
Product: evince
Classification: Core
Component: PDF
Version: 3.8.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Evince Maintainers
QA Contact: Evince Maintainers
Depends on:
Blocks:
 
 
Reported: 2013-10-19 17:05 UTC by André Klapper
Modified: 2013-11-03 01:05 UTC
See Also:
GNOME target: ---
GNOME version: 3.7/3.8



Description André Klapper 2013-10-19 17:05:23 UTC
evince-3.8.3-2.fc19.i686
poppler-0.22.1-4.fc19.i686

1. Download http://darwin.bth.rwth-aachen.de/opus3/volltexte/2010/3412/pdf/3412.pdf
2. Search for "bug"
3. Get only one result, on page 128 (143 of 209)

Note that page vii (7 of 209) contains the word BugzillaMetrics (in italics), ten lines above the last line of the page.

Would have expected the search to find it.
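
The reproduction steps above can also be checked outside the viewer. The following is a minimal sketch, assuming the `pdftotext` tool from poppler-utils is installed and the PDF has been saved locally as `3412.pdf` (a hypothetical local file name). It is only a rough proxy for Evince's search, since it is not necessarily the same code path; the helper names are made up for illustration.

```python
import subprocess


def pdftotext_command(pdf_path, page):
    """Build a pdftotext call that dumps a single page to stdout.

    -f and -l select the first and last page to convert;
    "-" as the output file name means standard output.
    """
    return ["pdftotext", "-f", str(page), "-l", str(page), pdf_path, "-"]


def text_contains(text, needle):
    """Case-insensitive substring test, similar in spirit to a viewer search."""
    return needle.casefold() in text.casefold()


def page_contains(pdf_path, page, needle):
    """Extract one page with pdftotext and look for the needle in it."""
    result = subprocess.run(
        pdftotext_command(pdf_path, page),
        capture_output=True,
        text=True,
        check=True,
    )
    return text_contains(result.stdout, needle)
```

Calling `page_contains("3412.pdf", 7, "bug")` would then indicate whether the extracted text of page 7 of 209 contains the search term at all.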
Comment 1 Carlos Garcia Campos 2013-10-19 17:29:25 UTC
There seems to be a problem with the text in that page. Try selecting BugzillaMetrics and pasting it in gedit (or somewhere else) and see the result. It could be that the glyph to utf8 map is wrong in the pdf. Could someone try to copy and paste that text with acroread?
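
To illustrate why a wrong glyph-to-Unicode map would break search even though the page renders correctly, here is a toy sketch. The glyph names and both tables are hypothetical and not taken from the PDF or from poppler; the point is only that search operates on the extracted text, not on the glyph shapes drawn on screen.

```python
# Hypothetical glyph names as they might appear in an embedded font.
PAGE_GLYPHS = ["g1", "g2", "g3"]

# A correct mapping turns the glyphs into the letters a reader sees.
CORRECT_MAP = {"g1": "B", "g2": "u", "g3": "g"}

# A wrong mapping turns the same glyphs into unrelated symbols
# (dingbat code points are used here purely as an example).
BROKEN_MAP = {"g1": "\u2701", "g2": "\u2702", "g3": "\u2703"}


def extract_text(glyph_names, glyph_to_unicode):
    """Map glyph names to text; unknown glyphs become U+FFFD."""
    return "".join(glyph_to_unicode.get(name, "\ufffd") for name in glyph_names)


def search(text, needle):
    """Case-insensitive substring search over extracted text."""
    return needle.casefold() in text.casefold()


good_text = extract_text(PAGE_GLYPHS, CORRECT_MAP)
bad_text = extract_text(PAGE_GLYPHS, BROKEN_MAP)

print(search(good_text, "bug"))  # True
print(search(bad_text, "bug"))   # False
```

The same mechanism would also explain gibberish on copy and paste, since the clipboard receives the extracted text as well.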
Comment 2 André Klapper 2013-10-20 00:20:11 UTC
True, I get gibberish when copying from Evince to gedit.

When copying the line from Firefox's pdf.js to gedit, I get
    stoÿ für das Op en Source Pro jekt
    Bugzil laMetrics
    gegeb en hat. Eine b esondere
instead of
    stoß für das Open Source Projekt BugzillaMetrics gegeben hat. Eine besondere

No acroread around.
Comment 3 Jason Crain 2013-10-22 05:31:45 UTC
acroread copy and paste shows this:
    sto
    ÿ für das Open Source Projekt BugzillaMetrics gegeben hat. Eine besondere

Firefox shows this:
    stoÿ für das Op en Source Pro jekt
    Bugzil laMetrics
    gegeb en hat. Eine b esondere

Foxit, Chrome, and Evince all show a bunch of wingdings characters.  This might be a regression because the version on Debian Wheezy (Evince 3.4.0-3.1 / poppler 0.18.4-6) shows the correct text.
Comment 4 Jason Crain 2013-10-22 06:02:17 UTC
Managed to find this poppler commit through git bisect:

126bf08105e319f9216654782e5a63f99f1d1825 is the first bad commit
commit 126bf08105e319f9216654782e5a63f99f1d1825
Author: Albert Astals Cid <aacid@kde.org>
Date:   Sun Feb 19 23:18:25 2012 +0100

    Update glyph names to Unicode values mapping
    
    Added Zapf Dingbat names and fixed copyrightsans, copyrightserif, registersans, registerserif, trademarksans, trademarkserif
    Kudos to Adrian Johnson for find what was missing :-)
    Bug #13131

:040000 040000 7cbaf40a93f8e49c086f80ad7d1054a10ae8a060 c4e3427944390237da8ec2c1e53e545ac815d667 M      poppler

I think it's poppler bug 60243 <https://bugs.freedesktop.org/show_bug.cgi?id=60243>.  The patch I posted there fixes this problem.
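
For readers unfamiliar with how a bisect arrives at a "first bad commit", the idea is a binary search between a known-good and a known-bad revision. The sketch below is a simplification that assumes a linear history; real `git bisect` works on the commit graph. The function name and the `is_bad` callback are hypothetical.

```python
def first_bad_commit(commits, is_bad):
    """Return the first commit for which is_bad(commit) is true.

    Assumes commits are ordered oldest to newest, commits[0] is known
    good, commits[-1] is known bad, and every commit after the first
    bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return commits[bad]
```

Here `is_bad` would stand for a manual or scripted check, for example building that revision and testing whether the text copied from the page is still correct.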
Comment 5 Germán Poo-Caamaño 2013-11-03 01:05:29 UTC
I'm closing this as NOTGNOME, then.