Bug 583809 - allow only showing OCR'ed text layer instead of the image background layer
Status: RESOLVED OBSOLETE
Product: evince
Classification: Core
Component: PDF
Version: 2.26.x
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Evince Maintainers
QA Contact: Evince Maintainers
Depends on:
Blocks:
 
 
Reported: 2009-05-25 14:50 UTC by Jean-François Fortin Tam
Modified: 2018-05-22 13:33 UTC
See Also:
GNOME target: ---
GNOME version: 2.25/2.26


Attachments
test case (651.20 KB, application/pdf)
2009-05-25 14:52 UTC, Jean-François Fortin Tam
screenshot (316.92 KB, image/png)
2009-05-25 14:53 UTC, Jean-François Fortin Tam
saved background image from the first page (170.33 KB, image/png)
2009-05-25 20:11 UTC, Jean-François Fortin Tam
saved background image from the first page (226.16 KB, image/png)
2009-05-25 20:14 UTC, Jean-François Fortin Tam
jstor pdf. Drag the background to a new evince and it looks better. (206.75 KB, image/png)
2009-11-12 12:05 UTC, Rune Schjellerup Philosof

Description Jean-François Fortin Tam 2009-05-25 14:50:53 UTC
Certain PDFs have horrible text quality/resolution because they have been scanned and then OCR'ed. Selecting the text makes it much more legible, showing that evince actually has access to a hidden "layer" of OCR'ed, computer-readable text.

I would really love being able to tell evince to just show this layer. I don't give a darn about the original "image" layer of text, it makes things unpleasant to read.
Comment 1 Jean-François Fortin Tam 2009-05-25 14:52:46 UTC
Created attachment 135329
test case

This document has both "text as image" (shown as default) and computer/eye-friendly text (revealed when selected)
Comment 2 Jean-François Fortin Tam 2009-05-25 14:53:27 UTC
Created attachment 135330
screenshot

Comparing selected text (much better) to the text you see when not selected (horrible).
Comment 3 Carlos Garcia Campos 2009-05-25 15:10:11 UTC
They are not layers; the thing is that poppler uses different methods to render text and selected text.
Comment 4 Jean-François Fortin Tam 2009-05-25 16:53:09 UTC
But then why does acroread render the same "ugly" version of the text?
Comment 5 Nickolay V. Shmyrev 2009-05-25 19:13:55 UTC
It's because it does things properly and strictly follows the spec, using the embedded, not very nice fonts :)
Comment 6 Jean-François Fortin Tam 2009-05-25 20:10:25 UTC
Again, sorry for being so clueless about this, but that doesn't sound quite right/match what I'm experiencing; I don't understand how embedded fonts can possibly look so horrible.

Besides, I did a new experiment. I right-clicked the first page in that document, and I had an option to save the image. I saved it to PNG and it looks exactly like the crappy default output that we see. It really looks like it's using an image instead of actual text.

And if there wasn't an "image", I wouldn't have this option to save it in the popup menu anyway.
Comment 7 Jean-François Fortin Tam 2009-05-25 20:11:18 UTC
Created attachment 135341
saved background image from the first page
Comment 8 Jean-François Fortin Tam 2009-05-25 20:14:26 UTC
Created attachment 135343
saved background image from the first page

Whoops, wrong page. Here's the actual first page.
Comment 9 Nickolay V. Shmyrev 2009-05-25 21:27:37 UTC
Hm, we might be wrong indeed. There might be an image with the text below it. But this case is so specific and hard to define that I really wonder what we should do to help here. It could only be a "Try to fix this broken ABBYY FineReader crap" tool item.
Comment 10 Jean-François Fortin Tam 2009-05-26 03:23:14 UTC
Just making sure, did you mean
- add a togglable option to ignore the image layer and show the text data?
- "try to fix ABBYY FineReader"? If that's the case, it's not possible; it's proprietary.
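
To make the first option concrete, here is a minimal sketch of what "ignore the image layer and show the text data" could mean at the level of a PDF page content stream. It is not how evince or poppler work, and nobody in this thread confirmed how the test case is built. It assumes the stream is already decoded to text, that every `Do` operator paints a scanned image (it can also paint other XObjects), and that hidden OCR text, if hidden by rendering mode, uses text rendering mode 3 (`3 Tr`, invisible text in the PDF spec). The function name `show_only_text` and the sample stream are made up for illustration.

```python
import re

# Text rendering mode 3 ("3 Tr") means neither fill nor stroke: invisible text.
INVISIBLE_TEXT = re.compile(r"(?<![\w.])3\s+Tr\b")
# "/Name Do" paints an XObject. Assumption: every XObject here is a scanned image.
PAINT_XOBJECT = re.compile(r"/\S+\s+Do\b")


def show_only_text(content_stream):
    """Drop image painting and switch invisible text to normal fill mode.

    Naive on purpose: it does not parse string literals, so an operator-like
    sequence inside a text string would be rewritten too.
    """
    without_images = PAINT_XOBJECT.sub("", content_stream)
    return INVISIBLE_TEXT.sub("0 Tr", without_images)


# Hypothetical page: a full-page image followed by hidden OCR text.
sample = "q 612 0 0 792 0 0 cm /Im0 Do Q BT 3 Tr /F1 12 Tf 72 720 Td (Hello) Tj ET"
print(show_only_text(sample))
```

If the text is simply drawn underneath the image in normal mode, removing the `Do` call alone would already reveal it; the `Tr` rewrite only matters for text hidden by rendering mode.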
Comment 11 Rune Schjellerup Philosof 2009-11-12 12:05:35 UTC
Created attachment 147567
jstor pdf. Drag the background to a new evince and it looks better.

The reporter posted a PDF with a bad-quality image in the background.
But even if the background image is good quality, it still looks bad until you save it.
It seems all PDFs from jstor.org (a scholarship archive site) have this problem.

I have also posted this image to https://bugs.freedesktop.org/show_bug.cgi?id=5589

A workaround would be greatly appreciated until poppler is fixed (if ever, it is an old bug).
I see this problem quite often.
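
One possible stopgap, sketched here rather than tested against the attached files: pull the embedded page images out at their stored resolution with `pdfimages` from poppler-utils and read those instead. This only mirrors the "save the image and it looks better" observation above; it does not fix the rendering in evince. The helper names and the file names `article.pdf` and `page` are placeholders, and the sketch assumes a `pdfimages` build that supports `-png`.

```python
import shutil
import subprocess


def pdfimages_command(pdf_path, output_prefix, first_page=None, last_page=None):
    """Build a pdfimages command line that writes embedded images as PNG files."""
    cmd = ["pdfimages", "-png"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]  # first page to scan
    if last_page is not None:
        cmd += ["-l", str(last_page)]   # last page to scan
    cmd += [pdf_path, output_prefix]
    return cmd


def extract_page_images(pdf_path, output_prefix, first_page=None, last_page=None):
    """Run pdfimages, failing early if the tool is not installed."""
    if shutil.which("pdfimages") is None:
        raise RuntimeError("pdfimages (poppler-utils) was not found on PATH")
    cmd = pdfimages_command(pdf_path, output_prefix, first_page, last_page)
    subprocess.run(cmd, check=True)


# Show the command for the first page only; nothing is executed here.
print(pdfimages_command("article.pdf", "page", first_page=1, last_page=1))
```

Calling `extract_page_images("article.pdf", "page", 1, 1)` would then write the first page's images next to the PDF, using `page` as the file name prefix.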
Comment 12 GNOME Infrastructure Team 2018-05-22 13:33:13 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/evince/issues/89.