Bug 756805 - most non-ascii characters get rewritten to "fi" in PDF forms
Status: RESOLVED NOTGNOME
Product: evince
Classification: Core
Component: PDF
Version: 3.18.x
Hardware/OS: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Evince Maintainers
QA Contact: Evince Maintainers
Duplicates: 742146
Depends on:
Blocks:
 
 
Reported: 2015-10-19 11:02 UTC by Kamil Páral
Modified: 2015-10-23 09:35 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
example PDF form (187.23 KB, application/pdf)
2015-10-19 11:11 UTC, Kamil Páral
screenshot of character corruption (175.82 KB, image/png)
2015-10-19 11:12 UTC, Kamil Páral

Description Kamil Páral 2015-10-19 11:02:05 UTC
An example PDF form is attached. Evince reports that all fonts are embedded, yet I can't enter many Czech (non-ASCII) characters. Characters that work fine:

ASCII characters
áéíóú

Characters that get rewritten to "fi" instead:
ůěžščřďťň

So if I try to input:
Purkyňova
it becomes:
Purkyfiova

I can reproduce this with many other PDF form files. This bug has been present since Evince started supporting PDF forms; it's not a recent regression.

Thanks for looking into this.

evince-3.18.0-1.fc23.x86_64
poppler-0.34.0-1.fc23.x86_64
poppler-glib-0.34.0-1.fc23.x86_64
poppler-data-0.4.7-4.fc23.noarch
poppler-utils-0.34.0-1.fc23.x86_64
Comment 1 Kamil Páral 2015-10-19 11:11:53 UTC
Created attachment 313655 [details]
example PDF form

I don't know if character corruption occurs in every PDF form, but I've seen it in many and this is one of them.
Comment 2 Kamil Páral 2015-10-19 11:12:52 UTC
Created attachment 313656 [details]
screenshot of character corruption

Screenshot of PDF form filled with ěščřžďťň characters which got rewritten to fififififififi.
Comment 3 Germán Poo-Caamaño 2015-10-20 14:44:38 UTC
*** Bug 742146 has been marked as a duplicate of this bug. ***
Comment 4 Marek Kašík 2015-10-22 14:43:07 UTC
Hi Kamil,

this is actually a bug in poppler, for which several bugs have already been filed. See:
https://bugs.freedesktop.org/show_bug.cgi?id=42944
https://bugs.freedesktop.org/show_bug.cgi?id=36111
https://bugs.freedesktop.org/show_bug.cgi?id=20009

The problem you are facing is well described here: http://stackoverflow.com/a/15973614.
Basically, the font used for drawing the characters doesn't contain the required glyphs (or the encoding specified in the PDF doesn't list them).
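
A rough illustration of the symptom, not poppler's actual code: the characters reported as working (ASCII and áéíóú) all have code points within ISO 8859-1, while the corrupted ones (ůěžščřďťň) do not. The sketch below assumes, purely for illustration, that every character outside that range is drawn with a fallback glyph that shows up as "fi". The function names `is_latin1` and `simulate_form_input` are made up for this example; real glyph lookup depends on the font and encoding stored in the PDF.

```python
import unicodedata


def is_latin1(ch: str) -> bool:
    """Return True if the character lies in ISO 8859-1 (U+0000..U+00FF)."""
    return ord(ch) <= 0xFF


def simulate_form_input(text: str, fallback: str = "fi") -> str:
    """Replace every character outside ISO 8859-1 with a fallback string.

    Assumption: this only mimics the reported symptom; it is not how
    poppler selects glyphs.
    """
    # Normalize to NFC so that e.g. "n" + combining caron counts as one "ň".
    text = unicodedata.normalize("NFC", text)
    return "".join(ch if is_latin1(ch) else fallback for ch in text)


working = unicodedata.normalize("NFC", "áéíóú")
broken = unicodedata.normalize("NFC", "ůěžščřďťň")

print(all(is_latin1(c) for c in working))   # True
print(any(is_latin1(c) for c in broken))    # False
print(simulate_form_input("Purkyňova"))     # Purkyfiova
```

The split matches the lists in the description, which is consistent with an encoding that covers only a Latin-1-like character set, but it does not by itself prove which encoding the attached form uses.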

Another problem here is that even if the font does have the glyph, it is not found correctly. I've filed a bug for this together with patches (see https://bugs.freedesktop.org/show_bug.cgi?id=92597).

The solution to the font problem is not easy, because we would basically need to do what Adobe Reader does: find a font that has the required glyphs and embed it (or a subset of it) in the resulting PDF. This is not implemented in poppler yet and, I guess, won't be anytime soon.

However, this is not an Evince bug, so I'm closing it with resolution NOTGNOME.

Regards

Marek
Comment 5 Kamil Páral 2015-10-23 09:35:05 UTC
Thanks, Marek, for your work.