GNOME Bugzilla – Bug 527222
No information about missing fonts and possible errors in display and or printing
Last modified: 2013-01-05 12:13:08 UTC
[ From http://bugs.debian.org/461499 ] A Debian user reported that Evince doesn't provide information about missing fonts in documents. This might result in a document not being displayed or printed as intended. Acrobat Reader apparently provides this functionality. If my reading of bug 164843 isn't wrong, this functionality was intended to be added to the Properties dialog: "Differentiate which fonts this system has that the document is using and which fonts the system doesn't have which the document wants. Display Name, Type, Unicode, Embedded, and Subset." It might also be a good idea to immediately display a warning about missing fonts and that a fallback font was used and as a result the document may not look/print as intended.
*** Bug 616084 has been marked as a duplicate of this bug. ***
I was looking at how to fix this. My idea is to display a GtkInfoBar when a font is missing. To get font info for the document, I need to call ev_document_fonts_scan, then ev_document_fonts_fill_model, and add a boolean column to the model, set to TRUE if poppler_fonts_iter_get_substitute_name() returns non-NULL. Alternatively, I could add a new virtual method that returns a list of missing fonts; the implementation (for PDF) would be similar to pdf_document_fonts_fill_model. I'm new to the evince codebase, so I'm not sure this is the best way. I can work on a patch if the evince devs confirm this is a good solution.
Example of PDF with missing fonts, useful for testing when we have a patch: http://www.emesystems.com/pdfs/topboard.pdf (taken from bug #616084)
Xavier, what would be the definition of a missing font? If missing fonts are non-embedded fonts for which poppler didn't find a substitute font, then I think it would be fine to have a method that asks for missing fonts, and to use it to display an infobar in that case.
Correct me if I'm wrong, but my understanding is that fonts can be either:
1) embedded
2) found on system
3) not found on system, fallback to some other system font

So 1 and 2 are fine, but in 3 we should tell the user that the document won't use the intended font. Is it possible that evince doesn't find a substitute, or will it always fall back to the system's document font in the worst case?
There is 4) not found and no fallback. I am not really sure about warning in 3)... For instance, a lot of documents, like the test file you attached, just have "helvetica", which will not be found on most systems... But most systems do have a proper replacement that should not affect the rendering, so in these cases I wouldn't like to have a warning like the one you are proposing... Anyway, it is a subtle thing; right now in 3.6 things are much better, since we are telling the user which font was used as the substitute.
Hmm, so maybe it needs a whitelist of fallbacks that are perfect replacements and so don't need a warning? Why is there a 4)? Won't it always fall back to a system font, like "Sans" or "Dejavu"?
You are probably right and there is no 4)... I was thinking of problems with Japanese or other fonts like that... but I think I was mistaken, as I presume those are problems with poppler-data not being installed.
Actually, I did not think about non-Latin alphabets, indeed. I reckon I don't know enough about this to know whether there is a 4 or not. IMO if there is a 4, we clearly should warn the user. Case 3 seems to be more case-by-case, depending on whether the fallback is visibly different or not, and that can only be done with a whitelist of fonts known to be acceptable fallbacks. Right?
Bug 675925 added some information to the font properties dialog about substitute fonts.

(In reply to comment #5)
> Correct me if I'm wrong but my understanding is fonts can be either:
> 1) embedded
> 2) found on system
> 3) not found on system, fallback to some other system font

There are only two options:
1) embedded
2) not embedded, use the fallback font supplied by fontconfig.

I assume you want to know if the substitute font is not a good match based on either the font name or charset. Poppler would need additional API to report this information.
Shouldn't there be three different cases?
1) embedded
2) one of the 14 PDF fonts that do not need to be embedded
3) a non-standard font that is not embedded although it should be. show warning to user

"These fonts are sometimes called the base fourteen fonts. These fonts, or suitable substitute fonts with the same metrics, must always be available in all PDF readers and so need not be embedded in a PDF"
-- http://en.wikipedia.org/wiki/Portable_Document_Format
The issue seems to be the third case: "a non-standard font that is not embedded although it should be. show warning to user".

The problem I see is: What can the user do? What do we expect from them?

If I open the file linked in comment #3 in Acroread 9, I do not get any warning. It seems that Acroread renders the document better than Evince, but that might be because it matches the document with different fonts:

Geneva -> Adobe Sans MM, Type 1
Helvetica -> ArialMT, TrueType
TektonMM_623_wt_718_wd -> Adobe Sans MM, Type 1

In Evince the fonts I get replaced are:

Geneva -> DejaVu Sans
Helvetica -> Nimbus Sans L
TektonMM_623_wt_718_wd -> DejaVu Sans

However, in both cases I get the information in a similar place (Properties/Fonts). Where else should I look?
(In reply to comment #12)
> The issue seems to be the third case: "a non-standard font that is not embedded
> although it should be. show warning to user".
>
> The problem I see is: What can the user do? What do expect from them?

If fontconfig is not returning the best match the user should adjust their fontconfig settings to improve the match.
(In reply to comment #13)
> (In reply to comment #12)
> > The issue seems to be the third case: "a non-standard font that is not embedded
> > although it should be. show warning to user".
> >
> > The problem I see is: What can the user do? What do expect from them?
>
> If fontconfig is not returning the best match the user should adjust their
> fontconfig settings to improve the match.

Not an evince bug.

Moreover, the commit that implements the font substitution is:
http://git.gnome.org/browse/evince/commit/?id=c8b3fd2c
which is already available in 3.6.0 (you provided the patch).

IMVHO, that commit is enough to consider this bug solved. If there is something to improve, it is the font matching. However, that does not fall within evince's field.

A warning message (even in the info bar) could become annoying. Several matches are good enough in spite of lacking exact metrics, and the user might not want to install every single font required to read PDFs, maybe only the ugly ones. The font might not even be available for free or at a reasonable price. So having a bar pestering the user with no real information on how to solve the problem can be worse.
The user can complain to whoever made the document and let them know that it does not conform to the PDF specification?
(In reply to comment #15)
> The user can complain to whoever made the document and let them know that it
> does not comform to the PDF specification?

The specification that requires font embedding is PDF/A (PDF for Archiving), which is not necessarily the case here. The user can still check the properties dialog, as they would when using Acroread.

Sometimes a font is used for a specific paragraph, or even a single symbol. Even when the font is not available, there are several situations where that is not a big deal. A warning would make the situation worse, especially for people who consume dozens or hundreds of PDFs a week.
Ok, I didn't know that this was only true for PDF/A. Since users don't know what the 14 fonts are, they can't figure out whether the properties dialog lists some other fonts. In any case, font issues are the largest cause of complaints that I hear. Usually users blame evince and switch to Adobe Reader without knowing how to diagnose the issue further.
(In reply to comment #17)
> Ok I didn't know that this was only true for PDF/A.
>
> Since users don't know what the 14 fonts are they can't figure out if property
> dialog lists some other fonts.

FWIW, the properties dialog did not previously provide information about font substitution.

> In any case, font issues are the largest cause of complains that I hear.
> Usually users blame evince and switch to adobe reader without knowing how to
> diagnose the issue further.

It is an issue that improves any time poppler improves. As I said, there are many documents where the font substitution is not a big deal, even with different metrics. So the question is: how can we make that information easier for the user to discover? I do not think that a warning using the GtkInfoBar or a message dialog is the solution, because they can be very disruptive in most cases.
Amen to the different time zones!

@Adrian, thanks for the clarifications. I was looking a bit at the fontconfig API and it looked to me that when you do a FcSubstitute, you will always get results. Do you know if there is a way in Fontconfig to know whether the substitute is good or not?

@Xavier, as for having a whitelist to decide when to show the infobar, I don't think it is feasible at all (think different distros), and it would be a nightmare to maintain.

@German, I agree with most of your points. I think the best way of fixing this, unless we can measure how good a substitute is (which I really doubt is feasible, but I have no idea), is to improve the documentation on the new Font dialog features. Currently the user help has no information about this feature, so users who don't know about PDF fonts not being embedded have no clue how to diagnose or solve a PDF font problem. So what we should (at least) do is attempt to write a new help entry on "My PDF is not rendered correctly" with all the possible problems you might have and some tips to fix them.
I don't know much about fontconfig. Details of the font matching are at:
http://www.freedesktop.org/software/fontconfig/fontconfig-user.html

I did not see anything in the API that would indicate how good the match is. You could compare the PDF font name with the substitute font name but this will not always work. For example the Ghostscript base 35 fonts distributed with ghostscript are a very good match for the PDF base 14 fonts but they have names like "Nimbus Sans L" (equivalent to "Helvetica" in PDF).

I think the best solution would be based on what Timo Lindfors suggested in comment 11, i.e. 3 options:

- Font is embedded.

- Font not embedded but is one of the base 14 fonts. Evince must have suitable substitutes for the base 14 to comply with the PDF standard. I think it is safe to assume that an exact match will always be found. Distros should make suitable base 14 substitute fonts a dependency of Evince.

- Font not embedded and not one of the base 14. In this case there is no guarantee that an exact match can be found. Even if a font of the same name is found it may have different metrics or be missing some glyphs. It would be useful to provide some indication to the user that the PDF may not be rendered correctly. Hopefully this might also help discourage creating PDFs with non embedded non base 14 fonts.
Created attachment 230019
Evince fonts dialog

This is what I see in the Evince fonts dialog with the file from comment #3. It already says whether the fonts are embedded and shows their substitutions. It does not say whether the fonts belong to the 14 base fonts.

In backend/pdf/ev-poppler.cc, pdf_document_fonts_fill_model() could just compare the string returned by poppler_fonts_iter_get_name() with the 14 font names. Poppler does not export the 14 font names, but they are defined e.g. in poppler/GfxFont.cc:

static const char *base14SubstFonts[14] = {
  "Courier",
  "Courier-Oblique",
  "Courier-Bold",
  "Courier-BoldOblique",
  "Helvetica",
  "Helvetica-Oblique",
  "Helvetica-Bold",
  "Helvetica-BoldOblique",
  "Times-Roman",
  "Times-Italic",
  "Times-Bold",
  "Times-BoldItalic",
  "Symbol",
  "ZapfDingbats"
};

Should we reuse that list and add a note "This font is not one of the Standard 14 Fonts, the substitution might be imperfect" when the font is not embedded?

I also think it is better to keep that information in the Properties/Fonts dialog rather than with a GtkInfoBar.
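The name comparison proposed above could be sketched roughly as follows. This is only an illustration in plain C, not the code from the attached patch: the helper name font_name_is_base_14 and the array name are hypothetical, and the sketch assumes the name has already been obtained as a string (for example from poppler_fonts_iter_get_name()). It does an exact, case-sensitive match only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Names copied from the list quoted in this comment (poppler/GfxFont.cc). */
static const char *const base_14_font_names[] = {
    "Courier", "Courier-Oblique", "Courier-Bold", "Courier-BoldOblique",
    "Helvetica", "Helvetica-Oblique", "Helvetica-Bold", "Helvetica-BoldOblique",
    "Times-Roman", "Times-Italic", "Times-Bold", "Times-BoldItalic",
    "Symbol", "ZapfDingbats"
};

#define N_BASE_14 (sizeof base_14_font_names / sizeof base_14_font_names[0])

/* Hypothetical helper: returns 1 if name exactly matches one of the
 * Standard 14 font names, 0 otherwise (including for a NULL name). */
int
font_name_is_base_14 (const char *name)
{
    size_t i;

    /* Guard against an accidental edit of the list above. */
    assert (N_BASE_14 == 14);

    if (name == NULL)
        return 0;

    for (i = 0; i < N_BASE_14; i++) {
        if (strcmp (name, base_14_font_names[i]) == 0)
            return 1;
    }

    return 0;
}
```

With this sketch, "Helvetica" matches while "Geneva" and the lower-case "helvetica" do not; whether a case-insensitive comparison would be more appropriate is left open here.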
Created attachment 230030
[PATCH] font properties: say whether non-embedded fonts are standard
Created attachment 230031
Evince fonts dialog with the previous patch applied
Review of attachment 230030:

Patch looks good to me. Adrian, are you ok with the patch too?

::: backend/pdf/ev-poppler.cc
@@ +1124,3 @@
+ else
+ embedded = _("Not one of the Standard 14 Fonts\n"
+ "Not embedded, the substitution might be imperfect");

I wonder if we should show only one of the two things: either whether it's one of the standard fonts, or whether the substitution might not be accurate. alban suggested on IRC to simply say whether it's one of the 14.
(In reply to comment #24)
> Review of attachment 230030:
>
> Patch looks good to me. Adrian, are you ok with the patch too?

The problem is the user needs to open the fonts dialog and scroll through the list (which could potentially be very long) to identify any issues with rendering the pdf.

I suggest displaying a message at the top of the fonts dialog warning of any non embedded non standard fonts. The changes to the font list could then be simplified to only change the font type line to indicate standard 14 fonts, e.g.

Helvetica
Type 1 (Standard 14 Font)
Encoding: Custom
Not embedded, substituting with Nimbus Sans L (/usr/share/fonts ...)

Note that the Standard 14 fonts are all Type 1 fonts. A non embedded TrueType font with the same name is not a Standard 14 font.

There is still the problem that the pdf may not be rendered correctly but the user is unaware of this unless they check the font dialog of every pdf they open.
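The rule in this comment (only a non-embedded Type 1 font whose name is in the Standard 14 list counts as standard) can be sketched in plain C as below. All names here (FontStatus, classify_font, format_type_line) are hypothetical and chosen for illustration; the three inputs are assumed to be already known for each font, and the label text follows the example shown above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical three-way status for one font in the document. */
typedef enum {
    FONT_EMBEDDED,      /* the PDF carries the font data itself        */
    FONT_STANDARD_14,   /* not embedded, Type 1, name in the base 14   */
    FONT_NON_STANDARD   /* not embedded and not a Standard 14 font     */
} FontStatus;

/* name_in_base_14 is assumed to come from a lookup against the
 * Standard 14 name list; it is passed in to keep the sketch small. */
FontStatus
classify_font (int is_embedded, int is_type1, int name_in_base_14)
{
    if (is_embedded)
        return FONT_EMBEDDED;
    if (is_type1 && name_in_base_14)
        return FONT_STANDARD_14;
    return FONT_NON_STANDARD;
}

/* Writes the font type line into buf, appending " (Standard 14 Font)"
 * only for FONT_STANDARD_14. Returns the value returned by snprintf. */
int
format_type_line (char *buf, size_t size, const char *type_name,
                  FontStatus status)
{
    assert (buf != NULL && type_name != NULL);

    if (status == FONT_STANDARD_14)
        return snprintf (buf, size, "%s (Standard 14 Font)", type_name);

    return snprintf (buf, size, "%s", type_name);
}
```

Under these assumptions a non-embedded TrueType font named "Helvetica" falls into FONT_NON_STANDARD, which is the case the comment calls out, and its type line stays a plain "TrueType".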
(In reply to comment #25)
> [...]
> There is still the problem that the pdf may not be rendered correctly but the
> user is unaware of this unless they check the font dialog of every pdf they
> open.

This is tricky because it is hard to know when a font substitution makes a document look ugly.

At least we have a list of ugly fonts, but we would still need to know how frequently the font is used in the document. If an ugly font is used for one symbol in a 200-page document, it is not a big deal. But if the ugly font is used throughout the whole document, then this becomes an issue.
(In reply to comment #26)
> At least we have a list of ugly fonts, but still we would need to know how
> frequent the font is used in the document. If an ugly font is used for one
> symbol in a 200 pages document is not a big deal. But if the ugly font is used
> in the whole document, then this becomes an issue.

The poppler-glib api does not provide this information.

I would suggest just using a GtkInfoBar to indicate if there are non embedded non standard fonts. This will allow users who care about accurate rendering to know whether they can trust the rendering of the pdf they opened. As long as the GtkInfoBar has a "don't show this again" option I don't think it will annoy anyone.
Created attachment 231178
[PATCH] font properties: say whether fonts are pdf standard

Updated patch:
- Don't say it is a standard font if it is not a Type 1 font
- Add a "summary" label before the tree view in the font dialog. Backends can return a summary line. The PDF backend says whether some fonts are non-standard without being embedded.

(still no GtkInfoBar in this patch)
Created attachment 231179
Evince fonts dialog, with missing fonts, with the last patch applied
Created attachment 231180
Evince fonts dialog, with all fonts, with the last patch applied
+ *summary = _("Some non-standard fonts are missing. "
+ "Font substitution might be imperfect.");

I suggest making the message more informative. Something like:

"This document contains non-embedded fonts that are not from the PDF Standard 14 fonts. If the substitute fonts selected by fontconfig are not the same as the fonts used to create the PDF, the rendering may not be correct."

+ } else {
+ standard_str = _("Not one of the Standard 14 Fonts");

I recommend leaving standard_str empty for this case. Stating that an embedded font is not one of the standard 14 suggests there is something wrong with it.
Created attachment 231372
[PATCH] font properties: say whether fonts are pdf standard

Thanks for your review Adrian.

Updated patch:
- text in parentheses is either non-existent for embedded fonts, or specifies whether the font is standard for non-embedded fonts.
- better summary line as suggested by Adrian.
- summary label is multiline

I noticed a problem with the patch as it is (hence the status "needs-work"):
- fill_model() from the EV_TYPE_DOCUMENT_FONTS interface can be called several times: fonts are loaded by iteration. So having a local variable "missing_fonts" to keep a global state of the document does not work. It could generate a wrong summary label on big documents.
Review of attachment 231372:

Patch looks good, but we should try not to break the API if possible. Thanks.

::: backend/pdf/ev-poppler.cc
@@ +1052,3 @@
+{
+ /* list borrowed from Poppler: poppler/GfxFont.cc */
+ static const char *base14SubstFonts[14] = {

We use underscores for variable names in evince; this should be base_14_subst_fonts

@@ +1085,3 @@
 pdf_document_fonts_fill_model (EvDocumentFonts *document_fonts,
+ GtkTreeModel *model,
+ const gchar **summary)

The problem is that this breaks the API. Instead of this, we could save whether there are missing fonts, and add a new method pdf_document_fonts_get_fonts_summary() or something like that.

@@ +1089,3 @@
 PdfDocument *pdf_document = PDF_DOCUMENT (document_fonts);
 PopplerFontsIter *iter = pdf_document->fonts_iter;
+ gboolean missing_fonts = FALSE;

You could save this in the private struct

@@ +1137,3 @@
+ */
+ standard_str = _(" (One of the Standard 14 "
+ "Fonts)");

I think it's a bit clearer if this is just one line.

@@ +1145,3 @@
+ */
+ standard_str = _(" (Not one of the Standard 14"
+ " Fonts)");

Ditto.

@@ +1183,3 @@
+ " rendering may not be correct.");
+ else
+ *summary = _("All fonts are either standard or embedded.");

This block would be the implementation of pdf_document_fonts_get_fonts_summary()

::: shell/ev-properties-fonts.c
@@ +201,3 @@
+ if (font_summary)
+ gtk_label_set_text (GTK_LABEL (properties->fonts_summary),
+ font_summary);

This is updated every time the fonts are updated. With a separate method, you would only call ev_document_fonts_get_fonts_summary when the job has finished. That way we would also avoid the label changing while the fonts are being scanned, which would be a bit weird.
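The approach suggested in this review (keep the "missing fonts" flag in per-document state instead of a local variable, and read the summary through a separate call once scanning is done) might look roughly like the sketch below. It is plain C with hypothetical names (FontScanState, FontInfo and the font_scan_state_* functions) standing in for the backend's private struct and the proposed summary method; it does not use the real Evince or poppler-glib types. The summary strings are taken from the texts quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the backend's private per-document state. */
typedef struct {
    int missing_fonts;   /* sticky: stays set once any batch reports a problem */
} FontScanState;

/* Hypothetical per-font facts, assumed to be known when a batch is filled. */
typedef struct {
    int is_embedded;
    int is_standard_14;
} FontInfo;

void
font_scan_state_init (FontScanState *state)
{
    assert (state != NULL);
    state->missing_fonts = 0;
}

/* Called once per batch of fonts, mirroring fill_model() being called
 * several times while the document is scanned incrementally. The flag
 * is only ever set here, never cleared, so earlier batches are kept. */
void
font_scan_state_add_batch (FontScanState *state,
                           const FontInfo *fonts, size_t n_fonts)
{
    size_t i;

    assert (state != NULL);

    for (i = 0; i < n_fonts; i++) {
        if (!fonts[i].is_embedded && !fonts[i].is_standard_14)
            state->missing_fonts = 1;
    }
}

/* Meant to be called once, after the last batch has been added. */
const char *
font_scan_state_get_summary (const FontScanState *state)
{
    assert (state != NULL);

    if (state->missing_fonts)
        return "This document contains non-embedded fonts that are not "
               "from the PDF Standard 14 fonts. If the substitute fonts "
               "selected by fontconfig are not the same as the fonts used "
               "to create the PDF, the rendering may not be correct.";

    return "All fonts are either standard or embedded.";
}
```

Because the flag lives in the state struct, a problem font seen in the first batch still shows up in the summary after later, clean batches, which is the failure mode of the local variable described in the previous comment.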
Created attachment 232727
[PATCH] font properties: say whether fonts are pdf standard

Updated patch with fixes from the review.
Comment on attachment 232727
[PATCH] font properties: say whether fonts are pdf standard

Pushed with some fixes. Thanks!