GNOME Bugzilla – Bug 733525
PDF loader: Katakana characters show up in File->Open preview, but not in the loaded image
Last modified: 2014-10-02 12:30:11 UTC
Created attachment 281352: Katakana character chart
The Katakana characters to the left of the Latin ones in the attached PDF chart show up in the File->Open preview image (small, but recognizable), but are absent from the final imported image. This might be specific to Windows (and 32-bit), as other people have reported this as working for them, either on Linux distros or without specifying their platform.
For info, it is indeed working here on a Linux distribution. I will test maybe later on Windows, unless someone wants to test first. :-)
I confirmed the bug on a freshly compiled master branch on my Windows 7 VM. Same as Michael, I think I see the katakana in the small preview, but nothing in the final render on canvas.
Seems to be a poppler bug. I could reproduce the problem with a basic pdf viewer written with poppler. The same things happen when cross-compiling this code and running it under my Windows 7 VM. So I opened a bug report on poppler tracker: https://bugs.freedesktop.org/show_bug.cgi?id=81746
Well, it turns out the problem is just that we are likely missing the poppler-data package (when I installed it in my local cross-build, I got the Japanese characters). Michael, did you encounter the issue with the official release? In any case, I am proposing to the poppler people that they add a pkg-config file for their data package. This way, we can at least output a warning at configure time (if not make it a strong dependency, should we consider that being able to handle Asian characters must not be optional) so that our Windows packager is aware of the issue.
OK, so I added poppler-data as an optional dependency in the INSTALL file. I am not closing this report yet, though, because I am waiting to see whether the upstream poppler project accepts my patch adding a pkg-config file to their poppler-data package. If they do, we will be able to add a warning in our configure script to warn packagers.
commit 9c2dbf46c588c263ec4c7ffa1b657dfb0abb9d75
Author: Jehan <jehan@girinstud.io>
Date: Mon Jul 28 17:16:40 2014 +0000
    Bug 733525 - Japanese characters not rendered from imported PDF.
    poppler-data is an optional encoding package, necessary to import PDF with CJK and cyrillic text.
I think that adding a warning in our configure script is probably going a little overboard. It's not really the configure script's job to babysit packagers and determine whether all of our dependencies were built with every possible optional feature enabled. I think it should be enough to just ping all of the builders on irc and make sure we ship these files. My builds have had the data files since last november, but it appears that the osx build, and ender's build are both missing them right now.
I don't agree. I think the fontconfig configuration bugs are the perfect example of what goes wrong when there are no messages in the centralized place that is the configure script. I posted messages on the mailing lists, I asked 10 times on IRC how to make sure that our builders would include the right versions (or fix the config themselves), I wrote messages in the various reports in the bugtrackers, and in the end? Well, if I am not mistaken, we had 2 official Windows releases with the many fontconfig bugs, even though we already knew the solutions (and apparently some people had known the solution for even longer!). So no, from experience, I can say that "ping[ing] all of the builders" is far from enough.
Now we have a warning about fontconfig when compiling for win32 with an old fontconfig. And I think we should do the same for poppler-data, which is basically a similar issue (it's not a code bug, but definitely a build bug). Moreover, I don't see why data packages should be treated differently from a library. With it, we have a feature; without it, we don't.
This does not necessarily have to be a warning. It can take the form of an added feature in the final summary after a ./configure. Something like:
Optional Plug-Ins:
  [...]
  PDF (import): $have_poppler (cyrillic and CJK support: $have_poppler_data)
  PDF (export): $have_cairo_pdf
Fixed for the OS X build:
commit 6fa627d47d0e57afe9846c4f72483e8377db3ede
Author: Sven Claussner <sclaussner@src.gnome.org>
Date: Tue Aug 5 21:15:02 2014 +0200
    Bug 733525 - Add poppler-data package to the OS X build
    Add poppler-data package to the OS X build.
    **** Bugfix: show Cyrillic and Asian (CJK) characters in PDF files.
 build/osx/gimp.modules | 8 ++++++++
 1 file changed, 8 insertions(+)
Mike Henning already fixed it in the Windows nightly builds. Offhand, I neither remember nor can find a FontConfig discussion on the mailing list, but in general I think it's better to use the mailing list just to catch all necessary people.
Hi Sven,
There have been so many threads (sometimes quite long!) about these specific fontconfig issues:
- Sept 2013: we were searching for the cause of the problem which appeared in GIMP 2.8.6 (apparently fine in 2.8.4): https://mail.gnome.org/archives/gimp-developer-list/2013-September/msg00117.html This is where drawoc told us that the cause was actually already known and fixed in the nightly build: https://mail.gnome.org/archives/gimp-developer-list/2013-September/msg00124.html
- Oct 2013: I asked for the 2.8.8 release specifically because of these fontconfig bugs: https://mail.gnome.org/archives/gimp-developer-list/2013-October/msg00015.html Well, 2.8.8 ended up being released without the fixes.
- Nov 2013: I wrote again about a fontconfig-related patch to be used while waiting for upstream: https://mail.gnome.org/archives/gimp-developer-list/2013-November/msg00056.html
- Later the same month, when 2.8.10 was already being prepared, I asked again not to forget the patches: https://mail.gnome.org/archives/gimp-developer-list/2013-November/msg00081.html
- Finally, I went a step further and tried to propose a release procedure where the packagers could follow some steps so as not to forget things (citing specifically this fontconfig problem). I prepared wiki pages for packagers to fill in; it made a long thread and several people (you included, Sven!) answered that it was a very good idea: https://mail.gnome.org/archives/gimp-developer-list/2013-November/msg00082.html
Again, 2.8.10 got released a few days later. The bugs were still there. No patches had been applied to the official release, even though some of them had been applied in the nightly builds for eons. The wiki pages about a release procedure never got filled in (check the wiki history: no edit since the original creation. We may as well delete them now). In all these threads, there have been emails from most of the main contributors of GIMP, from Mitch to you, as well as schumaml, drawoc, etc.
But we still got 2 broken releases, even though the fixes were known. So we definitely "caught" all necessary people, but it proved not to be enough. Now I have patched and pushed the patches upstream so that we could add a configure warning (which I have now added). Hopefully this will be enough. Time will tell. Also, the fact that you don't remember these many discussions, even though you participated in most of them, really makes my point when I say that the bugtracker is a much better place to keep track of discussions than mailing lists! :P
What we did in similar cases in the past was to require a recent-enough version of a library or utility for the affected platforms only (if that library version had been released already).
Reminding others not to forget things is not exactly a guarantee for success :) Anyway, all this stuff is only in mail or in bugzilla, or on some wiki page. Unless we have a FIXED wiki page about TODOs before the next release, and that wiki page is mentioned in devel-docs/release-howto.txt, I'm afraid not much will change.
> Unless we have a FIXED wiki page about TODOs before the next release, and that wiki page is mentioned in devel-docs/release-howto.txt, I'm afraid not much will change.
I agree, but devel-docs/release-howto.txt deals only with the source release, which has no such dependency issues. I think we should have a similar step-by-step procedure doc for each of the binary releases that we do officially (i.e. Windows and OSX). And I have actually asked for these many times too. I see there is a build/osx/README, though I'm not sure whether that's the procedure to make the release binary or just a dev version. In any case, I still see nothing for Windows. I'd like to be able to build a Windows setup, but I have no idea how to use these files in build/windows/! Would it be possible to have the package maintainers write devel-docs/(windows|osx)-release-howto.txt files which would be as accurate as the source howto? This would be awesome!
+1
(In reply to comment #12)
> I see there is a build/osx/README though I'm not sure that's the procedure to make the release binary, or just a dev version.
Hmm, the OS X README distinguishes between 2.8 and master. What information are you missing? As this goes off-topic from the original issue, let's discuss it separately (in Bugzilla, if you want).
Jehan, in your mail to me you asked:
> Well I know that we have a build/osx/README. [...] Though as a side note, it seems to make .bundle files, though on our download page, we have .dmg file. What is the difference?
No, the script doesn't create bundle files; they are already there in the build/osx directory (see them in the 2.8 branch for now). These bundle files control gtk-mac-bundler's behaviour when creating an App directory (GIMP.app). This App directory is a self-contained GIMP application with all dependencies. The App directory can't usually be shipped as is, because it is a directory, and is therefore packed into a DMG file.
Upstream accepted my patch, pushed it and bumped their version to 0.4.7, so I modified the configure script. This is only an informational change, not a requirement. Basically, in the ./configure summary, instead of:
  pdf import: yes
we will have:
  pdf import: yes (Cyrillic and CJK support: yes)
or:
  pdf import: yes (Cyrillic and CJK support: no or poppler-data < 0.4.7)
commit f212c9bfc2cbdc99d708c5620a4da44aa887c96e
Author: Jehan <jehan@girinstud.io>
Date: Wed Aug 13 00:47:12 2014 +0000
    Bug 733525: check presence of poppler-data (informational only).
    As of version 0.4.7, poppler-data has a pkg-config file, allowing us to verify its presence. The configure summary is only informational, and we don't impose this version since older versions may still work. Moreover poppler-data is only a runtime dependency, so you can also add it afterwards.
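For illustration only, here is a minimal Python sketch of the same informational check (the real change lives in the configure script, which is not reproduced here). It assumes the pkg-config module is named poppler-data, and the function names have_poppler_data and pdf_import_summary are hypothetical helpers invented for this example.

```python
import subprocess


def have_poppler_data(min_version="0.4.7", run=subprocess.run):
    """Return True if pkg-config reports poppler-data >= min_version.

    Assumption: the module name known to pkg-config is "poppler-data".
    `run` is injectable so the check can be exercised without pkg-config.
    """
    try:
        result = run(
            ["pkg-config", f"--atleast-version={min_version}", "poppler-data"],
            check=False,
        )
    except FileNotFoundError:
        # pkg-config itself is not installed: we cannot confirm anything.
        return False
    return result.returncode == 0


def pdf_import_summary(have_poppler, have_data, min_version="0.4.7"):
    """Build the informational summary line; never fails the build."""
    if not have_poppler:
        return "pdf import: no"
    detail = "yes" if have_data else f"no or poppler-data < {min_version}"
    return f"pdf import: yes (Cyrillic and CJK support: {detail})"


if __name__ == "__main__":
    print(pdf_import_summary(True, have_poppler_data()))
```

A "no" here only says that the build environment could not confirm poppler-data 0.4.7 or later; an older poppler-data without a .pc file may still be installed and working.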
This is something that can change at runtime, right? If people remove or install the poppler-data package, I mean. So checking for it in this way is nice for telling a package builder whether their build environment is complete, but it says nothing whatsoever about support for it in the users' environments. We probably need a check at runtime for this in the poppler-using plug-in(s).
Yes, this is a runtime dependency. You are right in saying that users can install or remove it at any time after installation (and this will transparently change the behavior for Cyrillic/CJK). Nevertheless, a runtime check may not be possible, because packagers usually separate end-user files from development files (the common -devel package). The .pc file is commonly considered a dev file, and pkg-config itself is also not normally installed by users. That makes it very hard to check in a finalized "everyday" installation, other than by importing a PDF with such a character and checking the output, of course. This last solution could actually be done internally (for instance, loading in the background a very small known PDF with a single character whose expected rendered bitmap we know, then comparing the result without giving it a canvas), but is the bother worth it? I think we just have to go with the current solution. On OSX and Windows, the user will normally never touch the dependencies anyway, so for them whatever we tell the package builder is what matters. As for Linux, the package maintainer will decide whether to make it a hard dependency or not, or else maybe add a "recommends"/"enhances" flag in package systems which have such a feature (.deb packages, or Gentoo ones, for instance).
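As a rough sketch of the "try and compare" idea, assuming a hypothetical render_probe callable that renders the small known PDF through poppler and returns a flat grayscale pixel buffer (nothing here is an existing GIMP or poppler API):

```python
def matches_reference(pixels, reference, max_mismatch=0.02):
    """Compare two flat grayscale buffers, tolerating a small mismatch ratio.

    max_mismatch is an arbitrary illustrative tolerance for antialiasing
    differences between poppler versions.
    """
    if not reference or len(pixels) != len(reference):
        return False
    mismatched = sum(1 for a, b in zip(pixels, reference) if a != b)
    return mismatched / len(reference) <= max_mismatch


def cjk_support_available(render_probe, reference):
    """Render a known one-character probe PDF and compare with the expected bitmap.

    render_probe is a hypothetical callable standing in for "load the probe
    PDF with poppler and return its pixels"; any failure counts as "no".
    """
    try:
        pixels = render_probe()
    except Exception:
        return False
    return matches_reference(pixels, reference)
```

If the glyph is missing, the probe renders as a blank page and the comparison fails, which is the case a plug-in would report to the user.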
The runtime check would not check for the availability of .pc or related development files, but whether poppler is capable of rendering those letters. If the API for this in poppler?
*If there's
I'm not familiar with the poppler API, but after a quick skim, it doesn't look like there is such a function: https://developer.gnome.org/poppler/0.24/ Now, there are ways (like the "try and compare result" approach that I mentioned earlier), but I'm personally not interested in going that far. So I won't implement this, at least not now. I understand the need though. So maybe let's reopen and see if someone wants to implement this some time later? Should we reopen?
This works fine with the 2.8.14 installer now.
Not fixed with GIMP 2.8.14 for OS X (the new official DMG uploaded today, tested on OS X 10.7.5): The Katakana characters to the left of the Latin ones are shown in the preview, but are missing in the final image. The same issue was encountered with (self-contained, relocatable) OS X packages of Inkscape: the files of the poppler_data package were not found even if included (copied) into the app bundle (no relocation support in poppler for poppler_data on OS X). On systems where the app bundle is created, the failure is only noticeable after renaming/removing the build env (the prefix where poppler and poppler_data is installed to). I'm not a core Inkscape developer myself, but here's how I made it work for Inkscape.app: http://bazaar.launchpad.net/~inkscape.dev/inkscape/osx-packaging-update/revision/13546 (Note: the branch is pending for merge, and has not yet been reviewed by core inkscape devs)
Oh well, since OSX uses self-contained programs (with all dependencies), I guess it makes sense that it is broken. I believe the problem to be more in libpoppler than in GIMP. We can see in poppler/GlobalParams.cc that they special-case win32 by replacing the value of POPPLER_DATADIR with the return value of get_poppler_datadir() (which builds a new path relative to the binary path). Well, they should probably do something similar for OSX then. I would write the patch, as it should be quite straightforward. Unfortunately I have no OSX machine to test such a patch. If someone could open a ticket there and link it here, this would be nice. In the meantime, a patch similar to what you propose for Inkscape would be welcome. This GlobalParams() thing does indeed seem to be the right workaround. I don't think an intermediary environment variable is needed though. Just build the path relatively, assuming everything is in the same prefix inside the OSX package (which, I guess, should be the case all the time, unless packagers like complicated setups, no?).
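To make "build the path relatively" concrete, here is a small Python sketch of the path arithmetic only. The layouts are assumptions for illustration: a conventional prefix with the binary in bin/ and the data in share/poppler, and a hypothetical app bundle layout with the binary in Contents/MacOS and the data under Contents/Resources. The function name relocated_poppler_datadir is invented for this example and is not poppler's get_poppler_datadir().

```python
from pathlib import PurePosixPath


def relocated_poppler_datadir(executable, levels_up=1, relative=("share", "poppler")):
    """Derive the poppler data directory from the location of the binary.

    levels_up=1 maps <prefix>/bin/<exe> to <prefix>; the data directory is
    then <prefix>/<relative...>. Both defaults are assumptions about the
    package layout, not something read from poppler.
    """
    exe = PurePosixPath(executable)
    prefix = exe.parents[levels_up]  # parents[0] is the directory holding the binary
    return prefix.joinpath(*relative)


# Conventional prefix layout (assumed):
print(relocated_poppler_datadir("/opt/gimp/bin/gimp"))
# Hypothetical app bundle layout:
print(relocated_poppler_datadir(
    "/Applications/GIMP.app/Contents/MacOS/gimp",
    relative=("Resources", "share", "poppler"),
))
```

Because the result depends only on where the binary sits, moving or renaming the whole package keeps the data reachable, which is the property the hard-coded build prefix lacks.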
It appears that my earlier comment wrt GIMP 2.8.14 for Mac OS X was based on a wrong assumption: on closer inspection, the content of the poppler_data package is not actually included in the latest GIMP application bundle. AFAIU, commit 6fa627d47d0e57afe9846c4f72483e8377db3ede (referenced in comment 8) only added it as a build dependency (build/osx/gimp.modules), but not to the gtk-mac-bundler files (build/osx/gimp-*-python.bundle). Whether or not GIMP.app requires further changes (similar to Inkscape.app) to support relocation of poppler's data files needs to be tested once app bundles which include the poppler_data files are available.
OK, well, if someone could:
- check if poppler-data is indeed in our OSX release;
- if not, make a new package with it and test if it is enough to fix the CJK bug;
- if that's enough, well, we could update the OSX release; if not, make a patch for the workaround + a bug report to libpoppler.
People, this has been reported against the Windows installer. Please open a new bug if it happens elsewhere; you forgot to change the assignee, for example. Actually, the whole handling of this in the build system would have deserved its own bug report, as it isn't really a fix for the problem that packagers can leave out arbitrary non-required dependencies.
Ok. Opened bug 737779 for the OSX package.