GNOME Bugzilla – Bug 394260
PDF of JPG way too large
Last modified: 2012-04-11 19:27:22 UTC
Hi, PDF files can embed PNG and JPG files as is. This would be a useful feature for the print dialog (not sure if EOG is the right place to report). Printing scanned images to files is infeasible right now. Example:
the original:
    [hanwen@haring ~]$ ls -l ~/makam-scan.jpg
    -rw-r----- 1 hanwen hanwen 662908 Jan 8 15:47 /home/hanwen/makam-scan.jpg
EOG print to PDF file output:
    [hanwen@haring ~]$ ls -l persoonlijk/correspondentie/output.pdf
    -rw-r--r-- 1 hanwen hanwen 23658766 Jan 8 15:56 persoonlijk/correspondentie/output.pdf
jpg -> pdf via LaTeX:
    [hanwen@haring ~]$ ls -l persoonlijk/correspondentie/makam-scan.pdf
    -rw-r--r-- 1 hanwen hanwen 664052 Jan 8 16:10 persoonlijk/correspondentie/makam-scan.pdf
Not sure, but this sounds more to me like a bug in libgnomeprint. We have already migrated eog to use GtkPrint instead of libgnomeprint, so I think, at least in eog, this bug may be fixed. I can't test now but will check as soon as possible.
Ok, I generated two prints of the same image, one with EOG 2.16.2 (libgnomeprint) and the other with EOG 2.17.4 (gtkprint). The file sizes follow:
    claudio@dijkstra:~$ ls -lh dscn3752.jpg
    -rw-rw-rw- 1 claudio claudio 765K 2006-03-10 17:55 dscn3752.jpg
    claudio@dijkstra:~$ ls -lh *.pdf
    -rw-r--r-- 1 claudio claudio 7.3M 2007-01-08 19:56 output-libgtkprint.pdf
    -rw-r--r-- 1 claudio claudio 30M 2007-01-08 19:55 output-libgnomeprint.pdf
Even though I see an improvement, I'm not sure it's optimal. However, we use the high-level GtkPrint API, which uses cairo to render the pages. I don't think we can achieve any improvement on the EOG (or GTK+) front. If anything could be improved, that would need work on the cairo PDF backend. I generated the same print again, this time in PostScript format, and converted it to PDF with ps2pdf. The resulting sizes are:
    -rw-r--r-- 1 claudio claudio 14M 2007-01-08 20:05 output-2.ps
    -rw-r--r-- 1 claudio claudio 216K 2007-01-08 20:06 output-2.pdf
So I have the feeling that there's room for improvement, but on the cairo front. I'm cc'ing Behdad here, as he's probably wiser than me on this matter.
Note that ps2pdf probably introduces some downsampling. PDF files can simply include verbatim JPEG data; see ch. 3, page 60 of the PDF Reference v1.5, 4th ed. (DCTDecode filter).
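To illustrate the DCTDecode point, here is a minimal sketch (not taken from EOG, GtkPrint, or cairo) that wraps JPEG bytes in a one-page PDF without re-encoding them. The function name `jpeg_to_pdf` is made up for this example, and it assumes the caller already knows the image's pixel width, height, and component count (1 for gray, 3 for RGB); it maps one pixel to one PDF point and does no JPEG parsing or validation.

```python
def jpeg_to_pdf(jpeg: bytes, width: int, height: int, components: int = 3) -> bytes:
    """Wrap raw JPEG bytes in a one-page PDF; the bytes are copied verbatim."""
    # Only gray and RGB are handled in this sketch.
    colorspace = {1: "/DeviceGray", 3: "/DeviceRGB"}[components]
    # Page content: scale the unit square to the image size and paint /Im0.
    content = f"q {width} 0 0 {height} 0 0 cm /Im0 Do Q".encode("ascii")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {width} {height}] "
         f"/Resources << /XObject << /Im0 4 0 R >> >> /Contents 5 0 R >>"
         ).encode("ascii"),
        # The image XObject: /Filter /DCTDecode tells the viewer that the
        # stream is JPEG data, so no recompression is needed.
        (f"<< /Type /XObject /Subtype /Image /Width {width} /Height {height} "
         f"/ColorSpace {colorspace} /BitsPerComponent 8 "
         f"/Filter /DCTDecode /Length {len(jpeg)} >>\nstream\n").encode("ascii")
        + jpeg + b"\nendstream",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1, xref_pos)
    return bytes(out)
```

The wrapper adds well under a kilobyte on top of the JPEG, which matches the pattern of the 662908-byte JPEG and the 664052-byte PDF produced via LaTeX in the original report.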
Yeah, this needs some new cairo API at the least. Cairo's PDF backend (and all other backends, FWIW) embeds images as PNG. You know what happens then. One way to fix this is to use a PDF toolkit, like pdftk, to embed the JPG directly and bypass cairo. Not sure how feasible that is with GtkPrint. Anyway, I'll draw attention to this bug on the cairo list. We'll see. Thanks.
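As a rough illustration of "what happens then", the sketch below deflates a synthetic noisy raster with zlib, the same family of lossless compression that PNG uses. It is not cairo's code; the helper name `flate_size_of_noisy_raster` and the gradient-plus-noise data are assumptions standing in for a decoded scan or photo.

```python
import random
import zlib
from typing import Tuple


def flate_size_of_noisy_raster(width: int, height: int,
                               noise: int = 16, seed: int = 0) -> Tuple[int, int]:
    """Return (raw_size, deflated_size) for a synthetic noisy RGB raster."""
    rng = random.Random(seed)
    raw = bytearray()
    for y in range(height):
        for x in range(width):
            # Smooth gradient (below 200) plus per-channel noise, so every
            # value stays within 0..255.
            base = (x + y) * 200 // (width + height)
            for _ in range(3):
                raw.append(base + rng.randrange(noise))
    return len(raw), len(zlib.compress(bytes(raw), 9))


raw_size, packed_size = flate_size_of_noisy_raster(256, 256)
print(raw_size, packed_size)
```

Because the noise cannot be compressed losslessly, the deflated size stays a large fraction of the raw `width * height * 3` bytes, whereas a JPEG of the same picture is typically far smaller. That is consistent with the multi-megabyte PDFs reported above for sub-megabyte JPEG inputs.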
Discussion on this topic has been started on the cairo mailing list. We should keep an eye on it. The thread is here: http://lists.freedesktop.org/archives/cairo/2007-January/009096.html
*** Bug 440382 has been marked as a duplicate of this bug. ***
Just spotted this in the cairo 1.9.2 release announcement:
> API additions:
>
> cairo_surface_set_mime_data()
> cairo_surface_get_mime_data()
>
> Should this take unsigned int, unsigned long or size_t for the length
> parameter? (Some datasets may be >4GiB in size.)
>
> Associate an alternate, compressed, representation for a surface.
> Currently:
> "image/jp2" (JPEG2000) is understood by PDF >= 1.5
> "image/jpeg" is understood by PDF,PS,SVG,win32-printing.
> "image/png" is understood by SVG.
Let's see how this evolves until cairo 1.10. This might be part of the solution here. :-)
Cairo 1.10 has been released. I think we can start cooking a patch for this.
I had it done for JPEG in PDF already. Let's see if I can dig it out again.
Created attachment 169699: proof-of-concept
Found it, and it only needed a small fix to apply again. Obviously it needs some work, as it only works for unmodified and unrotated images. But I guess we can reuse some code from the SVG rendering parts here too.
Hmm, simply applying the transformation matrix to the cairo context doesn't seem to work. Although the image is embedded in the PDF it won't display.
Created attachment 206666: Extended version handling transformated images
Okay, I finally got rotated/flipped images working. It's not the nicest stuff, but it probably won't get any better without changing large parts of eog (e.g. to get the original image size). One problem still prevents inclusion: it doesn't work with autorotated images. :|
Works with autorotation too now.
commit 7029dfe154cf8d1ce4df8fd6f3157b344e0ad9b7
Author: Felix Riemann <>
Date: Wed Apr 11 21:15:39 2012 +0200
    Avoid recompressing JPEGs as PNG when printing
    Use cairo's feature to simply attach the source file data to the printing surface. This reduces the file size of the resulting PDF file pretty much to the source file size.
    https://bugzilla.gnome.org/show_bug.cgi?id=394260
---
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.