GNOME Bugzilla – Bug 347857
advanced printing example crashes
Last modified: 2006-08-07 18:49:39 UTC
(1) PrintFormOperation lifetime
With the silly code where ExampleWindow keeps the sigc connections and disconnects them after the operation has run, the behaviour is (I've been printing to PDF in all cases):
- it prints once successfully
- PFO is destroyed
- segfaults when trying to open the print dialog once again
gdb backtrace:
0x00507385 in Gtk::PrintOperation::set_print_settings (this=0x956ca48, print_settings=@0xbfbd64f0) at ../../gtk/gtkmm/printsettings.h:218
218 GtkPrintSettings* gobj() { return reinterpret_cast<GtkPrintSettings*>(gobject_); }
(gdb) bt
+ Trace 69404
The line where it happens is priv->start_page (op, print_context, page_setup); It's not possible to step in, but anyway, all these objects seem to be correct. Yet maybe (spot the question marks):
(gdb) print *priv
$1 = {action = GTK_PRINT_OPERATION_ACTION_PREVIEW, status = GTK_PRINT_STATUS_PREPARING, error = 0x0, status_string = 0x9b6eca8 "Preparing to print", default_page_setup = 0x9ae0160, print_settings = 0x9ad3630, job_name = 0x9b4fdc0 "lt-example job #1", nr_of_pages = -1 ?????, current_page = -1 ?????, unit = GTK_UNIT_PIXEL, export_filename = 0x0, use_full_page = 0, track_print_status = 1, show_progress = 0, cancelled = 0, allow_async = 0, is_sync = 1, print_pages_idle_id = 0, show_progress_timeout_id = 0, print_context = 0x9b24508, print_pages = GTK_PRINT_PAGES_ALL, page_ranges = 0x0, num_page_ranges = 0, manual_num_copies = 0, manual_collation = 0, manual_reverse = 0, manual_orientation = 0, manual_scale = 0, manual_page_set = GTK_PAGE_SET_ALL, custom_widget = 0x0, custom_tab_label = 0x0, platform_data = 0x0, free_platform_data = 0, rloop = 0x0, start_page = 0, end_page = 0, end_run = 0}
Btw set_n_pages() from PFO::on_begin_print() doesn't get called...
I think we should first clear up the relationship between PrintOperation and the "spawned" PrintOperationPreview.
Ok, that's it for today, it's way past 2am.
Thanks for writing stuff down. I haven't read all this yet, but I'll reply quickly to the last part:
> I think we should first clear up the relationship between PrintOperation and the "spawned" PrintOperationPreview
It's the same object. The object sends (in the signal) a pointer to its own base class.
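To illustrate what "the same object" means here, this is a minimal sketch in plain C++ with no gtkmm dependency. The names Preview, Operation, on_preview and run_preview are hypothetical stand-ins, not the gtkmm API: the operation derives from the preview interface and hands the handler a pointer to itself, upcast to that base class.

```cpp
#include <functional>

// Hypothetical stand-in for the preview interface (not the gtkmm class).
struct Preview {
    virtual ~Preview() = default;
    virtual void render_page(int page_nr) = 0;
};

// Hypothetical stand-in for the operation, which derives from the interface.
struct Operation : Preview {
    int last_rendered = -1;
    std::function<bool(Preview*)> on_preview;  // plays the role of the "preview" signal

    void render_page(int page_nr) override { last_rendered = page_nr; }

    // Emits the "signal", passing itself upcast to the Preview base class.
    bool run_preview() { return on_preview ? on_preview(this) : false; }
};

// Returns true when the Preview handed to the handler is the operation itself
// and a call through the base pointer reaches the operation's own render_page().
bool preview_is_same_object() {
    Operation op;
    bool same = false;
    op.on_preview = [&](Preview* preview) {
        same = (dynamic_cast<Operation*>(preview) == &op);
        preview->render_page(3);
        return true;
    };
    op.run_preview();
    return same && op.last_rendered == 3;
}
```

So anything the handler does through the preview pointer acts on the state of the operation that emitted the signal; there is no second, separately initialised object.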
The problem related to PrintSettings is solved now: 2006-07-20 Marko Anastasov <marko@marko.anastasov.name> * gtk/src/printoperation.hg: Added the forgotten refreturn for get_print_settings(), which fixes a part of #347857.
So, do the examples crash now?
Yes - (2) is still to be solved (the simple example doesn't crash at all, as it has none of the parts that cause problems here). Here's what's left from (1) (without the manual disconnecting):
- open the print dialog, touch the font button, print
- repeat, with or without the font button
They print fine, PrintFormOperations are not yet destroyed, and on exit there's a segfault, like previously:
0x009741e4 in sigc::internal::slot_rep::disconnect (this=0x9428df0) at functors/slot_base.cc:50
50 (cleanup_)(data_); // Notify the parent (might lead to destruction of this!).
(gdb) bt
+ Trace 69496
However, it does not segfault if I don't touch the font button (?), I'll see what this is about. With connections and manual disconnecting, PFOs get destroyed immediately and there are no problems.
Let's take out the manual disconnecting. I don't understand what it's meant to achieve.
With or without the manual disconnecting, I get this from valgrind when choosing File/Print Preview (note that line numbers might be different in my version):
==11877==
==11877== Jump to the invalid address stated on the next line
==11877== at 0x0: ???
==11877== by 0x427E86A: Gtk::PrintOperationPreview::render_page_vfunc(int) (printoperationpreview.cc:277)
==11877== by 0x427DE43: Gtk::PrintOperationPreview_Class::render_page_vfunc_callback(_GtkPrintOperationPreview*, int) (printoperationpreview.cc:100)
==11877== by 0x4559881: gtk_print_operation_preview_render_page (gtkprintoperationpreview.c:106)
==11877== by 0x427E7B6: Gtk::PrintOperationPreview::render_page(int) (printoperationpreview.cc:249)
==11877== by 0x8054E16: PreviewDialog::on_expose_event(_GdkEventExpose*) (previewdialog.cc:90)
==11877== by 0x42D5253: Gtk::Widget_Class::expose_event_callback(_GtkWidget*, _GdkEventExpose*) (widget.cc:4453)
==11877== by 0x4527FB7: _gtk_marshal_BOOLEAN__BOXED (gtkmarshalers.c:83)
==11877== by 0x4ABF88E: g_type_class_meta_marshal (gclosure.c:567)
==11877== by 0x4ABFEBE: g_closure_invoke (gclosure.c:490)
==11877== by 0x4ACF10D: signal_emit_unlocked_R (gsignal.c:2476)
==11877== by 0x4ACFFC5: g_signal_emit_valist (gsignal.c:2207)
==11877== by 0x4AD05C8: g_signal_emit (gsignal.c:2241)
==11877== by 0x462FB96: gtk_widget_event_internal (gtkwidget.c:3901)
==11877== by 0x4526CFD: gtk_main_do_event (gtkmain.c:1379)
==11877== by 0x4754509: gdk_window_process_updates_internal (gdkwindow.c:2324)
==11877== by 0x47545DC: gdk_window_process_all_updates (gdkwindow.c:2387)
==11877== by 0x449F1A4: gtk_container_idle_sizer (gtkcontainer.c:1113)
==11877== by 0x4B20F01: g_idle_dispatch (gmain.c:3924)
==11877== by 0x4B1EBCC: g_main_context_dispatch (gmain.c:2043)
==11877== Address 0x0 is not stack'd, malloc'd or (recently) free'd
And I get much the same from a gdb backtrace. It's as if base->render_page is somehow not valid, though we check it for null.
I added the manual disconnecting when I noticed the segfault on program exit and thought it was because ExampleWindow is destroyed before the PrintFormOperations. I wanted the PFOs to be destroyed immediately after they're done and not on program exit, so it's more like an experiment. It's something which rarely needs to be done, I know, but (only?) that way the program doesn't segfault on exit after more than one PFO. But that does not affect the previewing stuff; it has a much lower priority now, so let's move on.
I get the same output in valgrind, but I think it's missing a bit - see my gdb backtrace in the first post (the segfault is in GTK's common_render_page()), you should get the same. But that's just the diagnosis. When I got a segfault while writing the first test, I could see somewhere in GTK code that I hadn't created a cairo context for a print context or similar - something was null, so I could tell. I don't know what that means here.
I was comparing our priv object (GtkPrintOperationPrivate) that gets in common_render_page() with the one in gtk+/tests/print-editor.c . The difference: - us: rloop = 0x0, start_page = 0, end_page = 0, end_run = 0 - them: rloop = 0x96b2400, start_page = 0xa9e1f0 <preview_start_page>, end_page = 0xa9dff0 <preview_end_page>, end_run = 0xa9eea0 <preview_end_run> For instance, preview_start_page() is at gtkprintoperation.c: 233 and it emits got-page-size.
Look in gtkprintoperation.c: print_pages() (line 2185): segfault occurs after the preview signal is emitted of course (l. 2229), but before these callbacks are set in the lines that follow. The preview signal is not handled well, too much happens. In print-editor, drawable's expose_event, where for us render_page() fails, is handled after those callbacks are set. In our case, everything goes after preview is emitted, but not everything is ready yet.
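The ordering problem described above can be modelled without GTK+. The following is only a sketch under stated assumptions: OpPrivate, Operation, idle_queue and the helper functions are hypothetical, and the real print_pages() does much more. It shows why rendering from inside the preview handler finds a null start_page callback, while work deferred until after the handler returns finds it set. The model returns false where the real code would jump to address 0x0.

```cpp
#include <functional>
#include <vector>

// Hypothetical model of the private state; only the callback slot matters here.
struct OpPrivate {
    void (*start_page)(int) = nullptr;
};

static int g_pages_started = 0;
static void preview_start_page(int) { ++g_pages_started; }

struct Operation {
    OpPrivate priv;
    std::function<void(Operation&)> on_preview;     // stands in for the "preview" signal
    std::vector<std::function<void()>> idle_queue;  // stands in for the main loop

    void print_pages() {
        if (on_preview) on_preview(*this);      // 1. "preview" is emitted
        priv.start_page = preview_start_page;   // 2. callbacks are set only afterwards
        for (auto& job : idle_queue) job();     // 3. deferred work runs later
        idle_queue.clear();
    }

    // Returns false instead of calling through a null function pointer.
    bool render_page(int page_nr) {
        if (!priv.start_page) return false;
        priv.start_page(page_nr);
        return true;
    }
};

// Rendering inside the handler: the callbacks are not installed yet.
bool render_during_preview() {
    Operation op;
    bool ok = true;
    op.on_preview = [&ok](Operation& o) { ok = o.render_page(0); };
    op.print_pages();
    return ok;
}

// Deferring the rendering until after the handler has returned.
bool render_after_preview() {
    Operation op;
    bool ok = false;
    op.on_preview = [&ok](Operation& o) {
        Operation* self = &o;
        o.idle_queue.push_back([self, &ok] { ok = self->render_page(0); });
    };
    op.print_pages();
    return ok;
}
```

In this model render_during_preview() fails and render_after_preview() succeeds, which matches the observation that print-editor handles its expose event only after the callbacks are in place.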
I made some minor changes and commented out some of the advanced example, to make it clearer where the problem is. In bool PrintFormOperation::on_preview(), I noticed that m_refLayout is null, probably because on_begin_print() has never been called (though it's an override of the default signal handler). I guess that it would be bad to give a null m_refLayout to the Preview dialog.
Created attachment 70035 [details] [review] A patch that fixes all remaining problems described above, but also makes some new
Resizing the window causes a segfault:
+ Trace 69870
Also on my computer I can't print to file and so do a preview in the simple example: cairo-array.c:206: _cairo_array_index: Assertion `0 <= index && index < array->num_elements' failed.
So, the fix seems to be all about refing the PaperSize, but it's hard to read the patch. Please feel free to apply it, but please do keep the _CONVERSION()s in the .hg files, because they are unusual.
Also, those refs in the PaperSize _CONVERSIONS() probably don't belong in the .m4 file. If it's for a get* method then that method should just have a refreturn option instead.
My initial idea that a (blocking) dialog should be put up in on_preview() was wrong. The purpose of on_preview() is essentially to set a cairo context on the provided PrintContext. Asynchrony also applies here: once that is done, on_preview() should return (true). on_begin_print() and the rest get called after that. So now PreviewDialog is actually a Window, and PrintFormOperation keeps a pointer to it. Moreover, I noticed that I didn't even connect to the given PrintOperationPreview's ready and got_page_size signals.
Another mistake was that I wrote PreviewDialog::on_expose_event() and thus overrode the window's virtual expose_event handler. Renaming it to on_drawing_area_expose_event resolved this unwanted behaviour. I had to make PreviewDialog keep a pointer to PrintFormOperation just to be able to access the Pango::Layout it sets. With the initial approach of keeping a copy of it in PD, a segfault occurs later when it's needed in on_got_page_size(); the pointer is somehow null, IIRC.
Regarding the conversions: I know it's weird to make reftaking the default, but in fact in both cases, for PrintContext and PaperSize, all conversions that exist appear to demand reftaking. For PrintContext it's in printoperation(preview).hg and for PaperSize it's in pagesetup.hg and printsettings.hg. I'm sure for PaperSize that unless I put the ref in convert_gtk.m4, valgrind complains a lot about uninitialized values (in on_got_page_size()) and later a segfault occurs, a double delete of a PaperSize object as a member of PrintSettings (on an attempt to unref it somewhere in gtk code). That's *with* overriding CONVERSIONs for all get_* functions.
Anyway, I don't see how these changes to *.m4 and the hg files can affect print-to-pdf behaviour. I removed all my locally modified files and rebuilt gtkmm with example(s) and there's still a problem. Please can you check this again, with current gtk and cairo.
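The on_expose_event naming mistake is easy to reproduce in isolation. Here is a small sketch in plain C++ with hypothetical classes (Window, ShadowingPreview, RenamedPreview), not the real gtkmm types: a member function whose name and signature match the base class's virtual default handler replaces that handler, whether or not that was the intention, while a differently named function leaves it alone.

```cpp
#include <string>

// Hypothetical stand-in for a toolkit window with a virtual default handler.
struct Window {
    virtual ~Window() = default;
    virtual std::string on_expose_event() { return "window default"; }
    // What the toolkit would call when the window itself needs redrawing.
    std::string expose() { return on_expose_event(); }
};

// Same name and signature as the virtual: replaces the window's own handler.
// The override keyword is optional; omitting it gives the same behaviour.
struct ShadowingPreview : Window {
    std::string on_expose_event() override { return "drawing area"; }
};

// Distinct name: the window's own handler is untouched, and this function
// runs only when it is explicitly connected to the drawing area's signal.
struct RenamedPreview : Window {
    std::string on_drawing_area_expose_event() { return "drawing area"; }
};

std::string expose_shadowing() { ShadowingPreview w; return w.expose(); }
std::string expose_renamed()   { RenamedPreview w;   return w.expose(); }
```

With the shadowing version, every expose of the window itself runs the code meant for the drawing area; with the renamed version, the window keeps its default behaviour.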
Excellent detective work. On behalf of gtkmm users everywhere I thank you. Please check in what you have, but attach it as a patch here so I can easily see what's changed. Then I'll investigate again.
Created attachment 70071 [details] [review] Committed changes. ChangeLog entry is also visible in the patch. I have managed to keep the reftaking _CONVERSIONs for PrintContext in the hg files. Not for PaperSize.
So, I now see a window when I choose File/Preview. It shows a spin button and a close button and nothing else.
This is how it's supposed to look: http://marko.anastasov.name/Screenshot-Preview.png .
I made some changes and fixed some crashes that valgrind found. I now see the text, as in the screenshot, and it doesn't crash for me at all. However, it does not open the preview dialog a second time.
The preview window also works for me with the current CVS version (it looks like Screenshot-Preview.png). Reopening the preview dialog a second time only works for me when I close the preview window via the close button next to the spin button. When I close it via the X in the top-right corner (delete_event?) it does not show up again. Furthermore, "DEBUG: print status: Finished" is not printed to the terminal, as it is when the dialog is closed through the close button. The application also does not terminate correctly when the main window is closed if the preview window has not been closed via the close button (or is still being shown). The main window disappears, but the program is still running. Overriding on_delete_event in PreviewDialog and calling m_refPreview->end_preview() in there seems to help. When I try to print to PDF, some cairo assertion fails and aborts the program (as already mentioned in comment #11). This all applies to my amd64 machine.
Adding to Armin's comments about the preview only showing up again if you use the "Close" button rather than the window manager close button: if you do close the preview dialog with the window manager and then try to close the main window, the application doesn't exit -- I have to kill it with Ctrl-C. Attempting to print to PDF results in this crash for me:
** (lt-example:569): DEBUG: starting with PRINT_DIALOG
** (lt-example:569): DEBUG: pfo ctor
[Thread debugging using libthread_db enabled]
[New Thread 46912552443056 (LWP 569)]
[New Thread 1074006368 (LWP 577)]
[New Thread 1082399072 (LWP 578)]
[New Thread 1090791776 (LWP 579)]
[New Thread 1099184480 (LWP 580)]
[New Thread 1107577184 (LWP 581)]
[New Thread 1115969888 (LWP 582)]
[New Thread 1124362592 (LWP 583)]
[New Thread 1132755296 (LWP 585)]
[Thread 1132755296 (LWP 585) exited]
[Thread 1124362592 (LWP 583) exited]
[Thread 1115969888 (LWP 582) exited]
[Thread 1082399072 (LWP 578) exited]
[Thread 1090791776 (LWP 579) exited]
[Thread 1099184480 (LWP 580) exited]
[Thread 1107577184 (LWP 581) exited]
[New Thread 1082399072 (LWP 586)]
[New Thread 1099184480 (LWP 587)]
[New Thread 1090791776 (LWP 588)]
[Thread 1090791776 (LWP 588) exited]
[Thread 1082399072 (LWP 586) exited]
[Thread 1099184480 (LWP 587) exited]
[New Thread 1099184480 (LWP 591)]
[New Thread 1082399072 (LWP 592)]
[New Thread 1090791776 (LWP 594)]
[New Thread 1107577184 (LWP 595)]
[New Thread 1115969888 (LWP 596)]
[New Thread 1124362592 (LWP 597)]
[New Thread 1132755296 (LWP 598)]
[New Thread 1141148000 (LWP 599)]
[New Thread 1149540704 (LWP 600)]
[New Thread 1157933408 (LWP 601)]
** (lt-example:569): DEBUG: PrintFormOperation::on_begin_print
[Thread 1099184480 (LWP 591) exited]
[Thread 1082399072 (LWP 592) exited]
[Thread 1090791776 (LWP 594) exited]
[Thread 1107577184 (LWP 595) exited]
[Thread 1115969888 (LWP 596) exited]
[Thread 1124362592 (LWP 597) exited]
[Thread 1132755296 (LWP 598) exited]
[Thread 1141148000 (LWP 599) exited]
[Thread 1157933408 (LWP 601) exited]
[Thread 1149540704 (LWP 600) exited]
Program received signal SIGSEGV, Segmentation fault.
+ Trace 69927
Thread 46912552443056 (LWP 569)
That's a deep call stack. I'm also running on amd64.
Created attachment 70103 [details] valgrind log when attempting to print
Here's a valgrind log captured when trying to print to PDF. Command & output:
$ G_DEBUG=gc-friendly valgrind --tool=memcheck --leak-check=full -v --log-file=printing ./examples/book/printing/advanced/.libs/lt-example
** (lt-example:931): DEBUG: starting with PRINT_DIALOG
** (lt-example:931): DEBUG: pfo ctor
** (lt-example:931): DEBUG: PrintFormOperation::on_begin_print
Killed
> Overriding on_delete_event in PreviewDialog and calling
> m_refPreview->end_preview() in there seems to help.
If we need to do that, we should do it in on_hide(), so it works for both close buttons.
> When I try to print to PDF, some cairo assertion fails and aborts the program
This does seem to be happening only on 64-bit machines. Is it happening with the simple example too? Is it happening with gtk+/tests/print-editor too?
> Is it happening with the simple example too?
Yes. The preview does not work in the simple example either, because it tries to print a PDF to show it with evince.
> Is it happening with gtk+/tests/print-editor too?
No. But evince fails to display the resulting pdf file. It shows just an empty page and prints "Error: failed to load truetype font" to the terminal. But this could also be a local problem.
Another problem, probably independent of the crashing PDF file creation, is that examplewindow.cc binds the whole RefPtr to PrintOperation::signal_done() instead of just a reference (or a pointer) to it. This causes the PrintOperation never to be freed, because the signal connection still holds a reference to the PrintOperation. As soon as the window (and thus the connection) is finally destroyed, memory corruption seems to occur, since the destroyed connection unrefs the PrintOperation, which itself wants to destroy the connection in the signal destructor.
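The ownership cycle suggested here can be shown with the standard library alone. This is only an analogy and not the code from examplewindow.cc: std::shared_ptr stands in for Glib::RefPtr, a vector of std::function stands in for the signal, and Operation, done_handlers and g_live_operations are hypothetical names. A handler that is stored on the object and holds an owning pointer to that same object keeps it alive indefinitely, while a non-owning handle does not.

```cpp
#include <functional>
#include <memory>
#include <vector>

static int g_live_operations = 0;

// Hypothetical stand-in for a reference-counted operation with a "done" signal.
struct Operation {
    std::vector<std::function<void()>> done_handlers;
    Operation()  { ++g_live_operations; }
    ~Operation() { --g_live_operations; }
};

// Binds the owning pointer into a handler stored on the object itself.
// Returns how many operations are still alive after the last external
// reference has gone out of scope.
int live_after_owning_bind() {
    const int before = g_live_operations;
    {
        auto op = std::make_shared<Operation>();
        op->done_handlers.push_back([op] { (void)op; });  // handler owns op
    }
    return g_live_operations - before;
}

// Binds a non-owning handle instead, so dropping the last external
// reference destroys the operation together with its handlers.
int live_after_weak_bind() {
    const int before = g_live_operations;
    {
        auto op = std::make_shared<Operation>();
        std::weak_ptr<Operation> weak = op;
        op->done_handlers.push_back([weak] {
            if (auto strong = weak.lock()) { (void)strong; /* use *strong */ }
        });
    }
    return g_live_operations - before;
}
```

In this model the owning bind leaves one operation alive that nothing can reach any more, and the weak bind leaves none. Whether the same cycle fully explains the crash on exit in the example is a separate question that the model does not answer.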
(In reply to comment #23)
> > Overriding on_delete_event in PreviewDialog and calling
> > m_refPreview->end_preview() in there seems to help.
>
> If we need to do that, we should do it in on_hide(), so it works for both close
> buttons.
Fixed in CVS.
> > Is it happening with gtk+/tests/print-editor too?
> No. But evince fails to display the resulting pdf file. It shows just an empty
> page and prints "Error: failed to load truetype font" to the terminal. But this
> could also be a local problem.
This sounds very suspicious. I suggest that you mention it on the cairo or gtk mailing list.
> Another problem, probably independent of the crashing PDF file creation, is
> that examplewindow.cc binds the whole RefPtr to PrintOperation::signal_done()
> instead of just a reference (or a pointer) to it. This causes the
> PrintOperation never to be freed, because the signal connection still holds a
> reference to the PrintOperation.
Very clever. I'll figure out something to fix this if you don't first. But I would prefer to avoid storing the connections and disconnecting them explicitly, because that complicates things.
> I'll figure out something to fix this if you don't first.
Binding just a reference or a pointer to the RefPtr should do the trick.
> This sounds very suspicious. I suggest that you mention it on the cairo or gtk mailing list.
I first tried to investigate this a bit further. It worked well when I took a non-standard font (so, neither "Sans" nor "Monospace", but for example "Rudelsberg" worked). There seems to be a relation between this and the PDF crasher bug: when I use the "Rudelsberg" font (or some other) in the simple printing example, it produces a working PDF file instead of triggering that cairo assertion. Can others confirm this?
I can sort of confirm this. Here are my results: "verdana 12": no crash, produced empty PDF "Gentium 12": crash "bitstream charter 12": no crash, produced good PDF
Yes, Rudelsberg worked for me too. "bitstream charter 12" crashed. "bitstream vera" also, but "bitstream vera sans" and "verdana" produced an empty pdf. I think we can't be 100% sure that these pdfs are actually empty, because evince produces many errors like "Error: Couldn't create a font for 'BitstreamVeraSans'". Likewise, I have had the dejavu fonts for a few days now, and I get errors like that from evince even with a pdf created with the GTK+ test program.
Maybe you'd like to open a GTK+ bug so you have a cleaner bug to show to the GTK+ and cairo guys.
I have removed the use of sigc::bind in cvs. I am not convinced that it was a lifetime problem, but it is simpler without it. I see no more problems with the example on my 32-bit system, and the remaining issue looks like a GTK+ or cairo amd64 bug.
Can someone confirm that resizing the preview window doesn't crash on a 64-bit system? It's still crashing for me, and within the GTK+ test program it is not. Maybe even my EM64T Gentoo is influencing this somehow.
Resizing works for me on AMD64. By the way, is the text supposed to resize when the window resizes?
Yes, they should slowly become bigger.
OK, sorry for the slight derail. I just wanted to make sure that was intentional.
> I see no more problems with the example on my 32-bit system, and the remaining issue looks like a GTK+ or cairo amd64 bug.
I just filed a GTK+ bug about it (#349826).
> Can someone confirm that resizing the preview window doesn't crash on a 64-bit system?
Resizing the window works for me, too.
Is there anything more we need to do here? Both Armin and I confirmed that the GTK+ bug #349826 is fixed by an updated cairo. So now both of the printing examples seem to work fine for me. So can we close this bug or are there more outstanding issues yet?
Yes, that's how good we are.