After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 792787 - Raise critical warnings in GUI with stacktrace for bug reporting?
Raise critical warnings in GUI with stacktrace for bug reporting?
Status: RESOLVED FIXED
Product: GIMP
Classification: Other
Component: General
git master
Other Linux
: Normal blocker
: 2.10
Assigned To: GIMP Bugs
GIMP Bugs
: 792980 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2018-01-22 15:50 UTC by Jehan
Modified: 2018-02-12 03:58 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Upon receiving a CRITICAL (169.74 KB, image/png)
2018-01-25 18:16 UTC, Jehan
Details
Upon a FATAL error (164.10 KB, image/png)
2018-01-25 18:19 UTC, Jehan
Details

Description Jehan 2018-01-22 15:50:30 UTC
I have been cleaning a bit our core from assert() code. Basically the core should not directly crash without a warning if possible. People creating with GIMP may have a lot of unsaved work.
This is slightly more acceptable for plug-ins though.

The only advantage of a crash is that it can't be missed, and fortunately one may be able to remember what one was doing, reproduce it and report a bug.
With g_return_*if_fail() code, unless you run GIMP in a terminal, you'd miss the issue. Worse if this ever leaves GIMP in an inconsistent state, there can be problems later on.

This is why I am thinking we should add a handler on at least FATAL and CRITICAL (but even WARNING, why not) to raise a GUI dialog saying there was an internal problem (and exposing the error message), that a bug should be reported giving steps for reproduction. Also that GIMP may have been left in an inconsistent state.

Ideally we could also display a stacktrace in the GUI.
For FATAL errors, we already run g_on_error_stack_trace () but that is once again only in terminal. That should not be too hard to redirect the output into a string to show in a GUI.
FATAL error would still crash but at least people will have a stacktrace to directly report. CRITICAL errors still won't crash but now they will be visible to people for bug reporting.
Comment 1 Jehan 2018-01-22 15:52:09 UTC
P.S.: for Windows, we now have this DrMingW thing which already outputs a stacktrace (upon crashes only, not criticals, but maybe that is possible too). So I guess this feature could even help to implement for this platform.
Comment 2 Jehan 2018-01-22 15:53:17 UTC
Setting this for myself as a blocker for 2.10. That could really improve the development if every people out there could just open reports with backtrace when this applies.
I'll look into it.
Comment 3 Michael Natterer 2018-01-22 17:48:24 UTC
Those asserts are mostly in constructed() for the simple purpose of
checking absolute invariants. If any of them is triggered, we are
doomed anyway. You can't just return from these functions, the code
will almost definitely crash later, just much weirder and harder
to debug. Removing the asserts doesn't make things more stable
or less likely to crash.

About a (semi-) automatic stack trace for warnings and criticals,
good idea.
Comment 4 Jehan 2018-01-25 18:16:52 UTC
Created attachment 367436 [details]
Upon receiving a CRITICAL

(In reply to Michael Natterer from comment #3)
> Those asserts are mostly in constructed() for the simple purpose of
> checking absolute invariants. If any of them is triggered, we are
> doomed anyway. You can't just return from these functions, the code
> will almost definitely crash later, just much weirder and harder
> to debug. Removing the asserts doesn't make things more stable
> or less likely to crash.

I know. But at least if instead of crashing immediately, we raise a GUI saying "something is really wrong, GIMP is unstable, please save and restart", it gives the chance to everyone to actually save your work if needed, rather than just crashing and byebye your hour of unsaved work!

Also this way, we are giving people the opportunity to create very easily a bug report with meaningful information (something they may not be aware was possible since not everybody seems to understand the concept of Free Software).

> About a (semi-) automatic stack trace for warnings and criticals,
> good idea.

Yep I have been working on it since the other day I had the idea.
Attached is a screenshot showing my debug system when receiving a critical error (typically anything like g_return_*() or g_critical()).

The idea is to show a text saying that you can report a bug, and in the scrolled area below is just text (staying in English, unlike the upper part) giving the information details of GIMP (same as `gimp -v`) and a stack trace of the critical warning.
Comment 5 Jehan 2018-01-25 18:19:16 UTC
Created attachment 367438 [details]
Upon a FATAL error

And the system I am building also catches FATAL errors (crashes!) and raises the same dialog. So it would be extremely easy to report crashes.

Note that on fatal errors, I add the "Restart GIMP" button to directly restart (in this case, your data is doomed, there is nothing which can be done, it's too late).
Comment 6 Jehan 2018-01-25 18:23:50 UTC
See branch wip/bug792787-debug-stacktrace-GUI on the repository which I just pushed.

I still have a bit more to do. I want to at least have a preferences option to allow people to opt-out of these stacktrace generations.

Also I want to add support for LLDB (apparently more common on macOS), and maybe look at what DrMinGW can do to see if we can also have the same thing on Windows.
Comment 7 Michael Natterer 2018-01-25 20:02:53 UTC
There is no way to "recover" or even "warn" for these asserts()
in constructed(). They are there to enforce invariants and will be
triggered during development. Most of them already have a
g_return_if_fail() counterpart in the respective class' new()
function.
Comment 8 Jehan 2018-01-25 20:11:56 UTC
(In reply to Michael Natterer from comment #7)
> There is no way to "recover" or even "warn" for these asserts()
> in constructed(). They are there to enforce invariants and will be
> triggered during development. Most of them already have a
> g_return_if_fail() counterpart in the respective class' new()
> function.

When they are assert(), no indeed you can't recover from them. But when they are g_return*(), you can at least warn that something is really wrong and leave the time to save before closing. You don't "recover" a sane GIMP, but you can save the work.

Which is why I was transforming assert() into g_return*() and now what we can do is catch the g_return*() to generate backtrace. Hence you avoid the "crash later and hard to debug" problem since you get immediate backtrace at the first issue.

I know that works, the code in my branch already does work globally ok doing so.
Comment 9 Michael Natterer 2018-01-26 10:26:00 UTC
No you can't save your work because the very core object system is
corrupt. The code *will* crash, there is no "let's try to save".

I know exactly where to put an assert() and where not. You can miss
a warning during development, but you can't miss an assert. The chances
of such a crash getting shipped are *tiny*, and I did catch quite some
object model bugs with these asserts during development.

Note that I think your branch is an excellent idea, for all other
warnings. But these are not warnings about "some parameter is wrong"
(preconditions), these are *invariants*
Comment 10 Jehan 2018-01-26 12:45:48 UTC
Ok I see what you mean now. Even when we had assert() on undo objects for instance? This will still crash if we try to save?
Anyway you know better than I do of course, and now I understood what you meant. Should I revert my assert() changes?
Comment 11 Michael Natterer 2018-01-26 15:10:10 UTC
There is a middle ground :) And no *guarantee* that things are not
emergency-savable any longer. And good point on the undo objects, they
most likely don't affect saving, unless there is some larger-scale
object corruption.

What about this (already done locally):

g_assert() -> gimp_assert() (right now simply a #define)

and in there, simply try and save all images, and/or pop your
new backtrace dialog, or whatever.

This way there is at least a chance of not losing the data?
Comment 12 Jehan 2018-01-26 15:24:51 UTC
> and in there, simply try and save all images, and/or pop your new backtrace dialog, or whatever.

We should not save the images ourselves though. That happens sometimes you do something but absolutely don't intend to have it saved; very often for instance when I resize before an export (for web or whatever, until we can improve the export dialog), and I certainly would resent an autosave at this time losing the high reso!

Actually if we think there is a risk of failed save, this can be worse than unsaved work if we corrupt the file (hence lose much more than just since last save!). Maybe what could happen then is: autosave but in a .bak.xcf file or something, warning the user that we *tried* to save and to check out the .bak.xcf file after GIMP restart.

And of course the backtrace dialog and all stays the same.
Comment 13 Michael Natterer 2018-01-26 16:48:03 UTC
We would of course save to recovery files, but that would also require
checking for such files on startup and offering recovery :)
Comment 14 Jehan 2018-01-28 00:29:05 UTC
*** Bug 792980 has been marked as a duplicate of this bug. ***
Comment 15 Jehan 2018-01-28 03:25:56 UTC
So I went further:

- I added support for LLDB. As far as I understood this is more common especially on macOS. So now the debug system uses GDB if available and fallbacks to GDB.

- I added Windows support by hacking the Dr MinGW support which was added recently. Basically the feature for generating backtraces upon crashes was already there, but quite hidden. Who know that backtraces are generated in some folder somewhere but people following closely the development?
Now my system just read the file contents and display it on a crash in the same backtrace dialog, allowing to copy to clipboard, open the browser to our bugzilla, and restart GIMP.

Note: this means on Windows, the backtrace dialog only appear for definitive crashes (on Unix-like system, it appears also for critical errors).

For info: this is tested under Windows in a VM and works fine.

- I created a separate preferences page to allow people to disallow debugging. This page won't be visible on Windows (maybe later if we have other features then) though since Dr MinGW API does not allow disabling backtracing.

What could still be done:

- Check how to plug this system into plug-ins. I saw there was separate code for plug-in backtracing.

- Autosave (attempt) feature as a .bak.xcf upon crash as discussed previously.

- The gimp_assert() you proposed.

Should I just merge this or should I continue to work in a branch? Could you review maybe?

There is a bit of questionable code, as in any important new code, but I think the one you will not like is that I removed the GIMP_APP_GLUE_COMPILATION check in app/version.h (see commit 287dc3c494ca244bacdbc816e981c599e01a8021). The reason was that my GUI wanted to also grab the GIMP version string so that people will just give us all the details about their GIMP build without even having to ask.

Maybe we could just rename this file as app/gimp-version.h and just make it officially a file which subdir code is allowed to include.
Comment 16 Jehan 2018-01-28 16:43:07 UTC
For info, the branch has been merged.

It has been decided to delay adding similar debugging for plug-ins (though it would be easy to do), first because even some core plug-ins are a mess (I'm thinking of the metadata editors which goes crazy and outputs dozens of warnings and criticals), but even more because people would report bugs for third-party plug-ins as well. We'll need a standard way to distinguish between core and third-party plug-ins (may come with my plug-in improvements planned later on!).

Also pippin was proposing to use `man backtrace` as a third fallback on Linux (I cite: "it is what gegl uses for the advanced mode of the buffer leaks, using backtrace() to capture traces, and backtrace_symbols to get the string names for the corresponding trace afterwards". GDB and LLDB would still be preferred because when debug symbols are installed, they have more info (like line numbers). But that is a good additional fallback which would always work on Linux without adding a dependency!
Comment 17 Piotr Drąg 2018-01-28 20:15:10 UTC
https://git.gnome.org/browse/gimp/commit/?id=ee6e981c042f1394ba93a071ddf5853ba6912a5e made “Copy bug information” and “Open bug tracker” buttons untranslatable.
Comment 18 Michael Schumacher 2018-01-29 08:51:16 UTC
We got a build failure on build.gimp.org with this as the main reason:

../app/gimp-version.c:35:25: fatal error: git-version.h: No such file or directory
 #include "git-version.h"

Is this related to changes here?
Comment 19 Jehan 2018-01-29 19:21:10 UTC
(In reply to Michael Schumacher from comment #18)
> We got a build failure on build.gimp.org with this as the main reason:
> 
> ../app/gimp-version.c:35:25: fatal error: git-version.h: No such file or
> directory
>  #include "git-version.h"
> 
> Is this related to changes here?

Yeah this was because this is now needed in tools/ as well (for the separate debug tool used when we crash), and this is a generated file in app/, hence generated later.
I moved git-version up one folder (in the root) in commit 44f23bdf0c53c997234cbeae9ea61107ab885e05. We usually don't have any code file in the root. Yet since this is a header describing the git repository itself, I figured this is an acceptable exception amongst generated code files. With config.h, this is the second generated header file in the root.
Comment 20 Jehan 2018-01-29 19:37:39 UTC
(In reply to Piotr Drąg from comment #17)
> https://git.gnome.org/browse/gimp/commit/
> ?id=ee6e981c042f1394ba93a071ddf5853ba6912a5e made “Copy bug information” and
> “Open bug tracker” buttons untranslatable.

Yep. I actually think they were "translatable", but not auto-extractable. But since translators are not going to add these strings manually, that's about the same. Just nitpicking. :-)

I must have fixed the issue with commit 42e9ddd4dde81c1a648a1e989b5f43e79598ace1. Tell me if there is still any problem. :-)
Comment 21 Massimo 2018-02-01 06:22:05 UTC
I tried to use this new dialog. Knowing that closing GIMP
while the "Configure input devices" dialog is open prints
many CRITICALS, I did it with a newly built GIMP, but the
new dialog, empty, is shown for a little and then closed.

On the console the output is:

> (gimp-2.9:31042): GLib-GObject-WARNING **: 07:18:23.412: invalid (NULL) pointer instance
>
> (gimp-2.9:31042): GLib-GObject-CRITICAL **: 07:18:23.412: g_signal_handlers_disconnect_matched: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
>
> (gimp-2.9:31042): GLib-GObject-WARNING **: 07:18:23.412: invalid (NULL) pointer instance
>
> (gimp-2.9:31042): GLib-GObject-CRITICAL **: 07:18:23.412: g_signal_handlers_disconnect_matched: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
>
> (gimp-2.9:31042): GLib-GObject-WARNING **: 07:18:27.770: invalid (NULL) pointer instance
>
> (gimp-2.9:31042): GLib-GObject-CRITICAL **: 07:18:27.771: g_signal_handlers_disconnect_matched: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
>
> (gimp-2.9:31042): GLib-GObject-WARNING **: 07:18:27.771: invalid (NULL) pointer instance
>
> (gimp-2.9:31042): GLib-GObject-CRITICAL **: 07:18:27.771: g_signal_handlers_disconnect_matched: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
Comment 22 Massimo 2018-02-01 11:40:25 UTC
Apparently, this dialog is not used for GEGL CRITICALS.

For example, when closing a Cage transform polygon that is completely
outside the image rectangle, on the console there is a stream of

(gimp-2.9:1082): GEGL-CRITICAL **: 12:38:14.584: gegl_buffer_get: assertion 'GEGL_IS_BUFFER (buffer)' failed

But the pop-up dialog is not opened
Comment 23 Jehan 2018-02-01 14:21:09 UTC
Yeah currently in the code, it is used only for Gimp-* (and GIMP core) criticals. I thought about this limitation as well. I will probably update the code to make it be raised on any domain's critical.
Comment 24 Jehan 2018-02-11 00:55:57 UTC
Mitch > will you push your gimp_assert() commit? :-)
The system can most likely be improved (actually I plan to improve slowly with time; I already have ideas of what), but the base system is there. The bug report can most likely be closed as soon as you push your gimp_assert().

As you say, for the time being, this can even just be a #define which calls g_assert(). This will be caught by the debugging system too.
Comment 25 Jehan 2018-02-11 23:26:25 UTC
Ok I saw that Mitch pushed his gimp_assert().
I will close this report as FIXED. The main feature is here. I will still improve it as we go (before and after 2.10). I have a lot more to go. But the "blocker" I gave myself is now gone and I'm happy with the current state. :-)

commit 539927ebfab6a7308aacaa92757546312eea026c
Author: Michael Natterer <mitch@gimp.org>
Date:   Sun Feb 11 22:23:10 2018 +0100

    app: replace all g_assert() by the newly added gimp_assert()
    
    which is just a #define to g_assert for now, but can now easily be
    turned into something that does some nicer debugging using our new
    stack trace infrastructure. This commit also reverts all constructed()
    functions to use assert again.

 [… snip too many files changed to list them all …]
 104 files changed, 213 insertions(+), 211 deletions(-)
Comment 26 Jehan 2018-02-12 03:58:00 UTC
(In reply to Massimo from comment #22)
> Apparently, this dialog is not used for GEGL CRITICALS.
> 
> For example, when closing a Cage transform polygon that is completely
> outside the image rectangle, on the console there is a stream of
> 
> (gimp-2.9:1082): GEGL-CRITICAL **: 12:38:14.584: gegl_buffer_get: assertion
> 'GEGL_IS_BUFFER (buffer)' failed
> 
> But the pop-up dialog is not opened

For information, I am improving the debug dialog to include GEGL WARNINGs and CRITICALs (I still have a commit to push, but it's nearly there) and I have opened a bug report for the error you pointed, with appropriate backtraces: bug 793371.