GNOME Bugzilla – Bug 772692
Evolution does not show body of messages in view pane - just a graphic block
Last modified: 2016-10-24 20:25:42 UTC
Created attachment 337312 [details] Demonstration of missing content

Description of problem:
Since recent Cauldron updates, Evolution starts again (it was crashing when trying to compose a new message) and looks fine at first, but as I scroll through the messages the message view pane shows only a graphical block with no message contents (see attachment).

Version-Release number of selected component (if applicable):
evolution-data-server-3.22.1-1.mga6
evolution-3.22.1-1.mga6
evolution-debuginfo-3.22.0-1.mga6

How reproducible:
Every time

Steps to Reproduce:
1. Running KDE Plasma desktop - start Evolution
2. Scroll through various email messages
3. See missing message content

It would appear that the "webkitwebprocess" hangs in the background. Killing this process manually makes the message content reappear briefly, then the new "webkitwebprocess" fails again, so it is something with webkit and evolution together!
Also have a bug opened @ Mageia: https://bugs.mageia.org/show_bug.cgi?id=19470
Thanks for the bug report. I agree it's due to webkit2 being stuck on something, which can also be evolution's fault, due to its webkit2 extensions. Please install debuginfo for evolution (not for webkit itself at the moment) and, when you get into this odd state, grab a backtrace of evolution, then of the associated WebKitWebProcess. The command to get it looks like:

$ gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt

Please check bt.txt for any private information, like passwords, email addresses, server addresses,... I usually search for "pass" at least (quotes for clarity only). Thanks in advance.

By the way, what is your version of WebKit2, please? It may be packaged as webkitgtk4, or it is that way in Fedora at least; I do not know how Mageia packages it.
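A minimal sketch of the bt.txt check suggested above, assuming a POSIX shell and grep. The function name scan_private and the pattern list are made up for illustration; only the search for "pass" comes from the comment, the other patterns are guesses at what might reveal addresses or servers.

```shell
# Sketch: list lines of a backtrace file that may contain private data.
# The pattern list is an assumption; extend it for your own setup.
scan_private() {
    # -n prints line numbers, -i ignores case, -E enables alternation.
    grep -n -i -E 'pass|@|imap|smtp' "$1"
}
```

Running `scan_private bt.txt` prints each suspicious line with its line number, so it can be reviewed and redacted before the file is attached. No output means none of the patterns matched, which is not a guarantee that the file is clean.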
Here's the list of webkit related packages I have installed - the debug info I will add as an attachment:

[rfox@foxmain ~]$ rpm -qa | grep webkit
libqtwebkit2.2_4-2.3.4-8.mga6
lib64qtwebkit2.2_4-2.3.4-8.mga6
webkit2-2.14.0-2.mga6
lib64webkit-gir3.0-2.4.10-2.mga6
lib64webkit2-devel-2.14.0-2.mga6
python3-qt5-webkit-5.6-8.mga6
qtwebkit-qmlplugin-2.3.4-8.mga6
lib64webkitgtk1.0_0-2.4.10-2.mga6
lib64qt5webkitwidgets5-5.6.1-2.mga6
webkit-2.4.10-2.mga6
lib64webkitgtk3.0-devel-2.4.10-2.mga6
python3-qt5-webkitwidgets-5.6-8.mga6
lib64qt5webkit5-5.6.1-2.mga6
lib64webkit2gtk4.0_37-2.14.0-2.mga6
lib64smokeqtwebkit3-4.14.3-4.mga6
webkit3.0-2.4.10-2.mga6
webkit1.0-2.4.10-2.mga6
lib64webkit2gtk-gir4.0-2.14.0-2.mga6
lib64webkitgtk3.0_0-2.4.10-2.mga6
lib64proxy-webkit-0.4.13-2.mga6
lib64kdewebkit5-4.14.24-1.mga6
webkit3-2.4.10-2.mga6
python3-qt4-webkit-4.11.4-10.mga6
Created attachment 337525 [details] Output of debug per the request
Thanks for the update. The backtrace shows evolution idle. Could you also capture a backtrace of the running WebKitWebProcess processes, please? There can be more than one: one for the preview panel, another for a composer window, if you have any open. Ideally grab both. The command is the same as in comment #2, only the `pidof evolution` argument is changed to the process ID number, as returned for example by: `ps ax | grep WebKitWebProcess`
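The request above could be scripted roughly as follows. This is only a sketch: the function name dump_backtraces, the bt-<pid>.txt file names and the GDB override variable are inventions for this example, and it spells out the abbreviated "t a a bt" as "thread apply all bt".

```shell
# Sketch: write one backtrace file per process ID given as an argument.
# GDB can be overridden (for a dry run, for example); it defaults to gdb.
dump_backtraces() {
    for pid in "$@"; do
        # Attach in batch mode, dump all threads, then detach.
        "${GDB:-gdb}" --batch -ex "thread apply all bt" -p "$pid" \
            > "bt-$pid.txt" 2>&1
    done
}
```

A possible invocation is `dump_backtraces $(pgrep -f WebKitWebProcess)`, which should produce one file per web process. The -f flag is used because the process name is longer than the 15 characters pgrep compares by default. Each file still needs the same check for private information as bt.txt.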
Created attachment 337569 [details] Dump for WebKitWebProcess
Created attachment 337570 [details] test-wk2.c

Thanks for a quick update. Now that gets confusing, because the WebKitWebProcess is also idle (as far as I can tell). Can we try another application which uses WebKit2, to see whether it is able to show the content, please? For example, Devhelp and Epiphany come to mind for a start. Or save this attachment as a test-wk2.c file and compile it as its first line shows. It'll open http://webkitgtk.org in a window. Try to click on some links in the window, to also test changes. It's not exactly what evolution does, Devhelp is closer to it, but it's a good test for plain webkit2.
(In reply to Milan Crha from comment #7)
> Created attachment 337570 [details]
> test-wk2.c

You could save yourself some time and use MiniBrowser (in WebKitGTK+ 2.14.1 it's under libexec, previously it was in bin) or Epiphany.
I am attaching two new screenshots, and I tried the test-wk2.c as you requested. Funny thing is it seems to work fine only sometimes, and then I land on a page where I can no longer click on any links.

With Evolution, you see that it is not only graphics which are missing, but the full message body. When I manually kill the running "WebKitWebProcess", the message pane works again for a short time, then returns to the strange empty box (see attachments).

I also noticed the following errors in the terminal window when test-wk2 was failing:

Error sending IPC message: Broken pipe
Error sending IPC message: Broken pipe
Error sending IPC message: Broken pipe
Error sending IPC message: Broken pipe
Error sending IPC message: Broken pipe
Error sending IPC message: Broken pipe
Error sending IPC message: Broken pipe
Error sending IPC message: Broken pipe
Error sending IPC message: Broken pipe
Created attachment 337574 [details] Same website opened with Firefox and Epiphany as comparison
Created attachment 337576 [details] Normal working window pane (after killing WebKitWebProcess manually)
Created attachment 337577 [details] Same message now broken after opening different messages back to back
(In reply to Robert Fox from comment #9)
> Error sending IPC message: Broken pipe
> Error sending IPC message: Broken pipe
> Error sending IPC message: Broken pipe
> Error sending IPC message: Broken pipe
> Error sending IPC message: Broken pipe
> Error sending IPC message: Broken pipe
> Error sending IPC message: Broken pipe
> Error sending IPC message: Broken pipe
> Error sending IPC message: Broken pipe

These are harmless, they occur whenever a web process is closed. It is a bug that the messages are being displayed, but not the bug you're looking for.
(In reply to Robert Fox from comment #10)
> Created attachment 337574 [details]
> Same website opened with Firefox and Epiphany as comparison

Looks to me like WebKit is totally broken. Do you see any errors in the web inspector? (Ctrl+Shift+I)
No errors found in the console. Like I said, killing the WebKitWebProcess makes the window pane work again, but only for a short time: when I flip through several e-mails and evolution is loading the pane, it eventually fails again.
It looks like the D-Bus connection between evolution and the WebKitWebProcess, which uses the org.gnome.Evolution.WebExtension D-Bus service name, got lost for some reason, but that might not explain why plain WebKit2 is affected as well. What your screenshots show is usually caused by a D-Bus connection issue, and/or by the WebProcess being stuck in something, but your backtrace didn't show anything. I don't have much idea how to debug this further, I would leave it to Michael; ideally I will also file this in webkit, because it's not evolution specific. Could there be anything odd with the graphics drivers that confuses WebKit after some time?

Maybe try to run evolution under valgrind. It'll be very slow, but maybe it'll show something useful. The command is:

$ G_SLICE=always-malloc valgrind --trace-children=yes evolution &>log.txt

Once it shows the main window, the work settles and the CPU load goes down (consider also adding --offline right after 'evolution' in the valgrind command to speed up some things on start), move between messages, and once you get into the state where the view is stuck, simply quit evolution. Note it can be stuck for several seconds but recover afterwards, due to the valgrind memory checker, which slows things down.
Am I correct in saying that WebKit just doesn't work at all for you on Mageia? Is it able to render any websites properly? Does Devhelp work? What version of Epiphany is that?
The ~/.xsession-error log also gets flooded with messages like:

** (WebKitWebProcess:12750): CRITICAL **: gchar* webkit_dom_element_get_attribute(WebKitDOMElement*, const gchar*): assertion 'WEBKIT_DOM_IS_ELEMENT(self)' failed

and when killed logs this as final log:

ERROR: Exiting process early due to unacknowledged closed-connection
/build/webkit2gtk-7MbNeh/webkit2gtk-2.14.0/Source/WebKit2/Shared/ChildProcess.cpp(56) : WebKit::didCloseOnConnectionWorkQueue(IPC::Connection*)::<lambda()>
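To put a number on "flooded", something like the following sketch could tally the CRITICAL lines per process. It assumes a POSIX awk and the message layout shown above, "(name:pid): CRITICAL **:"; the function name count_criticals is made up for this example.

```shell
# Sketch: count GLib CRITICAL messages per "process:pid" tag in a log file.
count_criticals() {
    awk '/CRITICAL \*\*/ {
             # Pull out the "(name:pid)" part and strip the parentheses.
             if (match($0, /\([^()]*:[0-9]+\)/)) {
                 tag = substr($0, RSTART + 1, RLENGTH - 2)
                 n[tag]++
             }
         }
         END { for (t in n) print n[t], t }' "$1" | sort -rn
}
```

Called with the path of the session log, it prints one line per process with the largest count first, which shows whether a single web process or several of them emit the assertions.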
Created attachment 338078 [details] Log output using valgrind - core dumped Using nouveau driver for NVidia card
(In reply to Michael Catanzaro from comment #17)
> Am I correct in saying that WebKit just doesn't work at all for you on
> Mageia? Is it able to render any websites properly? Does Devhelp work? What
> version of Epiphany is that?

Epiphany works - but only sporadically. Not sure how to test definitively whether webkit is functioning properly or not . . .
Those "Use of uninitialised value of size 8" from webkitgtk4 look suspicious, from my point of view, though I do not see such here.
(In reply to Robert Fox from comment #20)
> Epiphany works - but only sporadically. Not sure how to test definitively
> whether webkit is functioning properly or not . . .

It sounds like Mageia's WebKit package is broken. Unfortunately I don't have any guess what's wrong. This does not look like an Evolution problem at all to me, since you can reproduce a very similar problem in Epiphany. That needs to be investigated and fixed before we return to debugging Evolution, which adds much extra complexity.

(In reply to Milan Crha from comment #21)
> Those "Use of uninitialised value of size 8" from webkitgtk4 look
> suspicious, from my point of view, though I do not see such here.

Maybe, but it's impossible to say without any debuginfo installed. I also see far too many "Conditional jump or move depends on uninitialised value(s)". Maybe debuginfo will reveal something interesting, maybe not.
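For comparing valgrind logs between machines, a rough tally of the two messages quoted above may help. This is only a sketch assuming a POSIX shell and grep; the function name summarize_valgrind is invented here, and the counts are numbers of matching lines, not of distinct errors.

```shell
# Sketch: count lines carrying the two Memcheck messages in a valgrind log.
summarize_valgrind() {
    for msg in 'Use of uninitialised value' \
               'Conditional jump or move depends on uninitialised value'; do
        # grep -F matches the text literally; -c prints the number of
        # matching lines (0 when there are none).
        printf '%s: %s\n' "$msg" "$(grep -F -c "$msg" "$1")"
    done
}
```

`summarize_valgrind log.txt` gives two lines of output, one per message. As noted in the comment, the counts say little without debuginfo installed, but they make it easy to see whether a later run changed anything.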
I will post comment #22 back on Mageia's bugzilla: https://bugs.mageia.org/show_bug.cgi?id=19470 Back to the drawing board . . . Thx
There's no downstream patch in Mageia, this is an upstream issue. There is other software that works fine with webkit2 on Mageia, such as Mageia Control Center, the program displaying help. The CRITICAL assertions (comment #18) must be fixed in evolution and/or webkit2. Same for the bugs caught by valgrind (comment #21).
The problem is: if we can't reproduce the issue, we can't fix it. Since there is no flood of bug reports, it's safe to say the issue is probably either hardware-specific or distro-specific, one or the other; I guessed distro because Nouveau is not so uncommon and I'd expect more bug reports if something was wrong with that driver. It doesn't necessarily mean Mageia is doing anything wrong, just that some combination of packages and compile flags could be required to reproduce the issue.

Now, the criticals in comment #18. In general, I'd ask for a backtrace taken with G_DEBUG=fatal-criticals to debug criticals, but we don't need it in this case: the criticals in comment #18 are surely occurring because the page did not load properly; Evolution expects the DOM to be built in a certain way and it didn't happen. WebKit is huge, the bug could be anywhere, or in any dependency, or in graphics hardware; it's going to be impossible to debug unless a developer can reproduce the issue somehow.

Now, the valgrind log. I see a couple of nouveau driver issues at the top of the log; it's possible those could be related, but of course you're not going to get help for the nouveau driver on GNOME Bugzilla. :) "Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)" is a WebKit bug, but a minor one; we're sending uninitialized memory into a pipe, which is bad, but those bytes just get ignored by the other end, and it's not related to this issue. Everything else I see are issues in cairo or freetype; it's entirely possible they could be to blame, but it's not very likely, and the valgrind log is useless here as there's no debuginfo. Lastly, I see some uninitialized memory use in WebKit coming from Evolution's DOM API use, which is bad, but again there's no debuginfo so the log isn't useful.

Lastly, the X window system error that crashes the web process at the bottom of the log is clearly a bad problem, but that's a completely different bug report (it looks like probably bug #773302, so you don't need to file one).
(In reply to Thierry Vignaud from comment #24)
> There's no downstream patch in Mageia, this is an upstream issue.
> There is other software that works fine with webkit2 on Mageia, such as
> Mageia Control Center, the program displaying help.

This indeed suggests that it's an Evolution problem, but we can clearly see the same bug in Epiphany, which I've never seen before. Nobody has ever reported anything like this until now, and now that the same person has such a similar issue in two different WebKit apps, it's very, very difficult to believe it's a coincidence. Perhaps it's just triggered by some web content and not other. I don't know, it's a hard issue. :(
I just tried the NVidia proprietary drivers as well - same issue

dkms-nvidia-current-367.57-1.mga6.nonfree.x86_64

Interesting note - when I tried the Mageia MCC tool (configuration utility) - Using NVidia proprietary drivers, it fails to show the icons - and I get the following errors in a console:

[root@foxmain rfox]# mcc
Ignore the following Glib::Object::Introspection & Gtk3 warnings
Subroutine Gtk3::main redefined at /usr/lib/perl5/vendor_perl/5.22.2/Gtk3.pm line 525.

(drakconf:30654): Gtk-WARNING **: Theme parsing error: mcc.css:31:33: The style property GtkWidget:interior-focus is deprecated and shouldn't be used anymore. It will be removed in a future version

(drakconf:30654): Gtk-WARNING **: Theme parsing error: mcc.css:32:35: The style property GtkWidget:focus-line-width is deprecated and shouldn't be used anymore. It will be removed in a future version

GLib-GObject-WARNING **: gsignal.c:2423: signal 'populate-popup' is invalid for instance '0x314db90' of type 'WebKitWebView' at /usr/lib/libDrakX/mygtk3.pm line 629.
"/usr/sbin/drakmenustyle" is not executable [Menus] at /usr/libexec/drakconf line 819.
"/usr/sbin/drakbackup" is not executable [Backups] at /usr/libexec/drakconf line 819.
"/usr/sbin/tomoyo-gui" is not executable [Tomoyo Policy] at /usr/libexec/drakconf line 819.
"/usr/sbin/drakguard" is not executable [Parental Controls] at /usr/libexec/drakconf line 819.
*** Gtk3::MenuItem::activate: passed too many parameters (expected 1, got 2); ignoring excess at /usr/libexec/drakconf line 1031.

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast

(WebKitWebProcess:30669): Gdk-ERROR **: The program 'WebKitWebProcess' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue (integer parameter out of range for operation)'.
(Details: serial 175 error_code 2 request_code 53 (core protocol) minor_code 0)
(Note to programmers: normally, X errors are reported asynchronously; that is, you will receive the error a while after causing it. To debug your program, run it with the GDK_SYNCHRONIZE environment variable to change this behavior. You can then get a meaningful backtrace from your debugger if you break on the gdk_x_error() function.)

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast

(WebKitWebProcess:30703): Gdk-ERROR **: The program 'WebKitWebProcess' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue (integer parameter out of range for operation)'.
(Details: serial 175 error_code 2 request_code 53 (core protocol) minor_code 0)
(Note to programmers: normally, X errors are reported asynchronously; that is, you will receive the error a while after causing it. To debug your program, run it with the GDK_SYNCHRONIZE environment variable to change this behavior. You can then get a meaningful backtrace from your debugger if you break on the gdk_x_error() function.)

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast

(WebKitWebProcess:30722): Gdk-ERROR **: The program 'WebKitWebProcess' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue (integer parameter out of range for operation)'.
(Details: serial 175 error_code 2 request_code 53 (core protocol) minor_code 0)
(Note to programmers: normally, X errors are reported asynchronously; that is, you will receive the error a while after causing it. To debug your program, run it with the GDK_SYNCHRONIZE environment variable to change this behavior. You can then get a meaningful backtrace from your debugger if you break on the gdk_x_error() function.)

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast

(WebKitWebProcess:30741): Gdk-ERROR **: The program 'WebKitWebProcess' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue (integer parameter out of range for operation)'.
(Details: serial 175 error_code 2 request_code 53 (core protocol) minor_code 0)
(Note to programmers: normally, X errors are reported asynchronously; that is, you will receive the error a while after causing it. To debug your program, run it with the GDK_SYNCHRONIZE environment variable to change this behavior. You can then get a meaningful backtrace from your debugger if you break on the gdk_x_error() function.)

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast

(WebKitWebProcess:30762): Gdk-ERROR **: The program 'WebKitWebProcess' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue (integer parameter out of range for operation)'.
(Details: serial 175 error_code 2 request_code 53 (core protocol) minor_code 0)
(Note to programmers: normally, X errors are reported asynchronously; that is, you will receive the error a while after causing it. To debug your program, run it with the GDK_SYNCHRONIZE environment variable to change this behavior. You can then get a meaningful backtrace from your debugger if you break on the gdk_x_error() function.)
[root@foxmain rfox]#
Robert, could you provide a backtrace (with GDK_SYNCHRONIZE) for that in bug #773302 please? The reporter there seems to be having some trouble getting a backtrace. That's the same issue as from your valgrind log, but I think it's different than this bug because if that were happening, I don't think your web process would show anything at all. Also, please check which package version of webkit2 is in use: make sure it's 2.14.1. This was a known issue in 2.14.0 that should have been fixed in 2.14.1, but in bug #773302 it's clear that it's still sometimes happening in 2.14.1. A backtrace with 2.14.1 would be very helpful there.
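The version check asked for above can be sketched as a small comparison helper. It assumes a sort that supports -V (GNU coreutils does); the function name version_at_least is made up for this example, and the package name webkit2 is the Mageia one from the package list earlier in this report.

```shell
# Sketch: succeed when version $1 is at least version $2.
# Relies on "sort -V" (version sort), which is a GNU extension.
version_at_least() {
    # After a version sort the smaller version comes first, so $1 >= $2
    # exactly when the first line of the sorted pair is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

On an rpm-based system it might be used as `version_at_least "$(rpm -q --qf '%{VERSION}' webkit2)" 2.14.1 && echo "2.14.1 or newer"`. Other distributions name the package differently, so the rpm query would need adjusting.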
I will try the backtrace - but just checked the webkit2 version: webkit2-2.14.0-2.mga6 Looks like it's the older version . . . .
(In reply to Robert Fox from comment #29)
> I will try the backtrace - but just checked the webkit2 version:
> webkit2-2.14.0-2.mga6
>
> Looks like it's the older version . . . .

OK, then please sit on this until the 2.14.1 update works its way through Mageia. Once you get that update, if you can still reproduce the X11 error, then please post the backtrace (in bug #773302).
Robert, you can test webkit2-2.14.1 by enabling the "Core Update Testing" medium/repository
Hi guys, another sufferer here. Unfortunately, the bug persists with webkit2-2.14.1 :( My symptoms are: after clicking on 5-7 emails (the number is quite stable), the message body area turns into an empty frame, until WebKitWebProcess is killed. As for the D-Bus hypothesis, I've traced D-Bus calls with dbus-monitor, and I can't tell the trace is much different for the "normal" and "broken" states. The path=/org/gnome/Evolution/WebExtension endpoint remains active and responds to method calls as usual. Devhelp works perfectly, as does MCC. Epiphany is a bit unstable. It displays simple websites correctly, but JavaScript-laden sites like Twitter or GitHub drive it crazy. These sites are displayed as if they didn't contain any CSS or JS at all.
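A trace like the one described above could be captured roughly as follows. This is a sketch, not the exact commands used in the comment: the function names are invented, and it assumes the web extension is reachable on the session bus at the object path quoted above.

```shell
# Sketch: build a dbus-monitor match rule for the Evolution web extension.
webext_match_rule() {
    printf "path='%s'\n" "/org/gnome/Evolution/WebExtension"
}

# Record messages for that object path into the file named by $1.
# Runs until interrupted with Ctrl+C.
trace_webext() {
    dbus-monitor --session "$(webext_match_rule)" > "$1"
}
```

One trace taken while the preview still works and one taken after it breaks can then be compared by eye or with diff. Serial numbers and timestamps differ between any two runs, so only structural differences, such as calls that never get a reply, are meaningful.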
(In reply to Dimitri from comment #32)
> Devhelp works perfectly, as does MCC. Epiphany is a bit unstable. It
> displays simple websites correctly, but JavaScript-laden sites like Twitter
> or GitHub drive it crazy. These sites are displayed as if they didn't
> contain any CSS or JS at all.

OK, so you're surely having the same problem as Robert. Which distro and which graphics driver? I don't think it could possibly be related to JavaScript because Evolution disables JavaScript (which you presumably do not want running in your emails :)

I guess it's time to report a WebKit bug. It's a shame we don't have a reproducer, but I guess hardware details would do. Could you both please run the command 'glxinfo' and post an attachment here? I will then create a WebKit bug and attach your output to it.
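The full glxinfo output is what is being asked for, but for a quick side-by-side comparison of machines the driver-identifying lines can be pulled out first. A small sketch, assuming the usual glxinfo line labels; the function name gl_summary is made up here.

```shell
# Sketch: keep only the lines of glxinfo output (read from stdin) that
# identify the driver and whether direct rendering is in use.
gl_summary() {
    grep -E '^(direct rendering|OpenGL (vendor|renderer|version) string):'
}
```

Used as `glxinfo | gl_summary`, it should print about four lines per machine, which is enough to see at a glance whether the affected systems share a vendor or renderer.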
That would be Mageia Cauldron too, as he's using MCC and his email uses a Mageia domain...
I have tested with the latest webkit2-2.14.1 - which took longer to fail, but still fails :-( I have also attached the glxinfo output. On a side note, when I tested the NVidia driver again, MCC failed to show the icons (maybe related to closed bug https://bugs.mageia.org/show_bug.cgi?id=17500 ). I am experiencing several issues with NVidia (like compositing), so I've been sticking with Nouveau for now . . .
Created attachment 338287 [details] output of glxinfo
For me, any reply to a formatted email, written with HTML formatting, sends the WebKitWebProcess into a 100% CPU loop. I've captured the gdb bt of this process.
Created attachment 338294 [details] Backtrace evolution WebKitWebProcess Backtrace WebKitWebProcess while in the 100% cpu loop after an html formatted reply in evolution.
(In reply to Michael Catanzaro from comment #33)
> (In reply to Dimitri from comment #32)
> OK, so you're surely having the same problem as Robert. Which distro and
> which graphics driver?

I'm on Mageia Cauldron (i586) with NVIDIA 340.98 (proprietary), see attachment for the glxinfo output. I've got another box with Cauldron, it's an x86_64 with Intel graphics; if I remember right, it has the same issue. Will attach its glxinfo output a bit later.
Created attachment 338310 [details] glxinfo output
Comment on attachment 338310 [details] glxinfo output Mageia Cauldron (i586) NVIDIA 340.98
(In reply to Bart from comment #37)
> For me any reply on a formatted email with html formatting response will set
> the WebKitWebProcess into a 100% CPU cycle loop. I've captured the gdb bt of
> this process.

This could be bug 772803
OK, I've moved this to WebKit Bugzilla: https://bugs.webkit.org/show_bug.cgi?id=163897 If you want to create WebKit Bugzilla accounts and CC yourselves there, it might increase the chances of the bug being fixed. Unfortunately I don't think any WebKit developers are affected by this issue so it might be unlikely to be fixed. I was hoping that your glxinfo output would be similar and reveal that WebKit is broken on some specific hardware, but it looks completely different to me unfortunately. Another thing that could possibly help is to take the valgrind log again, this time with appropriate debuginfo packages installed to avoid useless ????? backtraces. Lastly, I closed bug #772803 because the reporter discovered his problem, but it seems like multiple users in this bug report might still be affected. I encourage you to file a separate bug (with a stacktrace taken with GDK_SYNCHRONIZE=1) for the crash there if you're still affected. Preferably on WebKit Bugzilla, but it's OK to file it here first as well.
(In reply to Michael Catanzaro from comment #43)
> Lastly, I closed bug #772803 ...

Just a note that you meant bug #773302 instead. I have one reproducer, I'm going into that bug report.