GNOME Bugzilla – Bug 742563
Image stays blurred until key pressed
Last modified: 2021-05-19 14:38:58 UTC
When advancing rapidly through images in fullscreen mode, images are first displayed blurred and only sharpened later. So far, so good. After a while, images will stay blurry until a key is pressed. Often, this advances to the next image and I have to go back to see the sharp picture. I'm using 0.20.1 from Debian 8.0 (jessie). This worked in version 0.12.3 from Debian 7.0 (wheezy).
I've just upgraded to shotwell 0.20.2 and the problem still persists.
I can confirm that with 0.22.0, the bug still exists. BTW: It seems to be more pronounced at high resolutions. Normally, I'm using 2560x1600 and it hangs a lot. At 1152x864 I can browse thru images almost smoothly.
Here is a way to reproduce this (again, it works only at some resolutions, but then 100% of the time):
1) Start shotwell.
2) Double-click on a picture to open it.
3) Do not move the mouse for ca. 3 s.
4) Move the mouse, but only within the picture (the image gets sharp as soon as the mouse is moved out onto any of shotwell's grey area or a key is pressed).
I used the opportunity to attach gdb to it. I don't know if this backtrace helps:
+ Trace 235069
After a completely fresh install, Debian/stretch with Gnome 3.22 and shotwell 0.25.1 still has this problem. (Since this is related to screen resolutions, I'm wondering if Shotwell can work on a 4k monitor. Has anyone tried?)
I have not yet looked into that, sorry. It's on the list for the next stable release
After unsuccessfully trying to debug this through the Vala sources and the multiple threads, let me give up and summarize. :-/
I cannot reproduce any more what I described above (blurry until keypress). However, shotwell sometimes takes a very long time to display a crisp image. Here are steps to reproduce:
1) Import a couple of large photos (>16MPixel).
2) Make adjustments of exposure, temperature, shadows, whatnot on these photos.
3) In a large shotwell window, double click the first one, then thumb through the images back and forth quickly by pressing the arrow keys. Stop somewhere. Repeat.
One clearly observes the lag for sharpening. Say, it typically takes 0.5s per image. But sometimes, shotwell just seems to stop and after about 30s the image eventually gets sharp. GDB tells me that it is somewhere in GLib during these 30s.
(In reply to Richard B. Kreckel from comment #6)
> After unsuccessfully trying to debug this through the Vala sources and the
> multiple threads, let me give up and summarize. :-/
Welcome to my hell. Parts of the code are really not discoverable at all, sorry :(
> I cannot reproduce any more what I described above (blurry until keypress).
> However, shotwell sometimes takes a very long time to display a crisp image.
> Here are steps to reproduce:
> 1) Import a couple of large photos (>16MPixel).
> 2) Make adjustments of exposure, temperature, shadows, whatnot on these
> photos.
> 3) In a large shotwell window, double click the first one, then thumb
> through the images back and forth quickly by pressing the arrow keys. Stop
> somewhere. Repeat.
>
> One clearly observes the lag for sharpening. Say, it typically takes 0.5s
> per image. But sometimes, shotwell just seems to stop and after about 30s
> the image eventually gets sharp.
So what happens when you switch between photos (I think) is this: it takes the large thumbnail and shows that fullscreen (hence the blurriness), then takes the backing picture (which is either the developed raw megapixel JPEG or the original JPEG), applies transformations just in time (which is somewhat slow) and then shows the image.
There's a series of patches to speed up image transformations in the queue for 0.26 which might improve the situation. I'm not sure if they still apply cleanly, but you can check if you like:
https://bugzilla.gnome.org/show_bug.cgi?id=716627
https://bugzilla.gnome.org/show_bug.cgi?id=716644
(In reply to Jens Georg from comment #7)
> So what happens when you switch between photos (I think) is it takes the
> large thumbnail, shows that fullscreen (hence the blurryness), then takes
> the backing picture (which is either the developed raw megapixel JPEG or the
> original JPEG), applies transformations just in time (which are somewhat
> slow) and then shows the image.
>
> There's a series of patches to speed up image transformations in queue for
> 0.26 which might improve the situation. I'm not sure if they still cleanly
> apply but you can check if you like:
>
> https://bugzilla.gnome.org/show_bug.cgi?id=716627
> https://bugzilla.gnome.org/show_bug.cgi?id=716644
I now see that the time gets spent in the function PixelTransformer.transform_to_other_pixbuf(). There's a loop over each pixel, and some profiling and reading the asm reveals that the loop body makes way too many function calls. I don't know any Vala, but methinks there should be a way to refactor this code so the compiler can inline more? (The first patch you mention seems to address this, but alas, it also seems to address so many other things.)
Probably not, depends on the C that results from the Vala code. I would like to parallelize those processors, maybe for 0.28
Created attachment 347881 [details] [review]
use AX_CC_MAXOPT
Can you try this patch? You might need to install autoconf-archive for all the macro dependencies.
At least for me (x201, core i5) that seems to speed up things significantly:
Before:
L 4808 2017-03-13 22:32:13 [DBG] Photo.vala:3524: PIPELINE [1] /tmp/shotwell/library/2016/04/23/DSCF0733.RAF (scaling: viewport 953x708 (not scaled up)): redeye=0,000001 crop=0,000001 adjustment=1,772586 orientation=0,000002 straighten=0,000000 scale=0,084227 total=1,882829
L 4808 2017-03-13 22:32:17 [DBG] Photo.vala:3524: PIPELINE [1] /tmp/shotwell/library/2016/04/23/DSCF0733.RAF (scaling: UNSCALED): redeye=0,000001 crop=0,000000 adjustment=6,780621 orientation=0,000002 straighten=0,000000 scale=0,000000 total=6,872586
After:
L 27545 2017-03-13 22:39:14 [DBG] Photo.vala:3524: PIPELINE [1] /tmp/shotwell/library/2016/04/23/DSCF0733.RAF (scaling: viewport 953x708 (not scaled up)): redeye=0,000001 crop=0,000000 adjustment=0,650499 orientation=0,000002 straighten=0,000000 scale=0,129226 total=0,799805
L 27545 2017-03-13 22:39:16 [DBG] Photo.vala:3524: PIPELINE [1] /tmp/shotwell/library/2016/04/23/DSCF0733.RAF (scaling: UNSCALED): redeye=0,000000 crop=0,000000 adjustment=2,489454 orientation=0,000002 straighten=0,000000 scale=0,000000 total=2,578619
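To compare such PIPELINE lines without reading the numbers by eye, the per-stage timings can be extracted with a few lines of Python. This is only an illustrative sketch: the function name is made up, the file path in the sample lines is shortened, and the comma is assumed to be the decimal separator, as it appears to be in the log above.

```python
import re

# Matches fields such as "adjustment=6,780621". The comma is assumed to be
# a locale-specific decimal separator, as in the log lines quoted above.
FIELD = re.compile(r"(\w+)=(\d+,\d+)")

def parse_pipeline_timings(line):
    """Return {stage_name: seconds} for one PIPELINE debug line."""
    return {name: float(value.replace(",", "."))
            for name, value in FIELD.findall(line)}

# The two UNSCALED lines from the log above (path shortened for readability).
before = parse_pipeline_timings(
    "PIPELINE [1] DSCF0733.RAF (scaling: UNSCALED): redeye=0,000001 "
    "crop=0,000000 adjustment=6,780621 orientation=0,000002 "
    "straighten=0,000000 scale=0,000000 total=6,872586")
after = parse_pipeline_timings(
    "PIPELINE [1] DSCF0733.RAF (scaling: UNSCALED): redeye=0,000000 "
    "crop=0,000000 adjustment=2,489454 orientation=0,000002 "
    "straighten=0,000000 scale=0,000000 total=2,578619")

speedup = before["adjustment"] / after["adjustment"]
print(f"adjustment stage speedup: {speedup:.2f}x")
```

In both the viewport and the UNSCALED runs, almost all of the total time is spent in the adjustment stage, so that is the ratio worth tracking.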
Sorry, please move AX_CC_MAXOPT before AX_ENABLE_DEBUG... Apparently the latter breaks AX_CC_MAXOPT
(In reply to Jens Georg from comment #10) > Can you try this patch? On my Athlon II X2 270, this patch speeds up shotwell applying two transformations (one RGB followed by one HSV) on a single 4912x3264 picture from 11.5s to 3.5s. Impressive.
Strange: Without this patch there is no compiler optimization whatsoever (it calls gcc without optimization which is equivalent to -O0). Wasn't autoconf supposed to turn on decent optimization à la -O2 by default? (For the record: on my Athlon II X2 270 it optimized aggressively with -O3 -fomit-frame-pointer -malign-double -fstrict-aliasing -ffast-math -march=barcelona.)
(In reply to Richard B. Kreckel from comment #14)
> (For the record: on my Athlon II X2 270 it optimized aggressively with -O3
> -fomit-frame-pointer -malign-double -fstrict-aliasing -ffast-math
> -march=barcelona.)
...with AX_CC_MAXOPT
(In reply to Richard B. Kreckel from comment #14)
> Strange: Without this patch there is no compiler optimization whatsoever (it
> calls gcc without optimization which is equivalent to -O0). Wasn't autoconf
> supposed to turn on decent optimization à la -O2 by default?
Yeah, that's what I assumed as well. Maybe AX_ENABLE_DEBUG breaks that, as it also overwrites what AX_CC_MAXOPT does.
hm. AX_CHECK_ENABLE_DEBUG runs before AC_PROG_CC, so in theory it shouldn't mess with those variables, but if I remove it I get -O2 as expected. Can you quickly check the numbers when removing AX_CHECK_ENABLE_DEBUG, please?
I dug a bit deeper. AC_PROG_CC only sets CFLAGS if CFLAGS is unset. But AX_CHECK_ENABLE_DEBUG explicitly sets CFLAGS to "" when disabled. This is done somewhat on purpose, but creates this odd behavior here.
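The difference between "unset" and "set to an empty string" is easy to overlook, so here is a small Python model of the behaviour described in this comment. It is a simplified sketch, not the actual macro code: the function name is hypothetical, and it assumes a GCC-like compiler that accepts -g, for which AC_PROG_CC defaults to "-g -O2".

```python
def resolve_cflags(env):
    """Simplified model of the CFLAGS default described above.

    env is a dict standing in for the environment seen by configure.
    Assumption: the compiler is GCC and accepts -g.
    """
    if "CFLAGS" in env:
        # CFLAGS exists, even if it is empty: the default is NOT applied.
        return env["CFLAGS"]
    # CFLAGS is genuinely unset: fall back to the usual default.
    return "-g -O2"

# Nothing touched CFLAGS: the default optimization level is used.
print(repr(resolve_cflags({})))

# An earlier macro set CFLAGS to "": no optimization flags at all, so the
# compiler runs at its own default level (equivalent to -O0 for gcc).
print(repr(resolve_cflags({"CFLAGS": ""})))

# A user-supplied value is always respected.
print(repr(resolve_cflags({"CFLAGS": "-O3"})))
```

This matches the observation in the thread that removing the debug macro brings back -O2.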
(In reply to Jens Georg from comment #17)
> Can you quickly check the numbers when removing AX_CHECK_ENABLE_DEBUG, please?
It turns out to be the same 3.5s as with the aggressive settings above.
Yes, I just tested, -O1 already gives maximum speedup
Created attachment 347963 [details] [review]
Do not use abstract functions to lookup constants
Signed-off-by: Jens Georg <mail@jensge.org>
That shaves off a couple of cycles
Wow, GType really hurts there. I'm trying to find a way to remove the casting checks
Please try to remove the --enable-checking from COMMON_VALAFLAGS in common.am
and add --disable-assert
(In reply to Jens Georg from comment #24)
> Please try to remove the --enable-checking from COMMON_VALAFLAGS in common.am
(In reply to Jens Georg from comment #25)
> and add --disable-assert
This change brings the timing down to a spectacular 2.3s.
Still not brilliant but way better.
I put all this (and two more modifications) together in https://git.gnome.org/browse/shotwell/log/?h=wip/optimize
I also now know why it is resolution dependent. Shotwell scales down the image to viewport size first and then applies the pipeline, which makes sense since the number of pixels to process is usually lower.
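To get a feeling for how much work the viewport size implies, the following Python sketch fits an image into a viewport without scaling it up (the log above mentions "not scaled up") and counts the resulting pixels. It is only an illustration: the function names are made up, the rounding rule is an assumption, and the screen resolutions reported earlier in this thread are used as stand-ins for the viewport, which in practice is somewhat smaller.

```python
def scaled_size(image, viewport):
    """Fit image (w, h) into viewport (w, h), keeping aspect ratio.

    Never scales up. The rounding is an assumption made for this sketch.
    """
    width, height = image
    view_w, view_h = viewport
    factor = min(view_w / width, view_h / height, 1.0)
    return round(width * factor), round(height * factor)

def pixels_to_process(image, viewport):
    """Number of pixels the pipeline has to touch after scaling down."""
    width, height = scaled_size(image, viewport)
    return width * height

photo = (4912, 3264)  # picture size used for the timings in this thread
for viewport in [(2560, 1600), (1152, 864)]:
    print(viewport, scaled_size(photo, viewport),
          pixels_to_process(photo, viewport))
```

Under these assumptions the larger resolution leaves roughly four times as many pixels for the pipeline as the smaller one, which fits the report that browsing is much smoother at 1152x864.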
Created attachment 348120 [details] [review] patch needed for build
On wip/optimize (after applying above patch), I'm now down at 2.2s. Question: What's the reason for using GLib.get_num_processors() + 1 jobs? On my dual-Athlon the third job does not speed up anything at all. It seems to make things a little bit slower (but that is on the verge of what's measurable).
Ah sorry, I'm just checking things with the new tool I added so I can quickly recompile (make src/shotwell-graphics-processor). It takes an input file, an output file and an adjustment config (which can be taken from the database; it's just an ini file).
The +1 is just testing things; on the quad i5 here it helped.
But I just tested the patch set I mentioned above, and with that I can actually get down to ~700ms for a 16M pic from 1.7s, and even 90ms if you don't have a saturation operation (which is still quite expensive). So those are the way forward, I think.
(In reply to Jens Georg from comment #28)
> I put all this (and two more modifications) together in
> https://git.gnome.org/browse/shotwell/log/?h=wip/optimize
Has commit 949c8216 "Use double" improved your timings? I'm asking because, in my experience, double and float operations are equally fast on amd64 (the ring operations +, - and * are all essentially one cycle), but double sometimes gives slightly better timings than float due to alignment. In contrast, on x86, float operations are much faster than double operations. What do you think? Should this be timed on a 32 bit machine? Or would you say these are getting obsolete?
Yes, weirdly it has, ~300ms/picture. But as I said, the lookup stuff beats that branch by a factor of 10.
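The thread does not show what the lookup patches actually do. Assuming they precompute the result of a per-channel operation for every possible 8-bit value, the general technique can be sketched in Python as follows. All names and the example gain are hypothetical and not taken from the Shotwell sources.

```python
def build_table(channel_fn):
    """Evaluate channel_fn once for each of the 256 possible channel values.

    Results are rounded and clamped to the 0..255 range.
    """
    return bytes(max(0, min(255, round(channel_fn(value))))
                 for value in range(256))

def apply_table(pixels, table):
    """Map every byte of a packed 8-bit pixel buffer through the table."""
    return pixels.translate(table)

# Hypothetical exposure-like gain, computed 256 times instead of once
# per channel of every pixel.
table = build_table(lambda value: value * 1.5)

rgb = bytes([0, 100, 200, 10, 20, 30])  # two RGB pixels
print(list(apply_table(rgb, table)))
```

The expensive function runs only 256 times regardless of image size, and the per-pixel work shrinks to one table access per channel. An operation that needs all three channels of a pixel at once cannot be expressed as a single per-channel table like this.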
Unfortunately some of the color operations were ported from operating on HSV to RGB, which made them look different. But I noticed that Vala introduces an awful lot of unnecessary struct copies in the processing code :(
(In reply to Jens Georg from comment #35)
> Unfortunately some of color operations were ported from operating on HSV to
> RGB which made them look differently.
When was that? You aren't referring to this wip/optimize branch, I must assume.
I mean in the lookup patch I mentioned before. You should try building master with clang, btw. Something in the Vala code prevents GCC from properly optimizing away the unnecessary struct assignments.
(In reply to Jens Georg from comment #37)
> You should try building master with clang, btw. Something in the Vala code
> causes GCC to not being able to properly optimize away the unnecessary
> struct assignments.
Interesting. Is there a GCC bug report for this?
Not yet, I wanted to dig a bit more into this, but x86 ASM makes my brain hurt.
I pushed a couple of things to master now
(In reply to Jens Georg from comment #40)
> I pushed a couple of things to master now
With all of them applied, the picture transformation I used to time above in this thread now takes 1.2s. Good.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/shotwell/-/issues/4584.