GNOME Bugzilla – Bug 346923
unable to find room for a tile
Last modified: 2008-01-15 14:07:21 UTC
System: x86_64-sun-solaris2.10
Compiler: Sun Studio 11
gimp CFLAGS: -Xa -xO2 -xstrconst -mt -KPIC -xtarget=native -xarch=amd64

Some of the relevant prerequisites I installed before the gimp:

cairo-1.0.4
freetype-2.1.10
glib-2.10.2
gtk+-2.8.19
libXft-2.1.10
pango-1.12.2

Please let me know if you need information on additional packages.

Problem: built gimp from 2.3.10 source (as a 64 bit executable), ran the gimp, and tried Xtns->Logos->Alien Glow (with all default parameters). While "Blending" is displayed in the progress bar and "gimp-edit-blend" is displayed below the progress bar, I get literally thousands of the following message to the xterm that I started the gimp from:

(gimp:3936): Gimp-Base-WARNING **: cache: unable to find room for a tile

In addition, if the window containing the resulting logo is obscured (covered by an xterm or hidden when I switch to different workspace in my window manager) and then exposed, I get dozens more of the same message. If I quit gimp and discard the new image, when the gimp exits I see:

(gimp:3936): Gimp-Base-WARNING **: tile ref count balance: 27

I've repeated this test several times now, and about a third of the time the image looks as I would expect (ALIEN surrounded by a green glow), but the majority of the time sections of the image are out of place. I have a copy of the XCF from one of these, if you need it.
Your hard disk is full. We should probably detect this and report it to the user in a dialog.
Right.
I have 11 Gig free in /export/home, and I'm the only person on the system (it's my desktop workstation):

$/local/gnu/bin/df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/dsk/c2t0d0s0               2.0G  236M  1.7G  13% /
swap                            2.0G  568K  2.0G   1% /etc/svc/volatile
/dev/dsk/c2t0d0s3                16G  2.6G   14G  17% /usr
/usr/lib/libc/libc_hwcap2.so.1   16G  2.6G   14G  17% /lib/libc.so.1
/dev/dsk/c2t0d0s4               7.9G  1.3G  6.6G  17% /var
swap                            2.3G  237M  2.0G  11% /tmp
swap                            2.0G   28K  2.0G   1% /var/run
/dev/dsk/c2t0d0s6                20G   11G  9.1G  54% /local
/dev/dsk/c2t0d0s5               4.0G  1.3G  2.7G  31% /opt
/dev/dsk/c2t0d0s7                17G  5.2G   11G  32% /export/home
The actual error message that occurs when pushing tiles to the swap file fails is "unable to write tile data to disk". This would be followed by the warning "unable to find room for a tile". The error handling and error messages of the tile code can definitely be improved. Tim, did you get such a warning? Can you please check your config and verify that your swap folder exists and is writable?
I've run this set of steps probably a dozen times now, and before today I hadn't received that warning, ever. Today I received a warning dialog (once, so far) with the message that gimp was unable to *read* tile data from disk (sorry, I didn't save the exact message, but it was definitely a warning about reading, not writing). I've repeated the test procedure several times since then, and I haven't received a warning dialog a second time.

Note that before running gimp 2.3.10 (and subsequently as I've done additional testing), I've done rm -rf ~/.gimp* to get rid of all gimp configuration, and just let it be regenerated as part of the user install procedure. I do receive errors (in the xterm I started gimp from) during the initial plugin query for the uri plug-in, but that's a separate problem unrelated to this one (I'll file a separate report).

I can see that after removing ~/.gimp* and running gimp, my ~/.gimp-2.3 directory is created with appropriate permissions. While the Alien Glow script-fu is running, the swap file is created successfully:

mooney$ls -al ~/.gimp-2.3/
total 5122
drwxr-xr-x  22 mooney faculty     512 Jul 10 17:56 ./
drwxr-x--- 133 mooney faculty    8704 Jul 10 17:56 ../
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 brushes/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 curves/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 environ/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 fonts/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 fractalexplorer/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 gfig/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 gflare/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 gimpressionist/
-rw-------   1 mooney faculty 2322432 Jul 10 17:56 gimpswap.11175
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 gradients/
-rw-r--r--   1 mooney faculty    1051 Jul 10 17:56 gtkrc
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 interpreters/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 levels/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 modules/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 palettes/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 patterns/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 plug-ins/
-rw-r--r--   1 mooney faculty  256910 Jul 10 17:56 pluginrc
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 scripts/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 templates/
-rw-r--r--   1 mooney faculty     330 Jul 10 17:56 themerc
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 themes/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 tmp/
drwxr-xr-x   2 mooney faculty     512 Jul 10 17:56 tool-options/
Since I had never built any version of gimp on this Solaris 10 workstation, I thought I should try building an older version, to make sure the problem wasn't happening there too. I built 2.2.12 using the exact same toolchain, CFLAGS, etc., and it does not have this problem: all the tests I tried worked as expected under 2.2.12. Whatever the problem with writing tiles turns out to be, it's new to the 2.3.x series.
But it is absolutely reproducible with 2.3.10 for you? Could you run gimp in a debugger and pass the --g-fatal-warnings command-line option to gimp? You should then be able to get a stack trace from the warning. Perhaps that gives us an idea of what's happening.
I'll need to recompile with debugging enabled to get a decent stack trace. Here's the stack I get without symbols:

Gimp-Base-WARNING **: cache: unable to find room for a tile
aborting...
Gimp-Base-WARNING **: cache: unable to find room for a tile
aborting...
t@2 (l@2) signal ABRT (Abort) in __lwp_kill at 0xfffffd7ffe53de5a
0xfffffd7ffe53de5a: __lwp_kill+0x000a: jae __lwp_kill+0x18 [ 0xfffffd7ffe53de68, .+0xe ]
(dbx) where
current thread: t@2
=>[1] __lwp_kill(0x2, 0x6, 0xffffffff829533c0, 0x5, 0xfffffd7ffe645750, 0x10001), at 0xfffffd7ffe53de5a
  [2] _thr_kill(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe538e83
  [3] raise(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe4e73e9
  [4] abort(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe4ca39e
  [5] g_logv(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe5dc2b7
  [6] g_log(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe5dc358
  [7] tile_cache_insert(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x6e9026
  [8] tile_release(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x6e8bce
  [9] gradient_calc_shapeburst_angular_factor(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x64c990
  [10] gradient_render_pixel(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x64cfe0
  [11] gradient_fill_single_region_rgb_dither(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x64dd02
  [12] do_parallel_regions(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x6e3a75
  [13] g_thread_pool_thread_proxy(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe5f2018
  [14] g_thread_create_proxy(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe5f0a78
  [15] _thr_setup(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe53aa4b
  [16] _lwp_start(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe53ac80
(dbx)
Is that a multiprocessor system? Does changing the value for num-processors to 1 in the Preferences dialog or gimprc make any difference?
It is a multiprocessor system (dual AMD Opterons). Setting the processors to 1 (from 2) in the preferences dialog does appear to fix it. I've run the Alien Glow several times now, and it hasn't caused any problems when using just 1 processor.

Note that after recompiling with debugging enabled, when I run gimp with 2 processors, I don't get the tile warnings anymore, I get a segfault:

t@3 (l@3) signal SEGV (no mapping at the fault address) in tile_cache_flush_internal at line 235 in file "tile-cache.c"
  235   tile->next->prev = tile->prev;
(dbx) where
current thread: t@3
=>[1] tile_cache_flush_internal(tile = 0x212c160), line 235 in "tile-cache.c"
  [2] tile_cache_flush(tile = 0x212c160), line 212 in "tile-cache.c"
  [3] tile_lock(tile = 0x212c160), line 140 in "tile.c"
  [4] tile_manager_get(tm = 0x1c78fa0, tile_num = 0, wantread = 1, wantwrite = 0), line 276 in "tile-manager.c"
  [5] tile_manager_get_tile(tm = 0x1c78fa0, xpixel = 45, ypixel = 4, wantread = 1, wantwrite = 0), line 148 in "tile-manager.c"
  [6] gradient_calc_shapeburst_angular_factor(x = 45.0, y = 4.0), line 515 in "gimpdrawable-blend.c"
  [7] gradient_render_pixel(x = 45.0, y = 4.0, color = 0xfffffd7ffdbfdd90, render_data = 0xfffffd7fffdfc4d0), line 708 in "gimpdrawable-blend.c"
  [8] gradient_fill_single_region_rgb_dither(rbd = 0xfffffd7fffdfc4d0, PR = 0xfffffd7ffdbfde58), line 1054 in "gimpdrawable-blend.c"
  [9] do_parallel_regions(processor = 0xfffffd7fffdfc278), line 119 in "pixel-processor.c"
  [10] g_thread_pool_thread_proxy(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe382018
  [11] g_thread_create_proxy(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe380a78
  [12] _thr_setup(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe2caa4b
  [13] _lwp_start(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffe2cac80
(dbx)
(dbx) print tile->prev
tile->prev = 0x212c160
(dbx) print tile->next
tile->next = (nil)
We should review the tile locking code then. But somehow I find it surprising that this only shows up now and that it shows up on a not so popular platform. This code seems to work fine on Linux, or we would have seen such a bug report earlier. After all, the parallelized gradient rendering has been in GIMP since before version 2.3.0 was released. Perhaps there's an issue with GMutex on your platform?
There's a check for tile->next being != NULL in the line above and tile_flush() is called with a mutex lock being held for the tile in question. So we will probably have to look for a place where the tile is being modified without the lock being held.
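To illustrate why the NULL check alone does not help when another thread touches the list without a lock, here is a minimal sketch of a doubly linked list whose insert and remove paths both take the same mutex. It assumes POSIX threads, and the names Node, Cache, cache_insert and cache_remove are hypothetical stand-ins, not GIMP's actual Tile struct or the functions in tile-cache.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached tile; not GIMP's Tile struct. */
typedef struct Node {
    struct Node *prev;
    struct Node *next;
    int          in_list;   /* 1 while the node is linked into the cache */
} Node;

/* Hypothetical cache: a doubly linked list guarded by one mutex. */
typedef struct {
    Node           *first;
    Node           *last;
    size_t          count;
    pthread_mutex_t lock;
} Cache;

void cache_init(Cache *c)
{
    c->first = NULL;
    c->last  = NULL;
    c->count = 0;
    pthread_mutex_init(&c->lock, NULL);
}

/* Append a node at the tail. Every list access happens under c->lock. */
void cache_insert(Cache *c, Node *n)
{
    pthread_mutex_lock(&c->lock);
    if (!n->in_list) {
        n->prev = c->last;
        n->next = NULL;
        if (c->last)
            c->last->next = n;
        else
            c->first = n;
        c->last    = n;
        n->in_list = 1;
        c->count++;
    }
    pthread_mutex_unlock(&c->lock);
}

/* Unlink a node. The "if (n->next)" test and the dereference that follows
 * are only safe together because no other thread can change n->next while
 * the lock is held. Without the lock, n->next could become NULL between
 * the test and the dereference. */
void cache_remove(Cache *c, Node *n)
{
    pthread_mutex_lock(&c->lock);
    if (n->in_list) {
        assert(c->count > 0);
        if (n->next)
            n->next->prev = n->prev;
        else
            c->last = n->prev;
        if (n->prev)
            n->prev->next = n->next;
        else
            c->first = n->next;
        n->prev    = NULL;
        n->next    = NULL;
        n->in_list = 0;
        c->count--;
    }
    pthread_mutex_unlock(&c->lock);
}
```

The point of the sketch is that every code path that reads or writes the prev/next links has to go through the same lock; a single unlocked path is enough to produce the state seen in the debugger.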
FWIW, I've had the same problem on a dual-Xeon machine: 22GB available and thousands of "unable to find room for a tile" messages. OS = Red Hat Enterprise Linux AS release 4 (Nahant Update 3), gimp 2.3.9. I only saw this once a few nights ago while doing a series of blends; switching to single-processor fixes it.
I just figured that the whole tile locking code is not in effect at all. I will enable it and add some extensive asserts to the code.
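As a rough idea of what such an assert could check, the sketch below walks a doubly linked list and reports whether the forward and backward links agree. It would flag the state printed by dbx above, where tile->prev points at the tile itself and tile->next is nil. The Node type and list_is_consistent() are hypothetical names for illustration only and are not taken from the attached patch.

```c
#include <stddef.h>

/* Hypothetical list node; not GIMP's Tile struct. */
typedef struct Node {
    struct Node *prev;
    struct Node *next;
} Node;

/* Returns 1 if walking forward from first reaches last, every back link
 * matches the node visited just before it, and the walk stays within
 * max_nodes steps (which also guards against cycles). Returns 0 otherwise.
 * The caller is assumed to hold whatever lock protects the list. */
int list_is_consistent(const Node *first, const Node *last, size_t max_nodes)
{
    const Node *prev = NULL;
    size_t      seen = 0;

    for (const Node *n = first; n != NULL; n = n->next) {
        if (n->prev != prev)
            return 0;               /* back link disagrees with the walk */
        if (++seen > max_nodes)
            return 0;               /* cycle, or more nodes than expected */
        prev = n;
    }
    return prev == last;            /* tail pointer must match the walk */
}
```

A call such as assert(list_is_consistent(first, last, expected_count)) placed after each insert or unlink, while the lock is still held, would fail close to the point where the list is first corrupted rather than at a later dereference.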
Created attachment 70977: first attempt at a fix
Bug confirmed on a 2 processor SMP system. The patch in comment #15 seems to fix it.
After another review, I believe that this is indeed the right fix. Committed to CVS and closing as FIXED. I would appreciate it if everyone with access to a multiprocessor machine gave this some testing, preferably with the tile-cache-size set to a very low value.

2006-08-16  Sven Neumann  <sven@gimp.org>

	* app/base/tile-cache.c: actually enable tile cache locking and
	added a missing lock in tile_idle_preswap(). Should fix bug #346923.
Fix works fine on a dual 2.8GHz Intel Xeon RHEL 4 AS machine.