After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 311816 - Collections of [VH]LOOKUP on a single table are slow
Collections of [VH]LOOKUP on a single table are slow
Status: RESOLVED DUPLICATE of bug 525875
Product: Gnumeric
Classification: Applications
Component: import/export MS Excel (tm)
git master
Other All
: High normal
: ---
Assigned To: Jody Goldberg
Jody Goldberg
Depends on:
Blocks:
 
 
Reported: 2005-07-28 03:14 UTC by Jeff Blank
Modified: 2008-04-11 21:54 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Jeff Blank 2005-07-28 03:14:21 UTC
When opening an MS-Excel .xls file needing formula recalculation, Gnumeric takes
an extremely long time to recalculate. Jody Goldberg <jody@gnome.org> writes:
-----
Ahh, I see the problem.  We're using to the more expensive case
insensitive comparison immediately which requires alot of string
conversions to handle the collation.  I'll look into checking
straight equality first.
-----

Other information:
A rather large file that exposes the poor performance can be found at
http://wfltest.mr-paradox.net/players.zip . The zipped spreadsheet was created
with the perl module Spreadsheet::WriteExcel v2.14 and contains one formula per
row per worksheet (all in the "WFL" column on each sheet). Due to the way
Spreadsheet::WriteExcel handles formulas, each formula cell requires
recalculation upon opening by a spreadsheet reader (Excel, Gnumeric, etc.).
Comment 1 Jody Goldberg 2005-07-28 14:29:26 UTC
Running a value_compare (case_insensitive = TRUE) on large data sets is very
expensive.  We're doing a pair of casefolds and a collate.
Comment 2 Morten Welinder 2005-07-28 16:56:32 UTC
The collate alone is 76% of total time:

 100.00%   main
  99.07%   wb_view_new_from_uri
  99.07%   wb_view_new_from_input
  97.43%   workbook_recalc
  97.43%   dependent_eval
  97.43%   cell_eval_content
  97.43%   gnm_expr_eval
  97.43%   function_call_with_list
  97.42%   gnumeric_vlookup
  97.42%   find_index_linear
  96.44%   value_compare
  72.62%   g_utf8_collate
  40.08%   g_convert
  30.32%   open_converter
  30.15%   g_utf8_normalize
  26.83%   _g_gnulib_sprintf
  26.77%   _g_gnulib_vsprintf
  25.63%   _g_gnulib_vasnprintf
  25.60%   _g_utf8_normalize_wc
  22.81%   g_utf8_casefold
  15.20%   g_realloc
  15.10%   _g_gnulib_printf_parse
  14.36%   realloc
  12.08%   _malloc_unlocked
  11.86%   g_string_append_unichar
  10.45%   g_string_insert_unichar
   9.98%   g_malloc
   8.99%   malloc
   7.65%   g_string_insert_len
   6.63%   cleanfree
   6.53%   combine
   6.40%   g_convert_with_iconv
   6.28%   g_free
   5.29%   g_string_maybe_expand
   4.90%   g_unicode_canonical_ordering
   4.70%   memcpy
   4.20%   free
   3.99%   sprintf
   3.92%   g_iconv
   3.89%   g_ucs4_to_utf8
   3.82%   libiconv
   3.68%   _doprnt
   3.67%   _free_unlocked
   3.53%   unicode_loop_convert
   3.38%   g_unichar_tolower
   3.32%   g_string_new
   3.25%   g_string_sized_new
   3.24%   g_utf8_get_char
   3.23%   realfree
   3.15%   close_converter[libglib-2.0.so.0/103]
   2.92%   g_hash_table_lookup
   2.88%   mutex_unlock
   2.53%   t_delete
   2.16%   g_unichar_to_utf8
   1.94%   find_decomposition
   1.68%   mutex_lock
   1.64%   go_file_opener_open
   1.64%   go_plugin_file_opener_open
   1.64%   go_plugin_loader_module_func_file_open
   1.64%   excel_file_open
   1.64%   excel_read_workbook
   1.64%   excel_read_BOF
   1.64%   excel_read_sheet
   1.58%   t_splay
   1.55%   strlen
   1.28%   g_string_free
   1.13%   g_hash_table_remove
   1.10%   g_hash_table_insert
Comment 3 Morten Welinder 2005-07-28 17:48:14 UTC
See also bug 172406.
Comment 4 Jeff Blank 2005-07-31 20:00:53 UTC
In case it becomes necessary for anyone to download this spreadsheet from this 
point forward, it is now available only at
http://www.wildfootball.org/players.zip . The 'wfltest' hostname in the initial 
problem report has ceased to exist.

And I have one question, as I'm not much of a source diver: would this problem 
also cause poor performance when updating a cell in a worksheet referenced by 
vlookup formulas, or is that a separate issue? Gnumeric goes out to lunch for 
quite some time, similar to recalculation upon opening, if I change seemingly 
any cell value in the 2004WFL_Players worksheet of the above workbook.
Comment 5 Morten Welinder 2005-07-31 22:41:59 UTC
Same problem.

The problem is that we are going to re-evaluate all cells that depend on the
cell you change.  That means that all the lookup stuff gets re-evaluated,
just as it does on load for this sheet.  The killer is that we are not
noticing and using that a number of lookups on the same table is being
done.
Comment 6 Morten Welinder 2008-04-11 21:54:46 UTC
Accidentally fixed.

Thanks for taking the time to report this bug.
This particular bug has already been reported into our bug tracking system, but we are happy to tell you that the problem has already been fixed. It should be solved in the next software version. You may want to check for a software upgrade.


*** This bug has been marked as a duplicate of 525875 ***