After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 136185 - FileChooser performance...
FileChooser performance...
Status: RESOLVED FIXED
Product: gtk+
Classification: Platform
Component: Widget: GtkFileChooser
unspecified
Other All
: Normal normal
: ---
Assigned To: gtk-bugs
gtk-bugs
Depends on:
Blocks:
 
 
Reported: 2004-03-04 15:10 UTC by Morten Welinder
Modified: 2011-02-04 16:16 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Quantify log (18.65 KB, text/plain)
2004-03-04 15:11 UTC, Morten Welinder
  Details
Patch 1 (2.43 KB, patch)
2004-03-08 17:36 UTC, Morten Welinder
none Details | Review
Patch 1a -- variant that moves path checking up from the pathbar (5.25 KB, patch)
2004-03-08 18:12 UTC, Morten Welinder
none Details | Review
Patch 2 -- FYI only -- read and cache whole directories (9.72 KB, patch)
2004-03-11 02:54 UTC, Morten Welinder
none Details | Review
Patch 3 -- another patch to try (13.28 KB, patch)
2004-03-11 16:32 UTC, Morten Welinder
none Details | Review
Patch 4 -- fully functional patch (15.70 KB, patch)
2004-03-11 17:11 UTC, Morten Welinder
none Details | Review
Patch 5 -- caching mime_type and icon_type too (20.66 KB, patch)
2004-03-11 20:26 UTC, Morten Welinder
none Details | Review
Patch 5a -- updated to resolve conflicts with jrb's recent check-in (21.08 KB, patch)
2004-03-11 21:54 UTC, Morten Welinder
none Details | Review
Path 6 -- Updated and merged patch (19.66 KB, patch)
2004-03-15 01:24 UTC, Morten Welinder
none Details | Review

Description Morten Welinder 2004-03-04 15:10:29 UTC
Bringing up the file chooser for gnumeric in its src directory takes far
too long, something like 1s wall clock.  Some info:

1. That directory is under NFS.
2. Four files match the filter.
3. There are seven subdirectories.
4. There are ~420 files in total.

I am going to attach Quantify logs for the specific time period from
clicking "Open" and until the dialog is drawn.

Some points:

a. There are 1927 calls to stat(), allmost five calls per file!
b. 666k calls to strlen seems excessive.
c. Letext is a gcc (2.95) bug.
Comment 1 Morten Welinder 2004-03-04 15:11:08 UTC
Created attachment 25163 [details]
Quantify log
Comment 2 Federico Mena Quintero 2004-03-04 18:00:48 UTC
Not sure we can do any real performance work for 2.4.0, but thanks for
the log.  It would be great if you could do some further investigation
of where all the stat() calls are coming from.

It is very likely that GtkFileSystemUnix needs to do more caching, as
right now it stat()s almost every time you call
gtk_file_folder_unix_get_info().  In contrast, GtkFileSystemGnomeVFS
does read the entire folder when it is requested, asynchronously, and
keeps the info structures around.  It has no problems with files
changing underneath as it uses FAM, so we get notification about such
changes and deal with them properly.

Reducing the number of stat() calls should be a priority.  I don't
think strcmp() is very important to look at right now.

The bare framework to do caching in GtkFileSystemUnix is in place, as
it does keep the open folders alive.  It does this to aggregate
requested info types in ::get_folder().  So, implementing a scheme to
stat things just once should not be horribly hard.  It should be
something along these lines:

1. Create an internal FileInfo structure that holds a struct stat
buffer and flags to indicate which info types have been read for that
file.  Add an array of FileInfo structures to GtkFileFolderUnix.

2. get_folder() - No changes should be needed, I think; it already
aggregates new info types to the folder's mask.

3. folder_unix_list_children() - This is essentially a readdir(). 
Store the list in the folder's array of FileInfo structures -- just
filenames, with the "available info" flags turned off.
  * Do we need to opendir() and readdir() every time, so that we can
pick up changes to the file sytsem?  If we do that, we should update
the array of files in such a way that we can emit the appropriate
files-added, files-removed signals.

4. folder_unix_get_info() - Return the available info from the
FileInfo array if available.  If the requested file is not in the
array, stat() it, and emit files-added if appropriate.  If the
requested file is in the array but its available-info flags don't
match the folder's potentially more current requested-info flags,
re-read the file info and update the flags.

5. render_icon() - we have a simple cache for icons; do we need to
move some of this into the FileInfo structures?

In summary, GtkFileSystemGnomeVFS should be a good reference for how
to structure all of this.  The only difference is that there we can
get notifications from the file system, and in GtkFileSystemUnix we
have to assume that the underlying file system will not notify us
about anything.
Comment 3 Morten Welinder 2004-03-04 18:15:35 UTC
If you start in /home/foo/bar, it will stat all files in
/home/foo/bar [ok], /home/foo, /home, and / [3 * not ok].

NFS based sites can have a _huge_ /home directory and NFS is not a
sprinter to begin with --> no good.

AFS based sites will be totally killed by this.  You just don't stat
all the directories in /afs.  Given downed sites that will take ages
(probably days).
Comment 4 Federico Mena Quintero 2004-03-04 18:27:44 UTC
Even with all the caching stuff in place, we should not need to stat()
all the way up to the root.  What is doing that in the code?

[If it is the GtkFileSystemModel used for the folder tree, ignore it;
that's going to be changed by a plain list very soon.  In any case,
you may want to see what's going on in GtkFileSystemModel that is
causing everything to be stat()ed.]
Comment 5 Morten Welinder 2004-03-05 15:14:03 UTC
(gdb) cond 6 (filename[0] == '/' && filename[1] == 'h' && filename[2]
== 'o' && filename[3] == 'm' && filename[4] == 'e' && filename[5] ==
'/' && filename[6] == 'w' && filename[7] == 'e' && filename[8] == 'l'
&& filename[9] == 'i' && filename[10] == 'n' && filename[11] == 'd' &&
filename[12] == 'e' && filename[13] == 'r' && filename[14] == '/' &&
filename[15] == 'R' && filename[16] == 'M' && filename[17] == 'A' &&
filename[18] == 'I' && filename[19] == 'L' && filename[20] == 0)
(gdb) c
Continuing.

Breakpoint 6, filename_get_info (filename=0x6a4ca8
"/home/welinder/RMAIL", types=7, error=0x0) at gtkfilesystemunix.c:1388
1388      info = gtk_file_info_new ();
(gdb) where
  • #0 filename_get_info
    at gtkfilesystemunix.c line 1388
  • #1 gtk_file_folder_unix_get_info
    at gtkfilesystemunix.c line 1307
  • #2 gtk_file_folder_get_info
    at gtkfilesystem.c line 849
  • #3 file_model_node_get_info
    at gtkfilesystemmodel.c line 1209
  • #4 file_model_node_is_visible
    at gtkfilesystemmodel.c line 1235
  • #5 file_model_node_get_children
    at gtkfilesystemmodel.c line 1418
  • #6 find_child_node
    at gtkfilesystemmodel.c line 958
  • #7 find_and_ref_path
    at gtkfilesystemmodel.c line 998
  • #8 find_and_ref_path
    at gtkfilesystemmodel.c line 992
  • #9 find_and_ref_path
    at gtkfilesystemmodel.c line 992
  • #10 find_and_ref_path
    at gtkfilesystemmodel.c line 992
  • #11 _gtk_file_system_model_path_do
    at gtkfilesystemmodel.c line 1083
  • #12 gtk_file_chooser_default_set_current_folder
    at gtkfilechooserdefault.c line 2953
  • #13 _gtk_file_chooser_set_current_folder_path
    at gtkfilechooser.c line 846
  • #14 gtk_file_chooser_set_current_folder
    at gtkfilechooser.c line 519
  • #15 gtk_file_chooser_widget_constructor
    at gtkfilechooserwidget.c line 160

(gdb) p (char*)0x6a2430
$2 = 0x6a2430 "/home/welinder/private"
(gdb) p (char*)0x6a3db0
$3 = 0x6a3db0 "/home/welinder/private/gnome"
(gdb) p (char*)0x6a4240
$4 = 0x6a4240 "/home/welinder/private/gnome/gnumeric"
(gdb) p (char*)0x6a1ba0
$5 = 0x6a1ba0 "/home/welinder/private/gnome/gnumeric/src"

...so it's find_and_ref_path that walks up the tree.
Comment 6 Morten Welinder 2004-03-08 17:36:01 UTC
Created attachment 25345 [details] [review]
Patch 1
Comment 7 Morten Welinder 2004-03-08 17:37:25 UTC
Patch 1 adds a "name_only" argument to get_file_info and makes the
latter only get file information when and as needed.
Comment 8 Morten Welinder 2004-03-08 18:12:39 UTC
Created attachment 25350 [details] [review]
Patch 1a -- variant that moves path checking up from the pathbar
Comment 9 Federico Mena Quintero 2004-03-08 22:54:04 UTC
Patch 1 is to be ignored.  I already committed Patch 1a.
Comment 10 Morten Welinder 2004-03-11 02:54:05 UTC
Created attachment 25505 [details] [review]
Patch 2 -- FYI only -- read and cache whole directories
Comment 11 Morten Welinder 2004-03-11 16:32:03 UTC
Created attachment 25524 [details] [review]
Patch 3 -- another patch to try
Comment 12 Morten Welinder 2004-03-11 17:11:22 UTC
Created attachment 25527 [details] [review]
Patch 4 -- fully functional patch
Comment 13 Morten Welinder 2004-03-11 17:16:09 UTC
Apart from the g_print statements, I think that's an official patch.
Comment 14 Morten Welinder 2004-03-11 18:13:42 UTC
Further steps needed (comments welcome):

1. Persist mime type in the same way stat-info is being done in
   gtkfilesystemunix.c

2. Make get_icon_type use the stat result we already have.  The
   easiest way probably is to persist it at the time we stat.
Comment 15 Morten Welinder 2004-03-11 20:26:17 UTC
Created attachment 25532 [details] [review]
Patch 5 -- caching mime_type and icon_type too
Comment 16 Morten Welinder 2004-03-11 20:27:43 UTC
Those further steps implemented.
Comment 17 Morten Welinder 2004-03-11 21:54:17 UTC
Created attachment 25544 [details] [review]
Patch 5a -- updated to resolve conflicts with jrb's recent check-in
Comment 18 Morten Welinder 2004-03-15 01:24:10 UTC
Created attachment 25655 [details] [review]
Path 6 -- Updated and merged patch
Comment 19 Federico Mena Quintero 2004-03-15 02:09:47 UTC
Committed to CVS.  I'm going to close the bug and open new ones for
the missing notification about changes.