GNOME Bugzilla – Bug 535223
gbookmark file inefficiency ...
Last modified: 2008-07-21 13:05:29 UTC
Just stracing glib startup and I see: line call 3213 time(NULL) = 1211965787 ... 4610 time(NULL) = 1211965787 ie. we did over a thousand time calls (back to back) on startup ;-) It seems it comes from here: (gdb) bt
+ Trace 198826
Of course - notice that 'time' returns a granularity of seconds anyway ;-) How many seconds do we think it is going to take to parse the recent files ? and could we not simply call 'time' once ahead of time (sic) ;->
in my glib - all callers of bookmark_app_info_new then set the stamp (or set it to time (NULL)) - or free the structure. So the fix is: - retval->stamp = time (NULL); + retval->stamp = 0; ;-) of course - the saving in time is less than a second per app that uses bookmarks, but ... in some situations: eg. virtualised system calls like this are quite expensive.
Interestingly the same bug would do well to be fixed in gtkrecentmanager.c at the same time ;-) [ cut & paste ? ].
Yet more interestingly I see several thousands of: 8765 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3661, ...}) = 0 ... 10513 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3661, ...}) = 0 all back to back; a cursory breakpoint in ___xstat64 shows that: (gdb) bt
+ Trace 198827
It's not immediately clear what to do about that I guess ;-) but well - it's clearly not that ideal to be doing thousands of back-to-back stats of the same file micro-seconds apart ;-)
(In reply to comment #2) > So the fix is: > > - retval->stamp = time (NULL); > + retval->stamp = 0; > > ;-) of course - the saving in time is less than a second per app that uses > bookmarks, but ... in some situations: eg. virtualised system calls like this > are quite expensive. will apply in gbookmarkfile.c, and in gtkrecentmanager.c as well. thanks. (In reply to comment #3) > Yet more interestingly I see several thousands of: > > 8765 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3661, ...}) = 0 ugh. > It's not immediately clear what to do about that I guess ;-) but well - it's > clearly not that ideal to be doing thousands of back-to-back stats of the same > file micro-seconds apart ;-) the stat() of localtime is due to the conversion of the timestamps from string form to seconds from the epoch; this is really lame, glibc should be caching that stuff. other than that, we surely care about the timezone used to save a file inside the recently used files; maybe less so inside a bookmarks file, but special casing one is going to be a pain.(In reply to comment #1)
Weelll ... ;-) reading the gbookmark.c code - I guess we could simply store the time internally in UTC - and thus avoid all the conversions on read, and all the conversions on write too: leaving just a few conversions for getters/setters - which (I hope) there are fewer of. Failing that, we could store & invalidate a parallel 'gmtime' variant of each of the 3 fields at some cost. Of course - a solution to this: $ grep '<bookmark:application ' ~/.recently-used.xbel | wc -l 711 is also interesting - do we really need an ever-growing recently-used file ? :-)
(In reply to comment #5) > Weelll ... ;-) reading the gbookmark.c code - I guess we could simply store the > time internally in UTC - and thus avoid all the conversions on read, and all > the conversions on write too: leaving just a few conversions for > getters/setters - which (I hope) there are fewer of. the time is stored internally in UTC - because that's what it's written to and read from the storage file. but maybe I'm misunderstanding. > $ grep '<bookmark:application ' ~/.recently-used.xbel | wc -l > 711 > > is also interesting - do we really need an ever-growing recently-used file ? > :-) it's not "ever-growing" anymore: the list is clamped at 30 days and it can be set to fewer (or more) days using GtkSettings:gtk-recent-files-max-age.
Oh ? I assumed that it was converted to localtime on read & back again (via gmtime) on write for some reason: but it's not you're right :-) Having said that - if we are calling gmtime ourselves anyway ... why don't we just write out the time information ourselves with a custom printf ? it's not as if we need much complexity to get from: struct tm { int tm_sec; /* seconds */ int tm_min; /* minutes */ int tm_hour; /* hours */ int tm_mday; /* day of the month */ int tm_mon; /* month */ int tm_year; /* year */ ... to something like: ISO_8601_FORMAT "%Y-%m-%dT%H:%M:%SZ" 2007-11-09T15:30:47Z surely something like a g_strdup_printf ("%4d-%.2d-%.2dT%.2d:%.2d:%.2dZ") with the relevant fields would do just as well - and also produce a true ISO date - AFAICS from reading glibc: strftime seems to do some quite funky things around (eg.) 'eras' [ could an Arabic based date pop out of it ? ]. glibc/time/strftime_l.c (my_strftime) seems to have: /* POSIX.1 requires that local time zone information is used as though strftime called tzset. */ # if HAVE_TZSET tzset (); # endif meaning glibc prolly can't do much better for us here.
Emmanuele, whats the status of this ?
I have some changes in the GBookmarkFile format I'm going to commit - plus some changes in the spec that I agreed with the KDE developers, as they are supporting the bookmark file format in their file selector. I'll attach the patch to this bug, which contains the time() change Michael suggested, and a change in the format to reduce the number of strdup_printf() calls when writing to the file.
Created attachment 114848 [details] [review] [PATCH] Fixed for GBookmarkFile the patch contains multiple fixes for GBookmarkFile: - remove the Unix timestamp usage inside the metadata section; this was the last usage of a g_strdup_printf() inside the serialization code - see bug 518160 and bug 523877 - rework g_time_val_to_iso8601() to avoid the strftime() call and use the struct tm fields, as the format is fixed and we know how to serialize it. - avoid calling time() multiple times.
looks lovely to me :-) thakns Emmanuele.
committed to trunk 2008-07-21 Emmanuele Bassi <ebassi@gnome.org> * glib/gbookmarkfile.c: (bookmark_app_info_new): Do not set the timestamp value using time(), as it will be overwritten anyway. (#535223, Michael Meeks) (parse_application_element), (bookmark_app_info_dump): Support the "modified" attribute, which takes an ISO-formatted string instead of a Unix time stamp, to keep the number of g_strdup_printf() calls to a minimum. * glib/gtimer.c: (g_time_val_to_iso8601): Do not use strftime(): we know the format and contents of the ISO 8601 date format we use. * tests/bookmarks/valid-03.xbel: Add a test file for the modified attribute.