GNOME Bugzilla – Bug 309510
factor out eel_filename_strip_extensions
Last modified: 2008-09-01 10:25:39 UTC
There are many places in nautilus where there is custom code to strip the extension from a filename and most of them get it wrong (ignore the .tar.gz case and leak the extension itself by setting the extension start to \0 in the original string). The attached patches factor out a eel_filename_strip_extensions in eel-vfs-extensions (not sure if there is a better file) and use it in nautilus instead of the custom code. There may be some other places in nautilus that I may have missed...
Created attachment 48664 [details] [review] eel patch
Created attachment 48665 [details] [review] nautilus patch
I think a simple strchr should be enough. this.is.ok.tar.gz isn't really common. If I remember correctly, we already came to the conclusion that the first dot almost always separates the basename from the extension. I think moving strip_extension to eel should be enough.
actually I don't agree with just looking for the first ".", sure "this.is.ok.tar.gz" may not be that common, but I don't see what's wrong in using a more accurate euristic espacially when the code is fairly simple.
Paolo: I just don't see why we should use this algorithm when we almost never encounter such patterns. Also, it produces bad results for filenames like foo.xbel.gz by randomly assuming that each non-dotted sequence of the extension is max. 3 chars long.
every euristic to detect the extension will fail in some cases, what I really don't like of using the first "." is that it gives false positives: if a user sees just the .gz of foo.xbel.gz highlighted he may understand while eaying half of his filename is worse. There are some programs and devices which output files containing dots, counting on the fact that on windows the simple euristic is to search the *last* ".". Also, just looking for the first "." fails on dot files (~/.foo). Now, I'm not claiming that my simple function is the best thing since sliced bread, the whole point of my patch is to abstract the "find the extension" concept in a utility function instead of hardcoding it everywhere, so that if needed we can change the function in just on place and so that the behavior is always the same instead of looking for the first "." in some places and for the last "." in other. Note also that eel-vfs-extensions already contains (just above my function) which is even more complicated, special casing tar.gz etc.
> There are some programs and devices which output files containing dots, counting on the fact that on windows the simple euristic is to search the *last* ".". Maybe you could get concrete here? Windows doesn't usually have nested archives (everything is .zip, .doc, etc.). We haven't received any bug report either about the current renaming behavior.
I seem to recall that some cams name pics with filenames including dots and I'm pretty sure there are other examples, but no, I' can't get more specific or I would have done it in the first place. I really don't see why you have all this resistance to factor the code out in a utility function, it's a clear win. Even if the function would simply do strchr, it would be a win from the code cleanliness, readability and maintainability point of view instead of having strchr everywhere. Especially since as said above plain strchr gets wrong ~/.foo and may give false positives for some filenames which is bad no matter how rare they are. To me it sounds obvious. Whatever. I'm not going to argue this anymore.
Yes, you are totally right that factoring out this function is a good idea. I was just critisizing the proposed algorithm.
This is still valid for SVN trunk. Paolo, could you please update your patches if necessary and send them for review to nautilus-list@gnome.org? Thanks!
I did not really want to set a GNOME target on this.
Created attachment 111950 [details] [review] update nautilus patch Updated patch applies cleanly to head. No changes from comment 2.
Created attachment 111951 [details] [review] new eel patch, using original stripping algorithm Stripping algorithm is directly factored out from nautilus with no changes.
The patches look fine to me. Christian?
Yeah, they are fine. Please commit them to trunk Jared. It may also be a good idea to detect nested file types (i.e. .tar.gz) like eel_filename_get_rename_region().
Thanks for the review. I don't currently have a working copy of nautilus and eel SVN, so it would probably be better if somebody else committed these. In regards to detecting nested file types, it looks like eel_filename_strip_extension could instead be implemented like so: char * eel_filename_strip_extension (const char* string_to_strip) { char *result_str; int start, end; if (string_to_strip == NULL) { return NULL; } result_str = g_strdup(string_to_strip); eel_filename_get_rename_region(result_str, &start, &end); result_str[end] = '\0'; return result_str; } I don't have the time to test & commit any of this at the moment. If you'd like me to do that myself, I'll try to find time for it sometime later this week.
> In regards to detecting nested file types, it looks like eel_filename_strip_extension could instead be implemented (on top of eel_filename_get_rename_region). I don't really mind, if you want to reuse code maybe eel_filename_get_rename_region() should be written on top of eel_filename_strip_extension() rather than vice versa. I'll cook a patch and commit it.
Thanks for your efforts, I committed the Nautilus patch to trunk and rewrote the eel patch: http://svn.gnome.org/viewvc/nautilus?view=revision&revision=14555 http://svn.gnome.org/viewvc/eel?view=revision&revision=2156 Closing.