GNOME Bugzilla – Bug 755795
2.46 considers empty files as octet-stream rather than text (leads to gedit not opening those)
Last modified: 2016-08-19 10:15:07 UTC
That has been reported on https://bugs.launchpad.net/ubuntu/+source/xdg-utils/+bug/1497170 and is likely due to https://git.gnome.org/browse/glib/commit/?id=b6fc1df022a0326e7c36122b1416061bf796c98f "GLocalFileInfo: don't content-sniff zero-length files" in those case would it make sense to return "text/plain" so editors can open those?
Created attachment 312365 [details] [review] GLocalFile: return text/plain for empty files Previously, GLib returned text/plain for empty files. This is important because people may want to open empty (eg: just-created) text files with the text editor. An unintended side-effect of b6fc1df022a0326e7c36122b1416061bf796c98f caused GLib to start returning application/octet-stream instead of text/plain for these files. This commit is essentially a revert of that commit, with a different solution: we move the special-case up a bit in the function and hard-code it to text/plain. This change does not exactly maintain the old behaviour: previously, a "fast" lookup would have returned application/octet-stream on an empty file and now it will return text/plain. I consider this to be an improvement (since we're returning better data) and don't expect it to cause problems.
Review of attachment 312365 [details] [review]: LGTM, just one comment comment. ::: gio/glocalfileinfo.c @@ +1246,3 @@ + * that appear normal but are not (eg: files in /proc and /sys) + */ + return g_content_type_from_mime_type ("text/plain"); I think it's worth maintaining a link to this bug report in the comment here, and/or: /* We have historically returned text/plain for 0 length files */
Attachment 312365 [details] pushed as 202a9c3 - GLocalFile: return text/plain for empty files
(and backported to 2.46, since this is a compatibility fix)
It seems that historically we have trusted the extension and only returned text/plain for 0-length files with no extension. With glib 2.44 -rw-rw-r-- 1 laney laney 0 Sep 29 17:23 /home/laney/temp/a.ini standard::content-type: application/x-wine-extension-ini standard::fast-content-type: application/x-wine-extension-ini and with master standard::content-type: text/plain standard::fast-content-type: text/plain Maybe move it after g_content_type_guess and do this if result_uncertain? That seems to get the old behaviour back for me at least.
Reopening based on comment #5
I'm not convinced the old behavior is worth preserving, really. Or is it ?
I'd flip that around - what's the value of changing behavior?
Created attachment 312415 [details] [review] Only fallback to text/plain for 0-length files if we don't have an idea what it is For example, 0-length foo.exe has historically been application/x-ms-dos-executable
Created attachment 312416 [details] [review] Only fallback to text/plain for 0-length files if we don't have an idea what it is For example, 0-length foo.exe has historically been application/x-ms-dos-executable
I don't know. An empty file is rarely a valid anything, other than a text file. For instance, GKeyFile will complain if you feed it an empty .ini file.
and ms dos would complain if you tried to make it run an 0-length .exe file
I think it would be better to do behaviour changes on purpose instead of accidentally. Don't know what might rely on this previous behaviour but I wouldn't at all be surprised if something did, and we shouldn't randomly change it without being sure it's the right thing to do.
Would someone decide what to do please? If we are going to change the behaviour back then we should do it before .1 if possible.
Get your colleague to comment - afair, desrt considered this change an improvement.
kind of annoying that "Create New Text Document" won't work if I happen to choose a bad name....
if you can find some actual real case this is broken by this I say let's go with your change. Otherwise, I'm still fairly convinced that this is an improvement.
No I don't have one right now. I'm just saying that it won't be surprising to me to find something that expects the old behaviour. If you don't mind this chance then just close the bug and then when something comes up you get to tell people to fix their code.
I think we're convinced that the behaviour, as it stands, is the best it can be.
The correct mime-type for empty files is application/x-zerosize GEdit can then assign itself that.
This change has broken our code, unfortunately. Previously, for a file called "empty.mp3", g_file_query_info(..., G_FILE_ATTRIBUTE_STANDARD_FAST_CONTENT_TYPE, ...) returned "audio/mpeg". Now, for the same file, we get "text/plain". Would it be better to pay attention to the file extension first and use that as the initial guess, and then do the zero-size check after that? This would be more compatible with the old behavior.
Why is there an empty file called "empty.mp3"? This is, objectively, not an mp3 file...
Also: can you please describe the problem that this causes in terms of an actual piece of software that used to work and no longer works correctly?
It broke one of our unit tests. The tests make sure that we deal with corrupt MP3 files correctly, so one test uses an empty file called "empty.mp3" to check this. Because of the changed content type, the code took an unexpected path that triggered the unit test failure. In terms of fixing it, it's not such a big deal. The code now takes both old and new behavior into account. But it was a needless break. The real question here is how to best deal with empty files. What is the content type of empty files that are called empty.txt, empty.tiff, empty.mp3, etc? I think it's reasonable to just take the file extension and base the guess on that. Looking whether the file is empty before doing anything else and concluding that it is a text file is reasonable too, IMO (although it seems a little strange to say that a file called x.mp3 is a text file just because it's empty). It's a borderline case, I understand. What broke our code is that the content type *changed* from the previous version. If it had been "text/plain" all along, we could have handled that too. One final thought: what should the content type of a file called "myprog.cpp" be if that file is empty? (An empty file is a legal C++ source file that compiles just fine.) I guess one could come up with other semi-plausible cases along the same lines.
application/x-zerosize! It should be application/x-zerosize (unless you only have the filename obviously, no way to know it's empty by looking at just the name).
application/x-zerosize works for me. As long as what is returned for empty files is well-defined and doesn't change without warning :-) (As an aside, our code uses G_FILE_ATTRIBUTE_STANDARD_FAST_CONTENT_TYPE first and, if it comes up with a file type that we are interested in, we believe the answer. Otherwise, if it says "application/octet-stream", we do a deep inspection of the file to find out what's inside.) Maybe doing a stat of the file and checking for zero-size should be done only if the file name doesn't produce a guess?
As an aside, when I call with G_FILE_ATTRIBUTE_STANDARD_FAST_CONTENT_TYPE, it would be ultra-nice to state in the doc what will be returned if no guess can be made. Right now, it just says: "A key in the "standard" namespace for getting the fast content type. The fast content type isn't as reliable as the regular one, as it only uses the filename to guess it, but it is faster to calculate than the regular content type. Corresponding GFileAttributeType is G_FILE_ATTRIBUTE_TYPE_STRING."
(In reply to Bastien Nocera from comment #25) > application/x-zerosize! It should be application/x-zerosize (unless you only > have the filename obviously, no way to know it's empty by looking at just > the name). application/x-zerosize fails the most important use case - can I create an empty file and open it in my text editor
(In reply to Matthias Clasen from comment #28) > (In reply to Bastien Nocera from comment #25) > > application/x-zerosize! It should be application/x-zerosize (unless you only > > have the filename obviously, no way to know it's empty by looking at just > > the name). > > application/x-zerosize fails the most important use case - can I create an > empty file and open it in my text editor Add application/x-zerosize to your text editor's mime-types?
I'm sorry I broke your testsuite, Michi, and thanks for the feedback on the change. Given that you've already fixed the problem there, and that I continue to believe that the new behaviour is more logical, I'm going to close this again until someone comes with new information or better ideas. I don't consider application/x-zerosize to be reasonable since this would be an even larger deviation from existing behaviour and it wouldn't improve the situation with regards to text editors (today).
It wasn't a big deal to fix, so no harm done. On reflection, I agree with rejecting application/x-zerosize, given the backward compatibility concern. My one comment would be that the following would deal a little better with the borderline case: For empty files only: - report "text/plain" if the file does not have an extension - report the file type implied by the extension if the extension is recognized - report "text/plain" if the extension is not recognized I think this would be less surprising.
Users of Nautilus reported about this. I think saying that it's text/plain is wrong because seems people use templates like empty files with extensions, so opening gedit for test.odt goes against the expectations. I would say either do the mime type guess here https://git.gnome.org/browse/glib/tree/gio/glocalfileinfo.c#n1239 with g_content_type_guess so it reports the application appropriate for that extension, or return application/x-zerosize and let the clients like Nautilus figure out what to do.
(In reply to Carlos Soriano from comment #32) > Users of Nautilus reported about this. > > I think saying that it's text/plain is wrong because seems people use > templates like empty files with extensions, so opening gedit for test.odt > goes against the expectations. To be fair, the appropriate way to create a template file is to create an actual "blank" file with the application and save it into Templates, not use `touch somerandomefile.somerandomextension`. > I would say either do the mime type guess here > https://git.gnome.org/browse/glib/tree/gio/glocalfileinfo.c#n1239 with > g_content_type_guess so it reports the application appropriate for that > extension, or return application/x-zerosize and let the clients like > Nautilus figure out what to do. Both strategies have issues with backward compatibility.
(In reply to Emmanuele Bassi (:ebassi) from comment #33) > (In reply to Carlos Soriano from comment #32) > > Users of Nautilus reported about this. > > > > I think saying that it's text/plain is wrong because seems people use > > templates like empty files with extensions, so opening gedit for test.odt > > goes against the expectations. > > To be fair, the appropriate way to create a template file is to create an > actual "blank" file with the application and save it into Templates, not use > `touch somerandomefile.somerandomextension`. I completely agree, kinda makes me angry when it's not done in that way :) but we can try to still do our best. > > > I would say either do the mime type guess here > > https://git.gnome.org/browse/glib/tree/gio/glocalfileinfo.c#n1239 with > > g_content_type_guess so it reports the application appropriate for that > > extension, or return application/x-zerosize and let the clients like > > Nautilus figure out what to do. > > Both strategies have issues with backward compatibility. Right, although it looks more like a bug than anything else.