GNOME Bugzilla – Bug 590747
Composer autosave can easily lose data
Last modified: 2013-09-13 01:04:50 UTC
The code in composer/e-composer-autosave.c always reuses the same filename to save the current state of a drafted mail. It will: 1) open the file (or reuse the same file descriptor) 2) truncate the file 3) get camel to write the data 4) close the duplicated file descriptor used by camel If evolution crashes/gets killed at any point between 2 and 4 (and maybe even after, seeing as we don't ensure that the data is on-disk when saving is done), then your mail is completely lost, and you'll have a 0-length file where your afternoon's work used to be.
Confirming. Need to rewrite autosaving to use GIO asynchronously. Plenty of example code in the attachment rewrite. The shitty thing about Camel streams is they're synchronous-only, so you either have to double-buffer the contents in memory before writing to disk or use a thread. We have too many threads already.
Created attachment 139942 [details] [review] Prevent data loss when crashing while autosaving e-mails Prevent data loss when crashing half-way through autosaving e-mails by writing to a temporary file, then renaming it to the actual autosave file. Also refactors autosaving slightly so that the bulk of the work is done in an idle handler, so it doesn't block the interface as much.
Definitely a step in the right direction, though I'd prefer to use g_file_replace(), which handles all this temporary file business for you. See e_attachment_save_async() and its various callbacks -- especially attachment_save_got_output_stream() -- for example code.
I think you'll also need some fsync action, see Bug 575555 Thanks for working on that!
(In reply to comment #3) > Definitely a step in the right direction, though I'd prefer to use > g_file_replace(), which handles all this temporary file business for you. If I use that, I can't see any way to get the name of the temporary file, so it would be impossible to recover the backup autosave file if we crashed between deleting the old autosave file, and renaming the new one (deep in the innards of g_file_replace()). Unless g_file_replace() guarantees that the original file will always be present? I can't tell from the documentation.
(In reply to comment #5) > (In reply to comment #3) > > Definitely a step in the right direction, though I'd prefer to use > > g_file_replace(), which handles all this temporary file business for you. > > If I use that, I can't see any way to get the name of the temporary file, so it > would be impossible to recover the backup autosave file if we crashed between > deleting the old autosave file, and renaming the new one (deep in the innards > of g_file_replace()). > > Unless g_file_replace() guarantees that the original file will always be > present? I can't tell from the documentation. That's the whole point of g_file_replace, either you have the new data, or you have the old data, you get nothing in between.
Right. I don't think it explicitly deletes the old file. It overwrites the old file by renaming the temporary file, which is atomic on most filesystems. So like Bastien said, no change of getting left in an in-between state.
Created attachment 140026 [details] [review] Prevent data loss when crashing while autosaving e-mails (updated) Rewritten to use g_file_replace() and be even more asynchronous. This doesn't do anything with fsync(), since that's in GIO 2.20 by my reckoning.
Now we're talking. Patch is looking good. Have not tested it yet, but I spotted one problem while reading it: #define AUTOSAVE_SEED AUTOSAVE_PREFIX "-XXXXXX" ... path = g_build_filename (e_get_user_data_dir (), AUTOSAVE_SEED, NULL); This creates a temporary file -template- (with the XXXXXX in the name), which is not the actual name of the temporary file. g_mkstemp() converts the XXXXXX to some unique mix of characters, but problem is it never tells you the actual name it's chosen. It just hands you back a file descriptor. A few lines down, after closing the temporary file descriptor, you create a GFile for the template, not the actual temporary file: /* Create the GFile */ state->file = g_file_new_for_path (path); At first glance, g_file_open_tmp() looks like a possible alternative to g_mkstemp(). It hands you back both a file descriptor and the temporary file name it has chosen. Problem is it doesn't allow you to specify a directory for the temporary file. It forces you to use the global /tmp (or whatever equivalent your environment has configured), which is not what we want. Sadly, until GIO grows better support for temporary files, you'll have to use our homegrown and unsafe e_mktemp() (e-util/e-mktemp.h), which just takes a temporary file template and hands you back a unique temporary file name. Reason it's unsafe in the general case is because in the time between picking a file name and actually creating the file, that same file name may be created by something else. There's security risks with that if the data is sensitive, plus a risk of data loss. But I think in this particular case the risk is miniscule. I too had to use it in the attachment rewrite.
Created attachment 140048 [details] [review] Prevent data loss when crashing while autosaving e-mails (updated again) Updated to use e_mktemp() as suggested. I haven't tested this update to the patch, but it should still work.
It turns out the advice I gave you was slightly faulty, as I forgot that e_mktemp() is hard-coded to use ~/.evolution/cache/tmp, which gets cleaned out periodically. We certainly don't want to lose these files to a directory purge after all the trouble we're going through, so I adjusted the patch to use plain old mktemp() and write the autosave file to ~/.evolution as it currently does. Same risks still apply. Otherwise, looks awesome. Exactly what I had in mind. Thanks! http://git.gnome.org/cgit/evolution/commit/?id=22686195e6e1ca774b8fc9e0c34aab241b40d218
*** Bug 248353 has been marked as a duplicate of this bug. ***