Will

Reputation: 265

Overwriting a file without the risk of a corrupt file

My applications often need to save files to load again later. Having recently been unlucky with a crash, I want to write the save operation in such a way that I am guaranteed to end up with either the new data or the original data, but not a corrupted mess.

My first idea was to do something along the lines of (to save a file called example.dat):

  1. Come up with a unique file name for the target directory, e.g. example.dat.tmp
  2. Create that file and write my data to it.
  3. Delete the original file (example.dat)
  4. Rename ("Move") the temp file to where the original was (example.dat.tmp -> example.dat).

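A minimal sketch of those four steps, assuming a POSIX system (the filenames are the ones from the steps above; `save_atomically` is a name I made up, and error handling is reduced to a bool):

```cpp
// Sketch of the save steps above, using POSIX I/O so the data can be
// flushed with fsync() before the temp file replaces the original.
#include <cstdio>      // std::rename
#include <string>
#include <fcntl.h>     // open() -- assumes POSIX
#include <unistd.h>    // write(), fsync(), close(), unlink()

bool save_atomically(const std::string& path, const std::string& data)
{
    const std::string tmp = path + ".tmp";        // step 1: temp name

    int fd = ::open(tmp.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return false;

    // step 2: write the new contents to the temp file
    bool ok = ::write(fd, data.data(), data.size())
              == static_cast<ssize_t>(data.size());

    // force the data out of the OS cache before touching the original
    ok = ok && ::fsync(fd) == 0;
    ok = (::close(fd) == 0) && ok;
    if (!ok) {
        ::unlink(tmp.c_str());
        return false;
    }

    // steps 3 and 4: on POSIX, rename() replaces the target atomically,
    // so the separate delete is unnecessary (and removes a failure window)
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}
```

Note that on POSIX the rename alone atomically replaces the target, so step 3 can be skipped; on Windows you would need something like ReplaceFile or MoveFileEx instead.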
Then at load time the application can follow these rules:

  1. If only example.dat exists, load it.
  2. If only example.dat.tmp exists, step 3 completed but step 4 did not; the temp file holds a complete copy of the new data, so rename it and load it.
  3. If both exist, the rename never started, so the temp file may be incomplete; delete it and load example.dat.

However, having done a little research, I found that as well as OS caching (which I may be able to override with the file flush methods), some disk drives also cache writes internally and may even lie to the OS, saying they are done. So step 4 could complete while the data has not actually been written, and if the system goes down at that point I have lost my data...

I am not sure the disk problem is actually solvable by an application, but are the general rules above the correct thing to do? Should I keep an older recovery copy of the file for longer to be safe? What are the guidelines regarding such things (e.g. acceptable disk usage, whether the user should choose, where to put such files, etc.)?

Also, how should I avoid potential conflicts with the user and other programs over "example.dat.tmp"? I recall sometimes seeing "~example.dat" from other software; is that a better convention?

Upvotes: 6

Views: 2923

Answers (2)

James Kanze

Reputation: 154047

If a disk drive reports back to the OS that the data is physically on the disk when it is not, there's not much you can do about it. A lot of disks do cache a certain number of writes and report them done, but such disks should have a battery backup and finish the physical writes no matter what (and they won't lose data in case of a system crash, since they won't even see it).

For the rest, you say you've done some research, so you no doubt know that you can't use std::ofstream (nor FILE*) for this; you have to do the actual writes at the system level, and open the files with special attributes to ensure full synchronization. Otherwise, the operations can sit in the OS's buffers for a while. And as far as I know, there's no way of ensuring such synchronization for a rename. I'm not sure that it's necessary, though, if you always keep two versions: my usual convention in such cases is to write to a file "example.dat.new", then when I'm done writing, delete any file named "example.dat.bak", rename "example.dat" to "example.dat.bak", and finally rename "example.dat.new" to "example.dat". Given this, you should be able to figure out what did or did not happen, and find the correct file (interactively, if need be, or by inserting an initial line with a timestamp).

Upvotes: 3

Curt

Reputation: 5722

You should lock the actual data file while you write its substitute, if there's a chance that a different process could be going through the same protocol that you are describing.

You can use flock for the file lock.
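A sketch of that, assuming POSIX flock(2). Locking a separate ".lock" file (a convention I'm assuming here, not something from the answer) avoids the problem of the lock disappearing when the data file itself is renamed away:

```cpp
// Sketch of advisory locking around the replace protocol, using flock(2).
// Assumes POSIX. Both processes must use the same lock-file convention,
// since flock locks are advisory only.
#include <string>
#include <fcntl.h>     // open()
#include <sys/file.h>  // flock()
#include <unistd.h>    // close()

int lock_for_update(const std::string& path)
{
    // lock a companion ".lock" file rather than the data file, which is
    // about to be renamed/replaced out from under us
    int fd = ::open((path + ".lock").c_str(), O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1;
    if (::flock(fd, LOCK_EX) != 0) {  // blocks until the lock is free
        ::close(fd);
        return -1;
    }
    return fd;  // caller keeps fd open for the duration of the update
}

void unlock(int fd)
{
    ::flock(fd, LOCK_UN);
    ::close(fd);
}
```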

As for your temp file name, you could make your process ID part of it, for instance "example.dat.3124". No other simultaneously running process would generate the same name.
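For example (assuming POSIX getpid(); temp_name is just an illustrative helper):

```cpp
#include <string>
#include <unistd.h>  // getpid() -- assumes POSIX

// Build a temp name that is unique per running process, as suggested
// above, e.g. "example.dat.3124".
std::string temp_name(const std::string& path)
{
    return path + "." + std::to_string(::getpid());
}
```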

Upvotes: 0
