Alexey Starinsky

Reputation: 4295

How to protect a file with Win32 API from being corrupted if the power is reset?

In a C++ Win32 app I write a large file by appending blocks of about 64 KB each, using code like this:

    auto h = ::CreateFile(
        "uncommited.dat",
        FILE_APPEND_DATA,       // open for writing
        FILE_SHARE_READ,        // share for reading
        NULL,                   // default security
        CREATE_NEW,             // create new file only
        FILE_ATTRIBUTE_NORMAL,  // normal file
        NULL);                  // no attr. template

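    std::vector<char> block(64 * 1024);  // one 64 KB block (needs <vector>)
    for (int i = 0; i < 10000; ++i) {
        DWORD written = 0;  // receives the number of bytes actually written
        ::WriteFile(h, block.data(), static_cast<DWORD>(block.size()), &written, NULL);
    }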

As far as I can see, if the process is terminated unexpectedly, the blocks with numbers i >= N (for some N) are lost, but the blocks with numbers i < N are valid, and I can read them when the app restarts, because the blocks themselves are not corrupted.

But what happens if the power is reset? Is it true that the entire file can be corrupted, or even have zero length?

Is it a good idea to do

    FlushFileBuffers(h);
    MoveFile("uncommited.dat", "commited.dat");

assuming that MoveFile is some kind of atomic operation, and then, when the app restarts, open "commited.dat" as valid and delete "uncommited.dat" as corrupted? Or is there a better way?

Upvotes: 1

Views: 525

Answers (2)

Daniel Sęk

Reputation: 2769

For an append-only scenario you can split the data into blocks (constant or variable size). Each block should be accompanied by some form of checksum (SHA, MD5, CRC).

After a crash you can read each block sequentially and verify its checksum. The first damaged block and all blocks following it should be treated as lost (if needed, you can inspect them and try to recover them manually).

To append more data, truncate the file to the end of the last correct block.

You can also write two copies in parallel and, after a crash, select the one with more good blocks.
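The scheme above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive format: the record layout (4-byte length, 4-byte CRC-32, payload), the helper names `Crc32`, `AppendBlock` and `ValidPrefix`, and the use of an in-memory byte vector in place of the real file are all made up for the example. Zero-length blocks are rejected so that a zero-filled tail is not mistaken for a run of valid empty blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320), slow but dependency-free.
std::uint32_t Crc32(const std::uint8_t* data, std::size_t n) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < n; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Little-endian helpers for the 4-byte header fields.
void Put32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

std::uint32_t Get32(const std::uint8_t* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Appends one record: [length][crc32 of payload][payload].
// `file` stands in for the bytes that would be written to disk.
void AppendBlock(std::vector<std::uint8_t>& file,
                 const std::vector<std::uint8_t>& payload) {
    assert(!payload.empty() && payload.size() <= 0xFFFFFFFFu);
    Put32(file, static_cast<std::uint32_t>(payload.size()));
    Put32(file, Crc32(payload.data(), payload.size()));
    file.insert(file.end(), payload.begin(), payload.end());
}

// Returns the offset just past the last intact record. Everything after
// that offset is treated as lost; truncate there before appending again.
std::size_t ValidPrefix(const std::vector<std::uint8_t>& file) {
    std::size_t pos = 0;
    while (file.size() - pos >= 8) {
        const std::uint32_t len = Get32(&file[pos]);
        const std::uint32_t sum = Get32(&file[pos + 4]);
        if (len == 0 || len > file.size() - pos - 8) break;    // empty or truncated
        if (Crc32(file.data() + pos + 8, len) != sum) break;   // damaged payload
        pos += 8 + static_cast<std::size_t>(len);
    }
    return pos;
}
```

On restart you would read the file, call `ValidPrefix`, and truncate the file to the returned offset before appending new blocks. A stronger hash can be swapped in for `Crc32` if accidental collisions are a concern.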

Upvotes: 0

Jerry Coffin

Reputation: 490108

MoveFile can work all right in the right situation. It has a few problems, though. For example, you can't have an existing file by the new name.

If that might occur (that is, you are updating an existing file and want to protect it from corruption by making a copy, modifying the copy, and then replacing the old file with the new one), you probably want to use ReplaceFile rather than MoveFile.

With ReplaceFile, you write your data to uncommitted.dat (or whatever name you prefer). Then yes, you probably want to call FlushFileBuffers, and finally ReplaceFile to replace the old file with the new one. This makes use of NTFS journaling (which applies to file system metadata, not the contents of your files), assuring that only one of two possibilities can happen: either you have the old file (entirely intact) or else the new one (also entirely intact). If power dies in the middle of making a change, NTFS will use its journal to roll back the transaction.
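A minimal sketch of that sequence might look like the code below. The helper name `CommitFile` and its parameters are hypothetical, error handling is reduced to a boolean, and the whole payload is written in a single call for brevity. Because ReplaceFile requires the file being replaced to exist already, the sketch assumes a fallback to MoveFileEx with MOVEFILE_WRITE_THROUGH for the very first commit; that fallback is my assumption, not something the sequence above prescribes.

```cpp
#ifdef _WIN32  // Windows-only sketch; compiles to nothing elsewhere
#include <windows.h>

// Hypothetical helper: writes `size` bytes to `tmpPath`, flushes them to
// disk, then swaps the temporary file into place as `finalPath`.
bool CommitFile(const wchar_t* tmpPath, const wchar_t* finalPath,
                const void* data, DWORD size)
{
    HANDLE h = ::CreateFileW(tmpPath, GENERIC_WRITE, 0, nullptr,
                             CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE)
        return false;

    DWORD written = 0;
    bool ok = ::WriteFile(h, data, size, &written, nullptr) && written == size;
    ok = ok && ::FlushFileBuffers(h);  // push the data out before renaming
    ::CloseHandle(h);
    if (!ok)
        return false;

    // First commit: there is no old file yet, so ReplaceFile cannot be used.
    if (::GetFileAttributesW(finalPath) == INVALID_FILE_ATTRIBUTES)
        return ::MoveFileExW(tmpPath, finalPath, MOVEFILE_WRITE_THROUGH) != 0;

    // Later commits: replace the old file with the freshly written one.
    return ::ReplaceFileW(finalPath, tmpPath, nullptr, 0, nullptr, nullptr) != 0;
}
#endif
```

After a restart, a leftover temporary file means the last commit did not complete, so it can be deleted and the committed file used as is.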

NTFS does also support transactions, but Microsoft generally recommends against applications trying to use this directly. It apparently hasn't been used much since they added it (in Windows Vista), and MSDN hints that it's likely to be removed in some future version of Windows.

Upvotes: 1
