42cornflakes

Reputation: 303

Can a write to a file fail in the middle of a flush operation?

I am writing content to a CSV file one line at a time and was wondering whether a write can fail in the middle of a flush operation.

My goal is to understand the different ways a write to a file can leave it corrupted.

Assume the following is the structure of the CSV file after a successful write to it:

 1,1001210212
 2,8941321654
 6,845646
 17,564968896

Say an error occurs while flushing the 3rd line to the file. Could it leave the file in any of the following forms?

 1,1001210212\n       //showing \n just for understanding purpose. Not actually visible as "\n" in the file
 2,8941321654\n 
 6,845

Or

 1,1001210212\n
 2,8941321654\n

Or

 1,1001210212\n
 2,8941321654\n
 6,845646\n              //Fails right after writing the line. Not sure if such an error is even possible

Or

 1,1001210212\n
 2,8941321654\n
 6,845646

My code, if needed:

something.cpp

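hash_map<int, unsigned long long> tableData;

// assume the hash map and the keys vector (vector<uint32_t> keys) are populated here

string outFile = "C:\\outFile.csv";   // the backslash must be escaped inside a string literal
FILE *fout = NULL;

if (outFile.length())
{
    fout = fopen(outFile.c_str(), "a");
}
if (fout != NULL)
{
    for (vector<uint32_t>::iterator keyValue = keys.begin();
         keyValue != keys.end(); ++keyValue)
    {
        // %u and %llu match the unsigned argument types (%d and %lld expect signed values)
        fprintf(fout, "%u,", (unsigned)*keyValue);
        fprintf(fout, "%llu\n", tableData[*keyValue]);
        fflush(fout);   // pushes the buffered line to the OS after every row
    }
    fclose(fout);       // close the stream once all rows are written
}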

Upvotes: 0

Views: 682

Answers (1)

Brian Bi

Reputation: 119517

The C standard has the following to say about the fflush function:

If stream points to an output stream or an update stream in which the most recent operation was not input, the fflush function causes any unwritten data for that stream to be delivered to the host environment to be written to the file; otherwise, the behavior is undefined.

Notice the careful wording: the data is delivered to the host environment. What the host environment does with it is beyond the scope of the C standard.

The fflush function may fail. In this case it returns a nonzero value. What actually happens to the file when a flush operation fails is defined by the operating system, not the C programming language.

POSIX builds on the C standard library. On a POSIX system, fflush is likely to invoke write. POSIX says the following about write:

If a write() requests that more bytes be written than there is room for (for example, the process' file size limit or the physical end of a medium), only as many bytes as there is room for shall be written.

In addition, write may be interrupted by a signal. So if a process is terminated by a signal during a flush operation, only part of the data might be written.

Thus, on a Unix-like system, you should indeed be prepared for the possibility that fflush may leave the file in a truncated state.

If it is critical to never leave the file in such a state, then write updates to a temporary file instead, check that all writes succeed at the language level, check that the flush succeeds, and finally rename it atomically to the actual file you want to update.
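The temporary-file approach might look roughly like the sketch below. The function name replaceFileAtomically and the ".tmp" suffix are assumptions made for illustration. It also assumes the whole file is regenerated on each update, unlike the append loop in the question, and that rename replaces an existing target atomically, which POSIX specifies but other platforms may not. fflush and fclose only hand the data to the OS, so this does not by itself protect against power loss.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: writes all lines to "<path>.tmp" and renames it over
// <path> only if every write, the flush, and the close succeeded.
bool replaceFileAtomically(const std::string &path,
                           const std::vector<std::string> &lines)
{
    const std::string tmpPath = path + ".tmp";   // assumed naming scheme
    std::FILE *out = std::fopen(tmpPath.c_str(), "w");
    if (out == nullptr)
        return false;

    bool ok = true;
    for (const std::string &line : lines)
    {
        // Check every write at the language level.
        if (std::fputs(line.c_str(), out) == EOF || std::fputc('\n', out) == EOF)
        {
            ok = false;
            break;
        }
    }
    if (std::fflush(out) != 0)   // check that the flush succeeds
        ok = false;
    if (std::fclose(out) != 0)   // fclose can also report a delayed write error
        ok = false;

    // Only a fully written temporary file is renamed over the real one.
    if (ok && std::rename(tmpPath.c_str(), path.c_str()) == 0)
        return true;

    std::remove(tmpPath.c_str());   // discard the temporary file on any failure
    return false;
}
```

If any step fails, the original file is left untouched and the caller can retry or report the error.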

If you're running on Windows, I have no idea what kind of guarantee is provided.

Upvotes: 4
