Reputation: 303
I am writing to a CSV file one line at a time and was wondering whether a write can fail in the middle of a flush operation.
My goal is to understand the different ways a write to a file can be corrupted.
Assume the following is the content of the CSV file after a successful write:
1,1001210212
2,8941321654
6,845646
17,564968896
Say an error occurs while flushing the 3rd line to the file. Could it result in any of the following forms?
1,1001210212\n //showing \n just for understanding purpose. Not actually visible as "\n" in the file
2,8941321654\n
6,845
or
1,1001210212\n
2,8941321654\n
or
1,1001210212\n
2,8941321654\n
6,845646\n //Fails right after writing the line. Not sure if such an error is even possible
or
1,1001210212\n
2,8941321654\n
6,845646
My code, if needed (something.cpp):
hash_map<int, unsigned long long> tableData;
// assume the hash map (and the keys vector) are populated here
string outFile = "C:\\outFile.csv"; // a backslash must be escaped in a string literal
FILE *fout = NULL;
if (outFile.length())
{
    fout = fopen(outFile.c_str(), "a");
}
if (fout != NULL)
{
    for (vector<uint32_t>::iterator keyValue = keys.begin();
         keyValue != keys.end(); ++keyValue)
    {
        // %u matches uint32_t on common platforms; %llu matches unsigned long long
        fprintf(fout, "%u,", *keyValue);
        fprintf(fout, "%llu\n", tableData[*keyValue]);
        fflush(fout);
    }
    fclose(fout);
}
Upvotes: 0
Views: 682
Reputation: 119517
The C standard has the following to say about the fflush function:
"If stream points to an output stream or an update stream in which the most recent operation was not input, the fflush function causes any unwritten data for that stream to be delivered to the host environment to be written to the file; otherwise, the behavior is undefined."
Notice the careful wording: the data is delivered to the host environment. What the host environment does with it is beyond the scope of the C standard.
The fflush function may fail, in which case it returns a nonzero value. What actually happens to the file when a flush operation fails is defined by the operating system, not the C programming language.
C is tightly integrated with POSIX. On a POSIX system, fflush is likely to invoke write. POSIX says:
"If a write() requests that more bytes be written than there is room for (for example, the process' file size limit or the physical end of a medium), only as many bytes as there is room for shall be written."
In addition, write may be interrupted by a signal. So if a process is terminated by a signal during a flush operation, only part of the data might be written.
Thus, on a Unix-like system, you should indeed be prepared for the possibility that fflush may leave the file in a truncated state.
If it is critical to never leave the file in such a state, then write updates to a temporary file instead, check that all writes succeed at the language level, check that the flush succeeds, and finally rename it atomically to the actual file you want to update.
If you're running on Windows, I have no idea what kind of guarantee is provided.
Upvotes: 4