user0002128

Reputation: 2921

Multi-processing and file operations?

On Windows-based OSes, assuming there are several different processes that may read and/or write a file frequently using fopen/fopen_s/fwrite etc., do I need to consider data races, or does the OS handle this automatically, ensuring the file can only be opened/updated by a single process at any given time while the remaining fopen attempts fail? And what about Linux-based OSes on this matter?

Upvotes: 0

Views: 1219

Answers (3)

Hayri Uğur Koltuk

Reputation: 3020

In Windows it depends on how you open the file.

See the possible values of the uStyle parameter in the case of OpenFile, and of dwShareMode in the case of CreateFile.

Please note that OpenFile is more or less deprecated, so it's better to use CreateFile.
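
For example, a minimal sketch (the path and message are hypothetical) of opening a file for writing while still letting other processes read it; a second writer's CreateFile call will fail with a sharing violation:

    #include <windows.h>
    #include <cstdio>

    int main()
    {
        // FILE_SHARE_READ: other processes may open the file for reading,
        // but any attempt to open it for writing fails while we hold it.
        HANDLE h = CreateFileA(
            "C:\\temp\\shared.log",   // hypothetical path
            GENERIC_WRITE,            // we want to write
            FILE_SHARE_READ,          // readers allowed, writers excluded
            NULL,                     // default security attributes
            OPEN_ALWAYS,              // create the file if it doesn't exist
            FILE_ATTRIBUTE_NORMAL,
            NULL);
        if (h == INVALID_HANDLE_VALUE) {
            std::fprintf(stderr, "CreateFile failed: %lu\n", GetLastError());
            return 1;
        }
        const char msg[] = "hello\r\n";
        DWORD written = 0;
        WriteFile(h, msg, sizeof msg - 1, &written, NULL);
        CloseHandle(h);
        return 0;
    }

Passing 0 as dwShareMode would instead refuse every other open attempt while the handle is held.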

Upvotes: 1

James Kanze

Reputation: 153909

Maybe. If you're talking about different processes (and not threads), the conventional data races that affect threads aren't the issue. However (and there is no difference between Unix and Windows here):

  • Any single write/WriteFile operation will be atomic. (I'm not 100% sure concerning Windows, but I can't imagine it otherwise.) However, if you're using iostream or the older FILE* functions, you don't have direct control over when those operations take place: normally they only occur when the stream's buffer is full. You'll want to ensure that the buffer is big enough, and explicitly flush after each output. (If you're outputting lines of a reasonable length, say 80 characters at most, it's a safe bet that the buffer will hold a complete line. In that case, just use std::endl to terminate the lines in iostreams; for the C style functions, you'll have to call setvbuf( stream, NULL, _IOLBF, 0 ) before the first output. There is a writer sketch of this pattern after the list.)

  • Each open file in a process has its own idea of where to write in the file, and its own idea of where the end of file is. If you want all writes to go to the end of the file, you'll need to open it with std::ios_base::app in C++, or "a" in C; just std::ios_base::out/"w" is not enough. (Also, of course, with just std::ios_base::out or "w", the file is truncated when it is opened. Several different processes truncating the same file could easily result in loss of data.)

  • When reading a file that other processes are writing to: once you reach end of file, the stream or FILE goes into an error state and will not try to read further, even if other processes are appending data. In C, clearerr should (I think) undo this, but it's not clear what happens next; in C++, clearing the stream's error state doesn't guarantee that further reads won't immediately hit end of file again. In both cases, the safest bet is to remember where you were before each read and, if the read fails, close the file, reopen it later, seek to where you were, and continue reading from there. (The tail-reading sketch after the list shows this pattern.)

  • Random access, writing anywhere other than at the end of the file, will also work, as long as all writes are atomic (see above); you should always get a consistent state. If what you write depends on what you have read, however, and other processes are doing something similar, you'll need file locking, which isn't available at the iostream/FILE* level.
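
A minimal sketch, in C++, of the append-and-flush pattern from the first two points (the file name is hypothetical, and it assumes each line fits in the stream's buffer):

    #include <fstream>
    #include <string>

    // Append one line to a shared log file and flush it immediately.
    // std::ios_base::app makes every write go to the current end of file,
    // and std::endl flushes, so a short line leaves the process in a
    // single underlying write (assuming it fits in the stream's buffer).
    void log_line(const std::string& line)
    {
        std::ofstream out("shared.log", std::ios_base::app);  // hypothetical name
        if (out)
            out << line << std::endl;   // endl = '\n' plus flush
    }

    int main()
    {
        log_line("process started");
        log_line("process finished");
    }

The C stdio equivalent is to open with "a" and rely on the setvbuf line-buffering call shown above.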

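And a sketch of the close/reopen/seek recovery from the third point, again with a hypothetical file name; a real reader would add proper error handling and a stop condition:

    #include <chrono>
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <thread>

    int main()
    {
        const char* name = "shared.log";   // hypothetical name
        std::ifstream in(name);
        std::streampos pos = 0;
        std::string line;

        for (;;) {
            std::streampos p = in.tellg();
            if (p != std::streampos(-1))
                pos = p;                   // remember the last good position
            if (std::getline(in, line)) {
                std::cout << "read: " << line << '\n';
                continue;
            }
            // The read failed, most likely at end of file. Rather than
            // trust clearerr/clear, close and reopen, then seek back to
            // where we were: another process may have appended meanwhile.
            in.close();
            in.clear();
            in.open(name);
            in.seekg(pos);
            std::this_thread::sleep_for(std::chrono::milliseconds(200));
        }
    }
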
Upvotes: 0

Mats Petersson

Reputation: 129364

You will have to take care not to open the same file from multiple threads simultaneously. It's entirely possible to open the file multiple times, and the OS may or may not do what you expect, depending on the mode you open it in; e.g. if you create a new file, it will definitely create two different files (one of which will disappear when it gets closed, because it was deleted by the other thread; great, eh?). The rules are pretty complex, and the worst part is that if you don't take extra care, you'll get output from the two threads mixed up in the same file, with lines or even parts of lines interleaved.

Even if the OS stops you from opening the same file twice, you will still have to deal with the consequences of "FILE * came back as NULL". What do you do then? Go back and try again, fail, or something else?

I'm not sure I can make a good suggestion as to HOW to solve this problem, since you haven't described very well what you are doing to these files. There are a few different things that come to mind:

  1. Keep a "register" of file names, with a mutex per file that must be held in order to open that file.
  2. Use a single "file thread" to do all reads and writes: the other threads just queue requests like "I want to write this stuff to file aa.txt", and the worker writes as it goes along.
  3. Use lower-level file system calls, and open the files with "exclusive" access, with some sort of backoff behaviour in case of collision (a sketch follows this list).
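
On the Linux side, a minimal sketch of option 3 using advisory flock locks with exponential backoff (the file name is hypothetical, and flock is my choice of mechanism; fcntl record locks, or CreateFile share modes / LockFileEx on Windows, would play the same role):

    #include <cerrno>
    #include <cstdio>
    #include <fcntl.h>
    #include <sys/file.h>
    #include <unistd.h>

    // Open the file and take an exclusive advisory lock, backing off and
    // retrying on contention. Advisory locks only help if every process
    // cooperates by taking the lock before touching the file.
    int open_exclusive(const char* path)
    {
        for (int attempt = 0; attempt < 10; ++attempt) {
            int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
            if (fd < 0)
                return -1;
            if (flock(fd, LOCK_EX | LOCK_NB) == 0)
                return fd;                    // we now hold the lock
            close(fd);
            usleep(1000u << attempt);         // back off: 1ms, 2ms, 4ms, ...
        }
        errno = EWOULDBLOCK;
        return -1;
    }

    int main()
    {
        int fd = open_exclusive("shared.log");   // hypothetical name
        if (fd < 0) {
            std::perror("open_exclusive");
            return 1;
        }
        const char msg[] = "exclusive write\n";
        write(fd, msg, sizeof msg - 1);
        flock(fd, LOCK_UN);
        close(fd);
    }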

I'm sure there are dozens of other ways to solve the problem - it really depends on what you are trying to do.

Upvotes: 0
