code_fodder

Reputation: 16381

c++ multiple processes writing to the same file - Interprocess mutex?

My question is this: what is the best way (or at least an effective way) to write to a file from multiple processes?

Note: I am using c++11 and I want this to run on any platform (i.e. pure c++ code only).

I have done some research and here is what I have concluded:

  1. In my processes I have multiple threads. This is easily handled within each process using a mutex to serialise access to the file.
  2. A C++11 mutex or condition variable cannot be used to serialise between processes.
  3. I need some sort of external semaphore / lock file to act as a "mutex"... but I am not sure how to go about doing this.

I have seen applications create a ".lock" file while the real file is in use. But under rapid concurrent access this seems like it may not work (i.e. after one process has decided the lock file does not exist, another could create it, and then the first process will also try to create it) because testing for the file and creating it are not a single atomic operation.

Note: Each process always writes one entire line at a time. I had thought that this might be enough to make the operation "atomic" (in that a whole line would get buffered before the next one), but this does not appear to be the case (unless I have my code wrong) since I (rarely) get a mangled line. Here is a code snippet of how I am doing a write (in case it is relevant):

// in c'tor
m_osFile.open("test.txt", std::fstream::out | std::fstream::app);

// in write func (std::string data)
m_osFile << data << std::endl;

This must be a common-ish issue, but I have not yet found a workable solution to it. Any code snippets would be welcome.

Upvotes: 3

Views: 12821

Answers (4)

Sigi

Reputation: 4926

My question is this: what is the best way (or at least an effective way) to write to a file from multiple processes?

The best way is... don't do it!

This really looks like a log (append-only). I would just let every process write its own file and merge them when needed. That is the common approach, at least, and here is the rationale.

Any kind of in-process locking is not going to work: open files are buffered at the OS level, and on some OSes (Windows) the buffering can persist even after the file is closed.

You cannot rely on file locking if you want a portable solution ("I want this to run on any platform"): depending on the filesystem in use (e.g. Samba, NFS) you may run into performance penalties or even undefined behaviour.

Writing concurrently and reliably to a single file is in fact a system-dependent activity, today.

I don't mean that it is not possible - DB engines and other applications do it reliably, but it's a customized operation.

As an alternative, you can let one process act as a collector (as proposed by Gem Taylor) and all the rest as producers, but this is not a reliable option for logging: logs need to reach the disk "simply", and if a bug can prevent the logs from being written, the purpose of the log is lost.

However, you can consider this approach to decouple the processes and exchange the messages between them reliably and efficiently: in that case you can use a messaging solution like RabbitMQ.

In this case all the processes publish their "lines" to the message broker, and one more process consumes those messages and writes them to the file.

Upvotes: 3

Churam

Reputation: 1

You could declare your file descriptor and a mutex (or condition variable?) associated with it in a shared-memory segment visible to all the processes.

Upvotes: 0

Alexander James Pane

Reputation: 648

Since you didn't specify in your question how the processes are spawned, I can imagine two scenarios:

  1. Your first process spawns the second process (e.g. using fork()).
  2. The two processes are generated separately in your environment.

In the first scenario, simple mutual exclusion on the shared resource (a mutex) between your processes should work fine, provided the mutex is created process-shared (e.g. in shared memory) before the fork. This prevents one process from accessing the resource while the other is using it.

The second scenario is a bit more complex: it requires that each process acknowledges the existence of the other. A similar issue has already been discussed here, which includes an interesting link on how to avoid race conditions. I would also consider checking the O_EXCL and O_CREAT flags for this purpose.

Upvotes: 1

Jodocus

Reputation: 7601

Usually the operating system provides special functions for locking files that are guaranteed to be atomic (like lockf on Linux or LockFile(Ex) on Windows). As of now, the C++ standard library provides no such functionality, so a platform-independent approach to such facilities is provided by e.g. Boost.Interprocess.

Upvotes: 2
