ithinkyes

Reputation: 13

Concurrent file usage in C#

I have an application that watches a folder and waits for a file to appear in it. When a file appears, the application reads its content, executes a few functions against external systems using the data from the file, and then deletes the file (and in turn waits for the next file).

Now, I want to run this application on two different machines, both listening to the same folder. So it's the exact same application, but two instances. Let's call them instance A and instance B.

So when a new file appears, both A and B will find it, and both will try to read it. This leads to a race condition between the two instances. If A started reading the file before B, I want B to simply skip the file and let A process and delete it. Likewise, if B finds the file first, A shall do nothing.

Now, how can I implement this? Setting a lock on the file is not sufficient, I guess, because let's say A starts reading the file: it is then locked by A, but A has to unlock it in order to delete it. During that window B might try to read the file. In that case the file is processed twice, which is not acceptable.

So to summarize: I have two instances of one program and one folder / network share. Whenever a file appears in the folder, I want EITHER instance A OR instance B to process the file, NEVER both. Any ideas how I can implement such functionality in C#?

Upvotes: 1

Views: 7700

Answers (4)

Yaugen Vlasau

Reputation: 2218

Instead of going deep into file-access juggling, I would suggest a functionality-server approach. An additional argument for this approach is that the files are used from different computers, which otherwise takes you deep into access and permission administration.

My suggestion is to have a single point of file access (a file repository) that implements the following functionality:

  1. Get files list. (Gets a list of available files.)
  2. Check out file. (Exclusively grabs access to the file, so that the owner of the checkout is authorized to modify it.)
  3. Modify file. (Update the file content or delete it.)
  4. Check in changes to the repository.

There are a lot of ways to implement this approach (use the API of a file-versioning system; implement a service; use a database; ...).

An easy one (requires a database that supports transactions and triggers or stored procedures):

  1. Get files list. (SQL SELECT from an "available files" table.)
  2. Check out file. (SQL UPDATE or an update stored procedure. In the trigger or stored procedure, raise an error in case of a second, concurrent checkout.)
  3. Modify file. (Update the file content or delete it. Please keep in mind that it is still better to do this through a functionality "server"; that way you only need to implement the security policy once.)
  4. Check in changes to the repository. (Release the "checked out" field of the particular file entry, and run the check-in inside a transaction.)
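The checkout step above can be sketched as a single atomic UPDATE. This is only a sketch under assumptions: the table name `Files` and the columns `Name`, `CheckedOutBy`, and `CheckedOutAt` are hypothetical, and SQL Server syntax is used.

```csharp
using System;
using System.Data.SqlClient;

class FileRepository
{
    // Attempts to check out a file atomically. Because the UPDATE only
    // matches rows where CheckedOutBy IS NULL, at most one concurrent
    // caller can change the row; everyone else affects zero rows.
    public static bool TryCheckOut(string connectionString, string fileName, string instanceId)
    {
        const string sql =
            "UPDATE Files SET CheckedOutBy = @instance, CheckedOutAt = GETUTCDATE() " +
            "WHERE Name = @name AND CheckedOutBy IS NULL";
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@instance", instanceId);
            command.Parameters.AddWithValue("@name", fileName);
            connection.Open();
            // 0 rows affected means the other instance got there first
            // (or the file entry no longer exists), so skip the file.
            return command.ExecuteNonQuery() == 1;
        }
    }
}
```

The check-in would be the mirror image: an UPDATE that sets `CheckedOutBy` back to NULL, run in the same transaction as the deletion of the file entry.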

Upvotes: 1

Polyfun

Reputation: 9639

The correct way to do this is to open the file with a write lock (e.g., System.IO.FileAccess.Write) and a read share (e.g., System.IO.FileShare.Read). If one process tries to open the file while the other process already has it open, the open call will throw an exception, which you need to catch and handle as you see fit (e.g., log and retry). Because opening and locking are a single atomic operation, the two processes are synchronised and there is no race condition.

So something like this:

try
{
    using (FileStream fileStream = new FileStream(FileName, FileMode.Open, FileAccess.Write, FileShare.Read))
    {
        // Read from or write to file.
    }
}
catch (IOException ex)
{
    // The file is locked by the other process. 
    // Some options here:
    // Log exception.
    // Ignore exception and carry on.
    // Implement a retry mechanism to try opening the file again.
}

You can use FileShare.None if you do not want other processes to be able to access the file at all when your program has it open. I prefer FileShare.Read because it allows me to monitor what is happening in the file (e.g., open it in Notepad).

Deleting the file works on a similar principle: first rename/move the file and catch the IOException that occurs if the other process has already renamed or moved it, then open the renamed/moved file. The rename/move signals that the file is already being processed and should be ignored by the other process. E.g., rename it with a .pending file extension, or move it to a Pending directory.

try
{
    // This will throw an exception if the other process has already moved the file - 
    // either FileName no longer exists, or it is locked.
    File.Move(FileName, PendingFileName);
    // If we get this far we know we have exclusive access to the pending file.
    using (FileStream fileStream = new FileStream(PendingFileName, FileMode.Open, FileAccess.Write, FileShare.Read))
    {
        // Read from or write to file.
    }
    File.Delete(PendingFileName);
}
catch (IOException ex)
{
    // The file is locked by the other process. 
    // Some options here:
    // Log exception.
    // Ignore exception and carry on.
    // Implement a retry mechanism to try moving the file again.
}

As with opening files, File.Move is atomic and protected by locks, therefore it is guaranteed that if you have multiple concurrent threads/processes attempting to move the file, only one will succeed and the others will throw an exception. See here for a similar question: Atomicity of File.Move.

Upvotes: 1

oleksa

Reputation: 4037

If you are going to apply a lock, you can use the file name itself as the lock object. You can try to rename the file in a special way (for example, by adding a dot in front of the file name); the first service that is lucky enough to rename the file continues with it, and the second (slower) one gets an exception that the file does not exist.

You also have to add a check to your file-processing logic so that a service will not try to "lock" a file that is "locked" already (i.e., has a name starting with a dot).

UPD: It may be better to include a special set of characters (as a marker) and some service identifier (machine name concatenated with PID), because I'm not sure how file renaming behaves in concurrent mode. So if you have file.txt in the shared folder:

  • First of all, check whether the .lock string is already in the file name.
  • If not, the service can try to rename it to file.txt.lockDevhost345 (where .lock is the special marker, Devhost is the name of the current computer, and 345 is the PID, the process identifier).
  • Then the service has to check whether the file file.txt.lockDevhost345 exists.

If yes, it was locked by the current service instance and can be used; if no, it was "stolen" by the concurrent service and should not be processed.
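The rename-as-lock steps above might be sketched like this. It is only a sketch: the `.lock` marker format, the class name, and the return convention are my own choices, and File.Move is relied on to fail with an IOException when it loses the race (FileNotFoundException is a subclass of IOException, so one catch covers both "already renamed" and "no longer exists").

```csharp
using System;
using System.Diagnostics;
using System.IO;

class RenameLock
{
    // Tries to claim a file by renaming it with a machine-name/PID suffix.
    // Returns the renamed path on success, or null if this instance lost
    // the race (or the file was already claimed).
    public static string TryClaim(string path)
    {
        // Step 1: skip files that already carry the lock marker.
        if (path.Contains(".lock"))
            return null;

        // Step 2: build e.g. "file.txt.lockDevhost345" and try the rename.
        string lockedPath = path + ".lock" + Environment.MachineName
                          + Process.GetCurrentProcess().Id;
        try
        {
            File.Move(path, lockedPath);
        }
        catch (IOException)
        {
            return null; // the other instance renamed or deleted it first
        }

        // Step 3: verify the renamed file really exists for this instance.
        return File.Exists(lockedPath) ? lockedPath : null;
    }
}
```

The caller would then process and delete `lockedPath` when `TryClaim` returns non-null, and simply move on when it returns null.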

If you do not have write permission on the shared folder, you can use another network share and try to create an additional lock-marker file there. For example, for file.txt the service can try to create (and hold a write lock on) a new file such as file.txt.lock. The first service that creates the lock file takes care of the original file and removes the lock only once the original file has been processed.

Upvotes: 0

jason.kaisersmith

Reputation: 9610

I can think of two quick solutions to this:

Distribute the load

Arrange your two processes so that each only works on a subset of the files. You could base this on the file name or on the date/time. E.g., process 1 reads files whose timestamp ends in an odd number, and process 2 reads the ones ending in an even number.
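The odd/even timestamp split could be sketched as below. The class and method names are hypothetical, and using tick parity of the last-write time is just one way to partition; a hash of the file name would work the same way.

```csharp
using System;
using System.IO;

class LoadPartition
{
    // Decides whether this instance should handle a file, based on the
    // parity of its last-write timestamp. Instance 1 is started with
    // isOdd = true, instance 2 with isOdd = false, so every file belongs
    // to exactly one instance.
    public static bool IsMine(string path, bool isOdd)
    {
        long ticks = File.GetLastWriteTimeUtc(path).Ticks;
        bool fileIsOdd = (ticks % 2) == 1;
        return fileIsOdd == isOdd;
    }
}
```

Note that this only distributes the load; it gives no protection if one instance dies, since that instance's half of the files then sits unprocessed.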

Database as lock

The other alternative is that you use some kind of database as a lock.
Process 1 sees a file and does an INSERT into a database table keyed on the file name (which must be unique). If the insert succeeds, it is responsible for the file and continues processing it; if the insert fails, the other process has already inserted the row, so that process is responsible and process 1 ignores the file.

The database has to be accessible to both processes, and this will incur some overhead. But might be a better option if you want to scale this out to more processes.
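The unique-insert idea might look like the sketch below. Assumptions are labeled in the comments: the table `ProcessedFiles` with a unique constraint on `FileName` is hypothetical, and SQL Server via System.Data.SqlClient is assumed.

```csharp
using System;
using System.Data.SqlClient;

class InsertLock
{
    // Claims a file by inserting its name into a table that has a unique
    // constraint on FileName (hypothetical schema). Exactly one instance's
    // INSERT can succeed; the other gets a unique-key violation.
    public static bool TryClaim(string connectionString, string fileName)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "INSERT INTO ProcessedFiles (FileName) VALUES (@name)", connection))
        {
            command.Parameters.AddWithValue("@name", fileName);
            connection.Open();
            try
            {
                command.ExecuteNonQuery();
                return true;  // this instance owns the file
            }
            catch (SqlException)
            {
                return false; // most likely a duplicate key: the other
                              // instance already claimed the file
            }
        }
    }
}
```

In production you would probably inspect the SqlException's error number so that only a duplicate-key violation is treated as "someone else owns it", while other database errors are logged and retried.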

Upvotes: 0
