Reputation: 13
I have an application that watches a folder and waits for a file to appear in it. When a file appears, the application reads its content, calls a few functions on external systems with the data from the file, and then deletes the file (and in turn waits for the next file).
Now I want to run this application on two different machines, both watching the same folder. It is the exact same application, just two instances. Let's call them instance A and instance B.
So when a new file appears, both A and B will find it and both will try to read it, which leads to a race condition between the two instances. If A starts reading the file before B, I want B to simply skip the file and let A process and delete it. Likewise, if B finds the file first, A should do nothing.
How can I implement this? I guess setting a lock on the file is not sufficient: say A starts to read the file, so it is locked by A, and then A unlocks it in order to delete it. During that window B might try to read the file. In that case the file is processed twice, which is not acceptable.
To summarize: I have two instances of one program and one folder / network share. Whenever a file appears in the folder, I want EITHER instance A OR instance B to process it, NEVER both. Any ideas on how I can implement this in C#?
Upvotes: 1
Views: 7700
Reputation: 2218
Instead of getting deep into file access details, I would suggest a functionality-server approach. An additional argument for this approach is that the files are used from different computers, which gets deep into access and permission administration.
My suggestion is to have a single point of file access (a files repository) that implements the following functionality:
There are many ways to implement this approach (use the API of a file versioning system, implement a service, use a database, ...).
An easy one (requires a database that supports transactions, triggers or stored procedures)
Upvotes: 1
Reputation: 9639
The correct way to do this is to open the file with a write lock (e.g., System.IO.FileAccess.Write) and a read share (e.g., System.IO.FileShare.Read). If one process tries to open the file while the other already has it open, the open call throws an exception, which you need to catch and handle as you see fit (e.g., log and retry). Because the file is opened for writing with a share mode that does not admit other writers, opening and locking happen atomically, so the two processes are synchronised and there is no race condition.
So something like this:
try
{
    using (FileStream fileStream = new FileStream(FileName, FileMode.Open, FileAccess.Write, FileShare.Read))
    {
        // Write to the file here.
        // To read from it as well, open it with FileAccess.ReadWrite instead.
    }
}
catch (IOException ex)
{
    // The file is locked by the other process.
    // Some options here:
    // Log exception.
    // Ignore exception and carry on.
    // Implement a retry mechanism to try opening the file again.
}
You can use FileShare.None if you do not want other processes to be able to access the file at all when your program has it open. I prefer FileShare.Read because it allows me to monitor what is happening in the file (e.g., open it in Notepad).
Deleting the file follows a similar principle: first rename/move the file, catching the IOException that occurs if the other process has already renamed/moved it, then open the renamed/moved file. Renaming or moving the file signals that it is already being processed and should be ignored by the other process. E.g., rename it with a .pending file extension, or move it to a Pending directory.
try
{
    // This will throw an exception if the other process has already moved the file -
    // either FileName no longer exists, or it is locked.
    File.Move(FileName, PendingFileName);
    // If we get this far we know we have exclusive access to the pending file.
    using (FileStream fileStream = new FileStream(PendingFileName, FileMode.Open, FileAccess.Write, FileShare.Read))
    {
        // Write to the file here.
        // To read from it as well, open it with FileAccess.ReadWrite instead.
    }
    File.Delete(PendingFileName);
}
catch (IOException ex)
{
    // The file was already moved or is locked by the other process.
    // Some options here:
    // Log exception.
    // Ignore exception and carry on.
    // Implement a retry mechanism to try moving the file again.
}
As with opening files, File.Move is atomic and protected by locks (as long as the source and destination are on the same volume, so that the move is a plain rename). If multiple concurrent threads/processes attempt to move the file, only one will succeed and the others will get an exception. See here for a similar question: Atomicity of File.Move.
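To make the rename-to-claim idea easier to reuse, here is a minimal sketch that wraps it in a helper and a folder scan. The names FileClaimer, TryClaim, ProcessFolder and the ".pending" suffix are my own placeholders, not anything from the framework, and the sketch assumes the pending file stays in the same folder (so the move is a rename on one volume).

```csharp
using System;
using System.IO;

public static class FileClaimer
{
    // Hypothetical suffix marking a file as claimed by one instance.
    public const string PendingSuffix = ".pending";

    // Tries to claim 'path' by renaming it. Returns the new path on success,
    // or null if another instance got there first.
    public static string TryClaim(string path)
    {
        string pendingPath = path + PendingSuffix;
        try
        {
            File.Move(path, pendingPath);
            return pendingPath;
        }
        catch (IOException)
        {
            // Source already moved (FileNotFoundException derives from
            // IOException), source locked, or destination already exists.
            return null;
        }
    }

    // Claims and processes every unclaimed file in 'folder'.
    // Returns the number of files this instance processed.
    public static int ProcessFolder(string folder, Action<string> process)
    {
        int processed = 0;
        foreach (string path in Directory.GetFiles(folder))
        {
            if (path.EndsWith(PendingSuffix, StringComparison.OrdinalIgnoreCase))
            {
                continue; // Already claimed by some instance.
            }
            string claimed = TryClaim(path);
            if (claimed == null)
            {
                continue; // The other instance won the race.
            }
            process(claimed);
            File.Delete(claimed);
            processed++;
        }
        return processed;
    }
}
```

Each instance would call something like FileClaimer.ProcessFolder(sharedFolder, HandleFile) on a timer or from a file watcher. One thing this sketch does not handle: if an instance crashes after the rename, the .pending file is left behind and needs separate cleanup.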
Upvotes: 1
Reputation: 4037
If you are going to apply a lock, you can try using the file name as the lock object. Rename the file in a special way (for example, by adding a dot in front of the file name): the first service that manages to rename the file continues with it, and the second (slower) one gets an exception saying that the file does not exist.
You also have to add a check to your file processing logic so that a service does not try to "lock" a file that is already "locked" (has a name starting with a dot).
UPD: maybe it is better to include a special set of characters (a marker) and some service identifier (machine name concatenated with PID), because I'm not sure how file rename will work under concurrency.
So if you have file.txt in the shared folder, rename it to file.txt.lockDevhost345, where .lock is a special marker, Devhost is the name of the current computer and 345 is a PID (process identifier). Then check whether the file file.txt.lockDevhost345 is available: if yes, it was locked by the current service instance and can be used; if no, it was "stolen" by a concurrent service, so it should not be processed.
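A rough sketch of that naming scheme in C#, in case it helps. OwnerMarkedRename and its method names are made up for illustration; only the ".lock" marker, machine name and PID come from the description above. The marker check is deliberately naive and would also match an ordinary file whose name happens to contain ".lock".

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class OwnerMarkedRename
{
    // ".lock" is the marker; machine name and PID identify this instance.
    public static string BuildLockedName(string path)
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return path + ".lock" + Environment.MachineName + current.Id;
        }
    }

    // Returns the locked path if this instance now owns the file, else null.
    public static string TryLock(string path)
    {
        string lockedPath = BuildLockedName(path);
        try
        {
            File.Move(path, lockedPath);
        }
        catch (IOException)
        {
            // Already renamed by the other instance, or still in use.
            return null;
        }
        // Confirm the file exists under this instance's own name.
        return File.Exists(lockedPath) ? lockedPath : null;
    }

    // True if the file name carries the marker, so a scan should skip it.
    public static bool IsLocked(string path)
    {
        return Path.GetFileName(path).Contains(".lock");
    }
}
```

A service would skip any file for which IsLocked returns true, call TryLock on the rest, and process only the paths that come back non-null.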
If you do not have write permission on the shared folder, you can use another network share and create an additional lock marker file there. For example, for file.txt the service can try to create (and hold a write lock on) a new file such as file.txt.lock.
The first service to create the lock file takes care of the original file and removes the lock only when the original file has been processed.
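One possible way to sketch the lock-file variant is with FileMode.CreateNew, which fails if the file already exists, so only one instance can create the marker. LockFile and TryAcquire are hypothetical names, and lockFolder stands for whatever writable share you choose. FileOptions.DeleteOnClose is my own addition so the marker disappears when the holder disposes the stream.

```csharp
using System;
using System.IO;

public static class LockFile
{
    // Tries to create "<file name>.lock" in lockFolder. Returns an open
    // stream that holds the lock, or null if the lock file already exists.
    public static FileStream TryAcquire(string lockFolder, string originalPath)
    {
        string lockPath = Path.Combine(
            lockFolder, Path.GetFileName(originalPath) + ".lock");
        try
        {
            // CreateNew throws IOException if the file already exists.
            // DeleteOnClose removes the lock file when the stream is disposed.
            return new FileStream(lockPath, FileMode.CreateNew, FileAccess.Write,
                FileShare.None, 4096, FileOptions.DeleteOnClose);
        }
        catch (IOException)
        {
            return null; // Another instance holds the lock.
        }
    }
}
```

The caller would wrap the returned stream in a using block, process and delete the original file inside it, and skip the file when null comes back. After acquiring the lock it is worth checking that the original file still exists, since the other instance may have finished with it and released its lock a moment earlier.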
Upvotes: 0
Reputation: 9610
I can think of two quick solutions to this:
Distribute the load
Set up your two processes so that each only works on some of the files. How you split them could be based on the filename, or the date/time. E.g. process 1 reads files which have a time stamp ending in an odd number, and process 2 reads the ones ending in an even number.
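As an illustration of splitting by filename (rather than the time stamp parity mentioned above), here is a small sketch that assigns each file to an instance by hashing its name. FilePartitioner, StableHash and IsMine are invented names. It uses a hand-rolled FNV-1a hash because string.GetHashCode() is not guaranteed to return the same value in different processes, which would break the split.

```csharp
using System;
using System.Text;

public static class FilePartitioner
{
    // FNV-1a 32-bit hash: yields the same value on every machine and run.
    public static uint StableHash(string text)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in Encoding.UTF8.GetBytes(text))
            {
                hash ^= b;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // True if the instance numbered instanceIndex (0-based) out of
    // instanceCount should handle this file name.
    public static bool IsMine(string fileName, int instanceIndex, int instanceCount)
    {
        // Lower-case first so both instances agree regardless of casing.
        string key = fileName.ToLowerInvariant();
        return StableHash(key) % (uint)instanceCount == (uint)instanceIndex;
    }
}
```

Instance A would run with instanceIndex 0 and instance B with instanceIndex 1, both with instanceCount 2. Note that with this scheme, files assigned to an instance that is down simply wait until it comes back.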
Database as lock
The other alternative is to use some kind of database as a lock.
Process 1 finds a file and does an insert into a database table keyed on the filename (which must be unique). If the insert succeeds, process 1 is responsible for the file and continues processing it. If the insert fails, the other process has already inserted it and is responsible, so process 1 ignores the file.
The database has to be accessible to both processes, and this will incur some overhead, but it might be a better option if you want to scale this out to more processes.
Upvotes: 0