aherrick

Reputation: 20179

C# File Move (Rename) Multi Thread Windows OS

A scenario where I have 10,000 XML files that I want to read in and save to a database. I have 5 Windows Services that are all hitting the same folder, each trying to process the files.

My technique is to first try and rename (File.Move) the file with an extension that is specific to the given Service Instance.

This works the vast majority of the time. However, a small fraction of the time (about 10 files out of the 10,000) the file system appears to allow two services to attempt the rename at EXACTLY the same time.

How can I prevent this? Does my approach make sense? See the following code snippet to get an idea. I end up with about 10 files that throw IOExceptions.

    string[] sourceFiles = Directory.GetFiles(InputPath, string.Format(LocaleHelper.Culture, "*.{0}", Extention))
                                    .OrderBy(d => new FileInfo(d).CreationTime).ToArray();

    foreach (string file in sourceFiles)
    {
        var newFileName = string.Format(LocaleHelper.Culture, "{0}.{1}", file, CacheFlushManager.GetInstanceName);

        try
        {
            // First rename the file to claim it. At this point the file may no longer exist;
            // if so, File.Move throws and we move on to the next file.
            File.Move(file, newFileName);

            var xml = File.ReadAllText(newFileName);

            // Write to the DB; at this point we know this instance owns the file.
        }
        catch (FileNotFoundException ex)
        {
            Logger.LogDebug(string.Format(LocaleHelper.Culture, "{0} Couldn't read file : {1}", CacheFlushManager.GetInstanceName, newFileName));
        }
        catch (IOException ex)
        {
            Logger.LogDebug(string.Format(LocaleHelper.Culture, "{0} Couldn't process file : {1}", CacheFlushManager.GetInstanceName, newFileName));
        }
        catch (Exception ex)
        {
            Logger.LogError("Execute: Error", ex);

            try
            {
                File.Move(newFileName, string.Format(LocaleHelper.Culture, "{0}.badfile", newFileName));
            }
            catch (Exception ex_deep)
            {
                Logger.LogError(string.Format("{0} Execute: Error Deep could not move bad file {1}", CacheFlushManager.GetInstanceName, newFileName), ex_deep);
            }
        }
    }

EDIT 1

Below is the exact error as an example of what I'm seeing. I'm very confused as to how the file could be in use at that exact time, given the code I'm using. Am I completely out in the weeds with this?

[7220] TransactionFileServiceProcess [11:28:32]: Service4 Couldn't process file : C:\temp\Input\yap804.xml.Service4 System.IO.IOException: The process cannot access the file 'C:\temp\Input\yap804.xml.Service4' because it is being used by another process.

EDIT 2

Here is a look at what is going on from a "debug" perspective. How could both Service 2 and Service 3 get to "End Rename"? I think this is the crux of the issue... thoughts?

The problem is that the file yap620.xml.Service3 will ultimately just sit out there because of the file operation error.

[6708] TransactionFileServiceProcess [10:54:38]: Service3 Start Rename: C:\temp\Input\yap620.xml.Service3 TransactionFileServiceProcess.Execute => BHSLogger.LogDebug =>     LoggerImpl.Write E[]

[4956] TransactionFileServiceProcess [10:54:38]: Service2 Start Rename: C:\temp\Input\yap620.xml.Service2 TransactionFileServiceProcess.Execute => BHSLogger.LogDebug => LoggerImpl.Write E[]

[7416] TransactionFileServiceProcess [10:54:38]: Service4 Start Rename: C:\temp\Input\yap620.xml.Service4 TransactionFileServiceProcess.Execute => BHSLogger.LogDebug => LoggerImpl.Write E[]

[6708] TransactionFileServiceProcess [10:54:38]: Service3 End Rename: C:\temp\Input\yap620.xml.Service3 TransactionFileServiceProcess.Execute => BHSLogger.LogDebug => LoggerImpl.Write E[]

[6708] TransactionFileServiceProcess [10:54:38]: Service3 Start Read: C:\temp\Input\yap620.xml.Service3 TransactionFileServiceProcess.Execute => BHSLogger.LogDebug => LoggerImpl.Write E[]

[4956] TransactionFileServiceProcess [10:54:38]: Service2 End Rename: C:\temp\Input\yap620.xml.Service2 TransactionFileServiceProcess.Execute => BHSLogger.LogDebug => LoggerImpl.Write E[]

[4956] TransactionFileServiceProcess [10:54:38]: Service2 Start Read: C:\temp\Input\yap620.xml.Service2 TransactionFileServiceProcess.Execute => BHSLogger.LogDebug => LoggerImpl.Write E[]

[6708] TransactionFileServiceProcess [10:54:38]: Service3 Couldn't process file : C:\temp\Input\yap620.xml.Service3 TransactionFileServiceProcess.Execute => BHSLogger.LogDebug => LoggerImpl.Write E[]

Upvotes: 3

Views: 2321

Answers (2)

Jim Mischel

Reputation: 134125

I don't see where the issue is. You have multiple threads that get a list of files, and then try to process those files. Sometimes the file that the thread is trying to rename doesn't exist, and sometimes the file exists but it is in the process of being renamed by another thread. Neither one of those two should be a problem. In either case, the thread that gets the error should just assume that some other thread is processing the file, and move on.

Assuming, of course, that you don't have some other process that's accessing files in that directory.

Why you'd want five separate service instances doing this is beyond me. You could simplify things quite a bit and cut down on unnecessary I/O by having just one process do a Parallel.ForEach. For example:

string[] sourceFiles = Directory.GetFiles(
    InputPath,
    string.Format(LocaleHelper.Culture, "*.{0}", Extention))
    .OrderBy(d => new FileInfo(d).CreationTime).ToArray();

Parallel.ForEach(sourceFiles, file =>
{
    // do file processing here
});

The TPL will allocate multiple threads to do the processing, and assign work items to the threads. So there's no chance that a file will be open by multiple threads.

Upvotes: 1

FlyingStreudel

Reputation: 4464

Do you have multiple threads running in the same service? Or multiple independent services?

If you have multiple threads in the same service, just create a queue of the files and have threads remove items from it when they are free to process. Note that the standard Queue&lt;T&gt; is not thread safe; use ConcurrentQueue&lt;T&gt; (or lock around the queue) so you never process the same file twice.
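As a rough sketch of that single-service approach (ProcessFile here is a hypothetical stand-in for your read-XML-and-save-to-DB work, and the path and worker count are assumptions):

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

// Fill a thread-safe queue with the files once, then let N workers drain it.
var queue = new ConcurrentQueue<string>(Directory.GetFiles(@"C:\temp\Input", "*.xml"));

Parallel.For(0, 5, _ =>
{
    // TryDequeue is atomic, so each file is handed to exactly one worker;
    // no renaming dance is needed to claim a file.
    while (queue.TryDequeue(out var file))
    {
        ProcessFile(file);
    }
});
```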

If you have multiple independent services you could look at using LockFile or File.Open with FileShare.None specified.
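A minimal sketch of the FileShare.None idea for independent services, assuming each service runs a loop like the one in the question: open the file exclusively before doing anything else, and treat an IOException as "another service owns this file":

```csharp
using System;
using System.IO;

// Claim each file by opening it exclusively. If another service instance
// already has it open, the FileStream constructor throws IOException
// and we simply skip to the next file.
foreach (var file in Directory.GetFiles(@"C:\temp\Input", "*.xml"))
{
    try
    {
        using var stream = new FileStream(file, FileMode.Open, FileAccess.ReadWrite, FileShare.None);
        using var reader = new StreamReader(stream);
        var xml = reader.ReadToEnd();
        // ... write to the DB while still holding the exclusive handle ...
    }
    catch (IOException)
    {
        // Another service is processing this file; move on.
    }
}
```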

edit:

I misunderstood what you were trying to do. I thought you wanted all of the files to be processed by each of the services. You really need to run these as multiple threads in the same service, or provide some method of communication that lets the different services determine which files have already been processed.

Upvotes: 0
