Guarav T.

Reputation: 458

Asynchronous read/write multiple files

We are building a custom CMS in which I have to overwrite the content of multiple HTML files (around 1000) on the server through my admin panel in one go. I am new to asynchronous and parallel programming, so after some research I decided to use the Task Parallel Library (TPL) for my problem. Below is the sample code I am using to overwrite the files with some text.

Now the problem is that I also have to read multiple files in parallel. In my example I am using a simple variable, string text = "In file " + index.ToString(), but in reality the content of each overwritten file will be built from a static template (one per page) plus some dynamic values (for CMS elements) from the database. I don't understand how to do this read/write for multiple files in parallel:

    // async Task instead of async void so the caller can await it and observe exceptions
    static async Task ProcessWriteMultAsync()
    {
        string folder = @"E:\test\";
        string[] items = { "Site1", "Site2", "Site3", "Site4", "Site5", "Site6", "Site7", "Site8", "Site9", "Site10",
            "Site11", "Site12", "Site13", "Site14", "Site15", "Site16", "Site17", "Site18", "Site19", "Site20" };
        List<Task> tasks = new List<Task>();
        List<FileStream> sourceStreams = new List<FileStream>();

        try
        {
            for (int index = 0; index < items.Length; index++)
            {
                string text = "In file " + index.ToString();

                string filePath = Path.Combine(folder, items[index], "ProcurementTemplate.html");

                byte[] encodedText = Encoding.Unicode.GetBytes(text);

                // useAsync: true requests true asynchronous I/O on the file handle
                FileStream sourceStream = new FileStream(filePath,
                    FileMode.Create, FileAccess.Write, FileShare.None,
                    bufferSize: 4096, useAsync: true);

                Task theTask = sourceStream.WriteAsync(encodedText, 0, encodedText.Length);
                sourceStreams.Add(sourceStream);

                tasks.Add(theTask);
            }

            await Task.WhenAll(tasks);
        }
        finally
        {
            foreach (FileStream sourceStream in sourceStreams)
            {
                sourceStream.Close();
            }
        }
    }

Upvotes: 1

Views: 5959

Answers (2)

Waldemar

Reputation: 5513

You can create an array of tasks, one per file, and then await them all with Task.WhenAll:

public async Task DoWorkAsync(string text, string file)
{
    // useAsync: true so WriteAsync performs genuinely asynchronous I/O
    using (FileStream sourceStream = new FileStream(file, FileMode.Create,
        FileAccess.Write, FileShare.None, bufferSize: 4096, useAsync: true))
    {
        byte[] encodedText = Encoding.Unicode.GetBytes(text);
        await sourceStream.WriteAsync(encodedText, 0, encodedText.Length);
    }
}

IEnumerable<string> fileNames = new string[] { "file1.txt", "file2.txt" };
Task[] writingTasks = fileNames
                          .Select(fileName => DoWorkAsync("some text", fileName))
                          .ToArray();
await Task.WhenAll(writingTasks);
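To cover the reading side of the question, the text passed to each write can come from a static template with dynamic values substituted in. A minimal sketch (the RenderFileAsync name and the {{title}} placeholder are assumptions for illustration, not part of your CMS):

```csharp
using System.IO;
using System.Threading.Tasks;

public static class TemplateRenderer
{
    public static async Task RenderFileAsync(string templatePath, string targetPath, string title)
    {
        // Read the static template for this page.
        string template;
        using (var reader = new StreamReader(templatePath))
        {
            template = await reader.ReadToEndAsync();
        }

        // Substitute the dynamic value(s) fetched from the database.
        string rendered = template.Replace("{{title}}", title);

        // Overwrite the target file with the rendered content.
        using (var writer = new StreamWriter(targetPath))
        {
            await writer.WriteAsync(rendered);
        }
    }
}
```

You would then project these calls into tasks and await them with Task.WhenAll exactly as above.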

Upvotes: 0

usr

Reputation: 171246

First, pull the logic to write a file into a method:

async Task Write(string text, string filePath)
{
    byte[] encodedText = Encoding.Unicode.GetBytes(text);

    // FileStream has no single-argument constructor; specify mode and
    // access, and pass useAsync: true for asynchronous I/O
    using (FileStream sourceStream = new FileStream(filePath,
        FileMode.Create, FileAccess.Write, FileShare.None,
        bufferSize: 4096, useAsync: true))
    {
        await sourceStream.WriteAsync(encodedText, 0, encodedText.Length);
    }
}

Then use Stephen Toub's ForEachAsync to process all items. You need to experimentally determine the right degree of parallelism (DOP). It certainly will not be 1000 as it is now in your code. The right DOP depends on the IO system and on how much data the OS buffers.

items.ForEachAsync(async (item) => await Write(item, GetPath(...)), dop: 8);
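ForEachAsync is not part of the framework; one possible implementation, adapted from Stephen Toub's blog post (with the parameter order chosen to match the call above), partitions the source and consumes each partition sequentially, so at most dop bodies run at once:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class EnumerableExtensions
{
    public static Task ForEachAsync<T>(this IEnumerable<T> source, Func<T, Task> body, int dop)
    {
        // Split the source into "dop" partitions; each partition is
        // drained sequentially, bounding concurrency at "dop".
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async () =>
            {
                using (partition)
                {
                    while (partition.MoveNext())
                        await body(partition.Current);
                }
            }));
    }
}
```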

The old code basically works but it is quite low level and verbose.

Upvotes: 2
