Reputation: 137
When executing the following code.
using System;
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;
namespace AsyncDataRow
{
internal class Program
{
private static void Main(string[] args)
{
var program = new Program();
try
{
Console.WriteLine("Execute");
for (int i = 0; i < 100; i++)
{
program.Execute();
}
}
catch (AggregateException aggregateException)
{
foreach (var exception in aggregateException.InnerExceptions)
{
Console.WriteLine("AggregateException.InnerExceptions: " + exception.Message);
}
}
catch (Exception ex)
{
Console.WriteLine("Exception.Message: " + ex.Message);
}
Console.ReadKey();
}
private void Execute()
{
DataTable dataTable = new DataTable();
dataTable.Columns.Add("ID", typeof(int));
dataTable.Columns.Add("Name", typeof(string));
for (int i = 0; i < 1000; i++)
{
dataTable.Rows.Add(i, i.ToString());
}
var taskList = new List<Task>();
foreach (DataRow dataRow in dataTable.Rows)
{
taskList.Add(ChangeStringValue(dataRow));
}
Task.WaitAll(taskList.ToArray());
}
private async Task ChangeStringValue(DataRow dataRow)
{
var id = (int)dataRow["ID"];
var newValue = await Task.Run(() => CreateNewValue(id));
dataRow["Name"] = newValue;
}
private string CreateNewValue(int id)
{
Console.WriteLine(string.Format("{0} - CreateNewValue", id));
return id.ToString() + "New StringValue";
}
}
}
Intermittently a System.ArgumentOutOfRangeException is thrown on this line of code
dataRow["Name"] = newValue
Exception message:
Exception thrown: 'System.ArgumentOutOfRangeException' in mscorlib.dll
Additional information: Index was out of range. Must be non-negative and less than the size of the collection.
I'm trying to be more specific identifying the issue, but the closest I got was to simplify the code to reproduce the error.
I think it has something to do with the DataRow reference type being passed into the async method, but I would like to know why this code is not thread-safe.
Upvotes: 2
Views: 1119
Reputation: 100547
The number of thread-safe classes in .Net framework is very small and each would explicitly call out its safety (most are found in System.Collecions.Concurrent
namespace). For all other types you need to provide your own thread safety mechanisms. DB related classes definitely fall into category of non-thread safe for write. Frequently classes explicitly allow multiple thread-safe reads.
In your case DataRow
can be read from multiple threads and you only need to synchronize writes:
Thread Safety
This type is safe for multithreaded read operations. You must synchronize any write operations.
Standard approach for building a lot items - split items in ranges and generate each independently and then merge after all are done.
Partial sample based on original code, MyRowResult
is class to hold all data for row that needs to be updated.
var taskList = new List<Task<MyDataRow>();
foreach (DataRow dataRow in dataTable.Rows)
{
taskList.Add(ChangeStringValue(dataRow));
}
// await for proper async code, WaitAll is fin for console app.
await Task.WhenAll(taskList.ToArray());
// back to single thread - safe to update rows
foreach (var task in taskList)
{
task.Result.DataRow["Name"] = task.Result.Name;
}
}
private async Task<MyRowResult> ChangeStringValue(DataRow dataRow)
{
// on multiple threads - perform read only operations of rows
// which is safe as explicitly called out in the documentation
// "This type is safe for multithreaded read operations."
var id = (int)dataRow["ID"];
var newValue = await Task.Run(() => CreateNewValue(id));
return new MyRowResult { Name = newValue, DataRow = dataRow };
}
Notes:
Parallel
class may simplify running code in parallel (like Parallel.ForEach
).Upvotes: 2