Reputation: 15608
Edited the code to make it thread-safe post comments
Please see the updated question at the end.
Can you please help me understand if this code is thread-safe or how it can be made thread safe?
Setup
My system has a very simple class called WorkItem.
public class WorkItem
{
public int Id {get;set;}
public string Name {get;set;}
public DateTime DateCreated {get;set;}
public IList<object> CalculatedValues {get;set;}
}
There is an interface ICalculator which has a method that takes a work item, performs a calculation and returns true.
public interface ICalculator
{
bool Calculate(WorkItem WorkItem);
}
Let's say we have two implementations of ICalculator.
public class BasicCalculator: ICalculator
{
public bool Calculate(WorkItem WorkItem)
{
//calculate some value on the WorkItem and populate CalculatedValues property
return true;
}
}
Another calculator:
public class AnotherCalculator: ICalculator
{
public bool Calculate(WorkItem WorkItem)
{
//calculate some value on the WorkItem and populate CalculatedValues property
//some complex calculation on work item
if (somevalue==0) return false;
return true;
}
}
There is a calculator handler class. Its responsibility is to execute calculators sequentially.
public class CalculatorHandler
{
    public bool ExecuteAllCalculators(WorkItem workItem, ICalculator[] calculators)
    {
        bool final = true;
        //call all calculators in a loop
        foreach(var calculator in calculators)
        {
            var calculatedValue = calculator.Calculate(workItem);
            final = final && calculatedValue;
        }
        return final;
    }
}
Finally, in my client class, I inject the ICalculator[] instances which are relevant for the run. I then call the ExecuteCalculators() method.
Now I have a large number of work items and I want to perform calculations on them, so I create a list of Task objects. Each task instantiates its own CalculatorHandler, takes a work item and performs the calculations, and I then call WaitAll() on all of the tasks, e.g.
public class Client
{
    private ICalculator[] _myCalculators;
    public Client(ICalculator[] calculators)
    {
        _myCalculators = calculators;
    }
    public void ExecuteCalculators()
    {
        var list = new List<Task>();
        for(int i = 0; i < 10; i++)
        {
            int id = i; //copy the loop variable so each task captures its own value
            //Task.Run starts the task; a task created with new Task(...) never runs until Start() is called
            Task task = Task.Run(() =>
            {
                var handler = new CalculatorHandler();
                var workItem = new WorkItem(){
                    Id=id,
                    Name="TestTask",
                    DateCreated=DateTime.Now
                };
                var result = handler.ExecuteAllCalculators(workItem, _myCalculators);
            });
            list.Add(task);
        }
        Task.WaitAll(list.ToArray());
    }
}
This is a simplified version of the system. The actual system has a range of calculators, and the calculators and CalculatorHandler are injected via IoC etc.
My questions are below; please help me understand these points:
1) Each task creates a new instance of CalculatorHandler. Does this mean that anything that happens in CalculatorHandler is thread-safe, as it does not have any public properties and simply loops over the calculators?
2) The calculators are shared amongst all tasks because they are a member variable of the Client class, but they are passed into a CalculatorHandler which is instantiated for each task. Does it mean that, because a new instance of CalculatorHandler is created for each task, the calculators are automatically thread-safe and we will not experience any threading issues, e.g. deadlocks?
3) Can you please suggest how I can make the code thread-safe? Would it be best to pass a Func<ICalculator>[] into the Client class so that each task can invoke every Func<ICalculator> to obtain its own ICalculator instances and pass those to the handler?
4) Is it true that, because the calculators are passed in as a method parameter, other instances of CalculatorHandler cannot run the same instance of a calculator? Or, because calculators are reference types, are we bound to get multi-threading issues?
Can you please help me understand if this updated code is thread-safe or how it can be made thread safe?
Setup
My system has a very simple class called WorkItem. All of its public properties are getter-only except one, "CalculatedValues".
public class WorkItem
{
    public int Id {get;}
    public string Name {get;}
    public DateTime DateCreated {get;}
    public IList<object> CalculatedValues {get;set;}
    public WorkItem(int id, string name, DateTime dateCreated)
    {
        Id = id;
        Name = name;
        DateCreated = dateCreated;
    }
}
There is an interface ICalculator which has a method that takes a work item, performs a calculation and returns an IList<object>. It does not change the state of the work item.
public interface ICalculator
{
IList<object> Calculate(WorkItem WorkItem);
}
Let's say we have two implementations of ICalculator.
public class BasicCalculator: ICalculator
{
    public IList<object> Calculate(WorkItem WorkItem)
    {
        //calculate some value and return List<object>
        return new List<object>{"A", 1};
    }
}
Another calculator:
public class AnotherCalculator: ICalculator
{
    public IList<object> Calculate(WorkItem WorkItem)
    {
        //calculate some value and return List<object>
        return new List<object>{"A", 1, WorkItem.Name};
    }
}
There is a calculator handler class. Its responsibility is to execute calculators sequentially. Note that it now takes the ICalculator[] in its constructor when it is instantiated. It also has a private static lock object, which it holds while it updates the work item instance.
public class CalculatorHandler
{
    private ICalculator[] _calculators;
    public CalculatorHandler(ICalculator[] calculators)
    {
        _calculators = calculators;
    }
    //static lock
    private static object _lock = new object();
    public bool ExecuteAllCalculators(WorkItem workItem)
    {
        bool final = true;
        //call all calculators passed to the constructor in a loop
        foreach(var calculator in _calculators)
        {
            var calculatedValues = calculator.Calculate(workItem);
            //within a lock, work item is updated
            lock(_lock)
            {
                workItem.CalculatedValues = calculatedValues;
            }
        }
        return final;
    }
}
Finally, in my client class, I execute CalculatorHandler.
Now I have a large number of work items and I want to perform calculations on them, so I create a list of Task objects. Each task instantiates its own CalculatorHandler, takes a work item and performs the calculations, and I then call WaitAll() on all of the tasks, e.g.
public class Client
{
    public void ExecuteCalculators()
    {
        var list = new List<Task>();
        for(int i = 0; i < 10; i++)
        {
            int id = i; //copy the loop variable so each task captures its own value
            //Task.Run starts the task; a task created with new Task(...) never runs until Start() is called
            Task task = Task.Run(() =>
            {
                //new handler instance and new calculator instances
                var handler = new CalculatorHandler(new ICalculator[]{
                    new BasicCalculator(), new AnotherCalculator()
                });
                var workItem = new WorkItem(
                    id,
                    "TestTask",
                    DateTime.Now
                );
                var result = handler.ExecuteAllCalculators(workItem);
            });
            list.Add(task);
        }
        Task.WaitAll(list.ToArray());
    }
}
This is a simplified version of the system. The actual system has a range of calculators, and the calculators and CalculatorHandler are injected via IoC etc.
My questions are below; please help me understand these points:
1) Each task creates a new instance of CalculatorHandler and new instances of the calculators. The calculators do not perform any I/O operations and only create a new private IList. Are the calculator handler and calculator instances now thread-safe?
2) CalculatorHandler updates the work item, but within a lock. The lock is a private static object. Does it mean that all instances of CalculatorHandler share one single lock and therefore, at any point in time, only one thread can update a work item?
3) The work item has getter-only public properties except for its CalculatedValues property. CalculatedValues is only set within a static lock. Is this code now thread-safe?
Upvotes: 0
Views: 133
Reputation: 117154
1) No, it is not thread-safe. If there is any shared state in any calculation then it is possible to have threading issues. The only way to avoid threading issues is to ensure you are not updating any shared state. That means read-only objects and/or using "pure" functions.
2) You've used the word "shared" - that means not thread-safe by virtue of sharing state. Unless you mean "distributed" rather than "shared".
3) Exclusively use read-only objects.
4) They are reference types so they may be shared amongst separate threads - hence not thread-safe - unless they are read-only.
Here's an example of a read-only object:
public sealed class WorkItem : IEquatable<WorkItem>
{
private readonly int _id;
private readonly string _name;
private readonly DateTime _dateCreated;
public int Id { get { return _id; } }
public string Name { get { return _name; } }
public DateTime DateCreated { get { return _dateCreated; } }
public WorkItem(int id, string name, DateTime dateCreated)
{
_id = id;
_name = name;
_dateCreated = dateCreated;
}
public override bool Equals(object obj)
{
if (obj is WorkItem)
return Equals((WorkItem)obj);
return false;
}
public bool Equals(WorkItem obj)
{
if (obj == null) return false;
if (!EqualityComparer<int>.Default.Equals(_id, obj._id)) return false;
if (!EqualityComparer<string>.Default.Equals(_name, obj._name)) return false;
if (!EqualityComparer<DateTime>.Default.Equals(_dateCreated, obj._dateCreated)) return false;
return true;
}
public override int GetHashCode()
{
int hash = 0;
hash ^= EqualityComparer<int>.Default.GetHashCode(_id);
hash ^= EqualityComparer<string>.Default.GetHashCode(_name);
hash ^= EqualityComparer<DateTime>.Default.GetHashCode(_dateCreated);
return hash;
}
public override string ToString()
{
return String.Format("{{ Id = {0}, Name = {1}, DateCreated = {2} }}", _id, _name, _dateCreated);
}
public static bool operator ==(WorkItem left, WorkItem right)
{
if (object.ReferenceEquals(left, null))
{
return object.ReferenceEquals(right, null);
}
return left.Equals(right);
}
public static bool operator !=(WorkItem left, WorkItem right)
{
return !(left == right);
}
}
Once created it can't be modified so thread-safety is no longer an issue.
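As a side note, assuming C# 9 or later is available (the question does not say which version is in use), a positional record gives much the same read-only shape with far less code. This is only a sketch of an alternative, not a replacement for the class above; the Demo class is purely illustrative:

```csharp
using System;

// Positional record: the compiler generates init-only properties,
// value-based Equals/GetHashCode, the == and != operators, and ToString.
public sealed record WorkItem(int Id, string Name, DateTime DateCreated);

public static class Demo
{
    public static void Main()
    {
        var created = new DateTime(2020, 1, 1);
        var a = new WorkItem(1, "TestTask", created);
        var b = new WorkItem(1, "TestTask", created);

        Console.WriteLine(a == b);   // value equality: True
        // a.Id = 2;                 // does not compile: Id is init-only

        // "Changing" a value produces a new object; a is left untouched.
        var c = a with { Name = "Renamed" };
        Console.WriteLine(a.Name);   // TestTask
        Console.WriteLine(c.Name);   // Renamed
    }
}
```

Note that a record is only as immutable as its members: a mutable collection property such as the question's CalculatedValues would still need care.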
Now, if I can assume that each ICalculator is also implemented without state, and thus is a pure function, then the calculation is thread-safe. However, there is nothing in your question that lets me know that I can make this assumption. Because of that, there is no way that anyone can tell you that your code is thread-safe.
So, given the read-only WorkItem and the pure ICalculator function, the rest of your code looks like it would be perfectly fine.
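To make "pure" concrete, here is a minimal sketch of that combination. The trimmed WorkItem, the NameLengthCalculator and the Demo class are hypothetical names invented for illustration; only the ICalculator shape is taken from the updated question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical read-only work item (trimmed version of the class above).
public sealed class WorkItem
{
    public int Id { get; }
    public string Name { get; }
    public WorkItem(int id, string name) { Id = id; Name = name; }
}

public interface ICalculator
{
    IList<object> Calculate(WorkItem workItem);
}

// Pure: no fields, the result depends only on the argument,
// and neither the argument nor any other shared object is mutated.
public sealed class NameLengthCalculator : ICalculator
{
    public IList<object> Calculate(WorkItem workItem)
    {
        // A fresh list per call, so no two callers ever see the same list.
        return new List<object> { workItem.Id, workItem.Name.Length };
    }
}

public static class Demo
{
    public static async Task Main()
    {
        // One shared instance is fine here because it holds no state.
        ICalculator calculator = new NameLengthCalculator();

        Task<IList<object>>[] tasks = Enumerable.Range(0, 10)
            .Select(i => Task.Run(() => calculator.Calculate(new WorkItem(i, "TestTask"))))
            .ToArray();

        IList<object>[] results = await Task.WhenAll(tasks);

        foreach (var values in results)
            Console.WriteLine(string.Join(", ", values));
    }
}
```

Each task returns its result instead of writing it back onto the work item, so there is nothing to lock.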
Upvotes: 0
Reputation: 4777
1) Creating a new instance of a class, even one without public properties, does not provide any guarantee of thread safety. The problem is that ExecuteAllCalculators takes two object parameters. The WorkItem object contains mutable properties and the same WorkItem object is used for all ICalculator calls. Suppose one of the calculators decides to call Clear() on WorkItem.CalculatedValues. Or suppose one calculator sets WorkItem.Name to null and the next decides to do a WorkItem.Name.Length. This isn't technically a "threading" issue because those problems can occur without multiple threads involved.
2) Sharing calculator objects across threads is definitely not thread safe. Suppose one of the calculator instances uses a class-level variable. Unless that variable is somehow protected against concurrent access (example: lock {...}), it would be possible to produce inconsistent results. Depending on how "creative" the implementer of the calculator was, a deadlock could even be possible.
3) Any time your code accepts interfaces you are inviting people to "play in your sandbox". It allows code that you have little control of to be executed. One of the best ways to handle this is to use immutable objects. Unfortunately, you can't change the WorkItem definition without breaking your interface contract.
4) Calculators are passed by reference. The code shows that _myCalculators is shared across all tasks created. This doesn't guarantee that you will have problems, it only makes it possible that you might have problems.
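Points 2 and 4 can be sketched as follows. This is a minimal illustration under assumptions, not the asker's real code: CountingCalculator, its Calls property, the simplified Calculate(int) signature and the Demo class are all hypothetical. Option A keeps a shared instance and protects its class-level field; option B uses a Func<ICalculator> factory, along the lines the question suggests, so that nothing is shared.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public interface ICalculator
{
    bool Calculate(int workItemId);   // simplified, hypothetical signature
}

// Stateful: every call updates a class-level field.
public sealed class CountingCalculator : ICalculator
{
    private int _calls;
    public int Calls => Volatile.Read(ref _calls);

    public bool Calculate(int workItemId)
    {
        // A plain "_calls++" is a read-modify-write and can lose updates
        // when the instance is shared; Interlocked.Increment is atomic.
        Interlocked.Increment(ref _calls);
        return true;
    }
}

public static class Demo
{
    public static async Task Main()
    {
        // Option A: one shared instance that protects its own state.
        var shared = new CountingCalculator();
        await Task.WhenAll(Enumerable.Range(0, 1000)
            .Select(i => Task.Run(() => shared.Calculate(i))));
        Console.WriteLine(shared.Calls);   // 1000

        // Option B: a factory gives each task its own instance.
        Func<ICalculator> factory = () => new CountingCalculator();
        bool[] results = await Task.WhenAll(Enumerable.Range(0, 10)
            .Select(i => Task.Run(() => factory().Calculate(i))));
        Console.WriteLine(results.All(r => r));   // True
    }
}
```

Option B only helps if the calculators do not reach out to other shared objects (static fields, caches, the work item itself); the factory removes sharing of the calculator instance and nothing more.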
Upvotes: 0