Reputation: 334
class Student
{
public int ID { get; set; }
public string Name { get; set; }
public string Email { get; set; }
}
class Program
{
private static object _lockObj = new object();
static void Main(string[] args)
{
List<int> collection = Enumerable.Range(1, 100000).ToList();
List<Student> students = new List<Student>(100000);
var options = new ParallelOptions()
{
MaxDegreeOfParallelism = 1
};
var sp = System.Diagnostics.Stopwatch.StartNew();
sp.Start();
Parallel.ForEach(collection, options, action =>
{
lock (_lockObj)
{
var dt = collection.FirstOrDefault(x => x == action);
if (dt > 0)
{
Student student = new Student();
student.ID = dt;
student.Name = "Zoyeb";
student.Email = "[email protected]";
students.Add(student);
Console.WriteLine(@"value of i = {0}, thread = {1}",
action,Thread.CurrentThread.ManagedThreadId);
}
}
});
sp.Stop();
double data = Convert.ToDouble(sp.ElapsedMilliseconds / 1000);
Console.WriteLine(data);
}
}
I want to loop through 100000 records as quickly as possible i tried foreach loop but it is not quite good for loop through 100000 records then after i tried to implement Parallel.ForEach() that improved my performance , in real scenario i will have collection of Ids and i need to lookup into collection whether id exits or not if exits then add. performance is hitting in condition when i comment condition it took around 3 seconds to execute and when i uncomment condition it took around 24 seconds so my question is there any way i can boost my performance by looking up id in collection
//var dt = collection.FirstOrDefault(x => x == action);
//if (dt > 0)
//{
Student student = new Student();
student.ID = 1;
student.Name = "Zoyeb";
student.Email = "[email protected]";
students.Add(student);
Console.WriteLine(@"value of i = {0}, thread = {1}",
action,Thread.CurrentThread.ManagedThreadId);
//}
Upvotes: 1
Views: 3470
Reputation: 117029
Your original code is doing a lock
inside a Parallel.ForEach
. That's essentially taking the parallel code and forcing it to run in series.
It takes 40 seconds on my machine.
It is really the equivalent of doing this:
foreach (var action in collection)
{
var dt = collection.FirstOrDefault(x => x == action);
if (dt > 0)
{
Student student = new Student();
student.ID = dt;
student.Name = "Zoyeb";
student.Email = "[email protected]";
students.Add(student);
}
}
Which also takes 40 seconds.
However, if you just do this:
foreach (var action in collection)
{
Student student = new Student();
student.ID = action;
student.Name = "Zoyeb";
student.Email = "[email protected]";
students.Add(student);
}
That takes 1 millisecond to run. It's roughly 40,000 times quicker.
In this case you can get much faster loops by iterating your collection once, not in a nested way and not using Parallel.ForEach
.
My ap0ologies for missing that the bit about the id not existing.
Try this:
HashSet<int> hashSet = new HashSet<int>(collection);
List<Student> students = new List<Student>(100000);
var sp = System.Diagnostics.Stopwatch.StartNew();
sp.Start();
foreach (var action in collection)
{
if (hashSet.Contains(action))
{
Student student = new Student();
student.ID = action;
student.Name = "Zoyeb";
student.Email = "[email protected]";
students.Add(student);
}
}
sp.Stop();
That runs in 3 milliseconds.
An alternative is to use a join
like this:
foreach (var action in
from c in collection
join dt in collection on c equals dt
select dt)
{
Student student = new Student();
student.ID = action;
student.Name = "Zoyeb";
student.Email = "[email protected]";
students.Add(student);
}
That runs in 25 milliseconds.
Upvotes: 1
Reputation: 6111
You are using a lock instead of a concurrent collection in a parallel foreach. Forcing the parallel foreach to wait, to access the lock and thus execution will be one-by-one.
Change your list to a ConcurrentBag, and remove the lock
from the ParallelForEach
// using System.Collections.Concurrent; // at the top
var students = new ConcurrentBag<Student>()
FirstOrDefault()
Isn't very performant if you want to select by Id. Use a Dictionary. Since the dictionary does a Hashmatch its much faster then FirstOrDefault. See this questoin.
Change your collection to be a dictionary:
var collection = Enumerable.Range(1, 100000)
.ToDictionary(x=> x);
Change the access in your loop to:
if(collection.TryGetValue(action, out var dt))
{
//....
}
Stopwatch is not a benchmarking tool. Please use Benchmark.Net or a different library.
Upvotes: 0