Reputation: 805
I am kind of stuck again as I am unable to understand this.
So I have a class named CSVItem:
public class CSVItem
{
    public int SortedAccountNumber { get; set; }
    public DateTime Date { get; set; }
    public int SNO { get; set; }
    public string AccountNumber { get; set; }
    public double Value { get; set; }
    public int Year
    {
        get
        {
            if (Date.Month > MainWindow.fiscalMonth)
            {
                return Date.Year + 1;
            }
            return Date.Year;
        }
    }
    public int StaticCounter { get { return 1; } }
    public CSVItem(string accNo, DateTime date, double value, int sNo)
    {
        Value = value;
        Date = date;
        AccountNumber = accNo;
        SNO = sNo;
    }
}
I read a CSV file and build a List<CSVItem> with about 500k items. Then I sort it with LINQ's OrderBy and convert the sorted sequence back to a list. Here is the code:
List<CSVItem> items = new List<CSVItem>();
// ---- some code to read csv and load into items collection
List<CSVItem> vItems = items.OrderBy(r1 => r1.AccountNumber).ThenBy(r1 => r1.Date).ToList();
The sort and the conversion back to a list seem to take forever. I have loaded about a million records before and never had LINQ sorting become unresponsive like this, and it is driving me crazy. Any help or suggestions on where to look for the problem?
Upvotes: 1
Views: 2488
Reputation: 1561
You can use AsParallel() to your advantage.
List<CSVItem> vItems = items.AsParallel().OrderBy(r1 => r1.AccountNumber).ThenBy(r1 => r1.Date).ToList();
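Whether AsParallel() actually helps depends on the machine and the data, so it may be worth measuring both variants. The following is only a rough sketch: Row is a hypothetical stand-in for CSVItem reduced to the two sort keys, and the synthetic account numbers and dates are assumptions, not the asker's real data.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class SortTiming
{
    // Hypothetical stand-in for CSVItem, reduced to the two sort keys.
    public sealed class Row
    {
        public string AccountNumber { get; set; }
        public DateTime Date { get; set; }
    }

    // Builds synthetic rows with a fixed seed so repeated runs sort the same data.
    public static List<Row> MakeRows(int count, int seed = 42)
    {
        var rng = new Random(seed);
        var start = new DateTime(2015, 1, 1);
        var rows = new List<Row>(count);
        for (int i = 0; i < count; i++)
        {
            rows.Add(new Row
            {
                AccountNumber = rng.Next(0, 5000).ToString("D6"),
                Date = start.AddDays(rng.Next(0, 3650))
            });
        }
        return rows;
    }

    // The query from the question.
    public static List<Row> SortSequential(List<Row> rows) =>
        rows.OrderBy(r => r.AccountNumber).ThenBy(r => r.Date).ToList();

    // The same query with AsParallel() in front.
    public static List<Row> SortParallel(List<Row> rows) =>
        rows.AsParallel().OrderBy(r => r.AccountNumber).ThenBy(r => r.Date).ToList();

    public static void Main()
    {
        var rows = MakeRows(500000);

        var sw = Stopwatch.StartNew();
        var sequential = SortSequential(rows);
        sw.Stop();
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms ({sequential.Count} rows)");

        sw.Restart();
        var parallel = SortParallel(rows);
        sw.Stop();
        Console.WriteLine($"Parallel:   {sw.ElapsedMilliseconds} ms ({parallel.Count} rows)");
    }
}
```

The timings will vary from run to run, so treat them as a rough indication only.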
The question arose whether parallelizing OrderBy() has side effects when it is followed by a ThenBy().
When does AsParallel() split the IEnumerable? There are 2 possible answers. Let's take the given query:
items.AsParallel().OrderBy(x=>x.Age).ThenBy(x=>x.Size)
Option 1
The items get split, each part gets ordered by age and then by size on its own, and the parts are simply joined back into 1 list. The result would be sorted only within each part, which is obviously not what we want.
Option 2
The items get split and ordered in parallel, and the parts are merged back into 1 list that is ordered by age as a whole, with size applied as the secondary key across the whole list instead of per part. That's what we want.
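To make the difference between the two options concrete, here is a small sketch that simulates Option 1 by hand and compares it with one global ordering. It is not how PLINQ is implemented. The class name, the chunking helper and the use of (Age, Size) tuples are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionSortDemo
{
    // Simulates "Option 1" by hand: sort each chunk on its own, then just concatenate.
    // This is NOT PLINQ's implementation; it only illustrates the failure mode.
    public static List<(int Age, int Size)> SortChunksThenConcat(
        List<(int Age, int Size)> items, int chunkSize)
    {
        var result = new List<(int Age, int Size)>(items.Count);
        for (int start = 0; start < items.Count; start += chunkSize)
        {
            var chunk = items.Skip(start).Take(chunkSize)
                             .OrderBy(x => x.Age).ThenBy(x => x.Size);
            result.AddRange(chunk);
        }
        return result;
    }

    // One global ordering: Age first, Size only as a tie-breaker.
    public static List<(int Age, int Size)> SortGlobally(List<(int Age, int Size)> items) =>
        items.OrderBy(x => x.Age).ThenBy(x => x.Size).ToList();

    public static void Main()
    {
        var items = new List<(int Age, int Size)>
        {
            (23, 42), (1, 12), (7, 32), (2, 1), (5, 155)
        };
        Console.WriteLine(string.Join(", ", SortChunksThenConcat(items, 3)));
        Console.WriteLine(string.Join(", ", SortGlobally(items)));
    }
}
```

With a chunk size of 3, the first line is ordered only inside each chunk, while the second line is ordered across the whole list.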
I created a small example to check which one is true.
using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    static void Main(string[] args)
    {
        List<TestItem> items = new List<TestItem>();
        List<TestItem> itemsNonParallel = new List<TestItem>();
        items.Add(new TestItem() { Age = 1, Size = 12 });
        items.Add(new TestItem() { Age = 2, Size = 1 });
        items.Add(new TestItem() { Age = 5, Size = 155 });
        items.Add(new TestItem() { Age = 23, Size = 42 });
        items.Add(new TestItem() { Age = 7, Size = 32 });
        items.Add(new TestItem() { Age = 9, Size = 22 });
        items.Add(new TestItem() { Age = 34, Size = 11 });
        items.Add(new TestItem() { Age = 56, Size = 142 });
        items.Add(new TestItem() { Age = 300, Size = 13 });
        itemsNonParallel.Add(new TestItem() { Age = 1, Size = 12 });
        itemsNonParallel.Add(new TestItem() { Age = 2, Size = 1 });
        itemsNonParallel.Add(new TestItem() { Age = 5, Size = 155 });
        itemsNonParallel.Add(new TestItem() { Age = 23, Size = 42 });
        itemsNonParallel.Add(new TestItem() { Age = 7, Size = 32 });
        itemsNonParallel.Add(new TestItem() { Age = 9, Size = 22 });
        itemsNonParallel.Add(new TestItem() { Age = 34, Size = 11 });
        itemsNonParallel.Add(new TestItem() { Age = 56, Size = 142 });
        itemsNonParallel.Add(new TestItem() { Age = 300, Size = 13 });
        foreach (var item in items.AsParallel().OrderBy(x => x.Age).ThenBy(x => x.Size))
        {
            Console.WriteLine($"Age: {item.Age} Size: {item.Size}");
        }
        Console.WriteLine("---------------------------");
        foreach (var item in itemsNonParallel.OrderBy(x => x.Age).ThenBy(x => x.Size))
        {
            Console.WriteLine($"Age: {item.Age} Size: {item.Size}");
        }
        Console.ReadLine();
    }
}

public class TestItem
{
    public int Age { get; set; }
    public int Size { get; set; }
}
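Result
AsParallel() does what we want. Both loops print the items in the same order, so the parallel query returns one list that is ordered as a whole instead of separately sorted parts. This is consistent with Option 2: the ordering, including the ThenBy() key, applies to the merged result and not to each part on its own. I ran this multiple times and always got the same result.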
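One caveat about the example: every item in the sample data has a distinct Age, so ThenBy() never has to break a tie. A slightly stronger check, sketched below under my own assumptions, uses many duplicate Age values and compares the parallel result with the sequential one. The class name, the random data and the use of (Age, Size) tuples instead of TestItem are hypothetical choices. Tuples are compared by value, which keeps the check independent of how fully equal items are arranged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ThenByCheck
{
    // Builds (Age, Size) pairs where every Age occurs many times,
    // so the ThenBy key is actually needed to break ties.
    public static List<(int Age, int Size)> MakePairs(int count, int seed = 1)
    {
        var rng = new Random(seed);
        var pairs = new List<(int Age, int Size)>(count);
        for (int i = 0; i < count; i++)
        {
            pairs.Add((rng.Next(0, 10), rng.Next(0, 1000)));
        }
        return pairs;
    }

    // True when the parallel query yields the same key order as the sequential one.
    public static bool ParallelMatchesSequential(List<(int Age, int Size)> pairs)
    {
        var sequential = pairs.OrderBy(p => p.Age).ThenBy(p => p.Size).ToList();
        var parallel = pairs.AsParallel().OrderBy(p => p.Age).ThenBy(p => p.Size).ToList();
        return sequential.SequenceEqual(parallel);
    }

    public static void Main()
    {
        var pairs = MakePairs(100000);
        Console.WriteLine(ParallelMatchesSequential(pairs));
    }
}
```

If the parallel query orders the whole sequence by Age and then Size, the two lists match and the program reports a match.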
Upvotes: 2