Reputation: 113
I have a datatable containing thousands of rows. in the table there is a serial number column and a test number column. If a serial is tested more than one the test number increments. I need to be able to select the most recent test for each serial from my datatable and insert it into another datatable. Currently I am using this:
DataTable newdata = data.AsEnumerable().Where(x => x.Field<Int16>("Test") ==
data.AsEnumerable().Where(y => y.Field<string>("Serial") ==
x.Field<string>("SerialNumber")).Select(y =>
y.Field<Int16>("Test")).Max()).Select(x => x).CopyToDataTable();
This does do the job however as it is quite clear it is incredibly inefficient. Is there a more efficient way to select the top row of data for each serial number?
Thank you
Solution
So following on from Cam Bruce's answer I implemented the following code with a Dictionary rather than with a join:
//Get all of the serial numbers and there max test numbers
Dictionary<string, Int16> dict = data.AsEnumerable().GroupBy(x => x.Field<string>("SerialNumber")).ToDictionary(x => x.Key, x => x.Max(y => y.Field<Int16>("Test")));
//Create a datatable with only the max rows
DataTable newdata = data.AsEnumerable().Where(x => x.Field<Int16>("Test") ==
dict[x.Field<string>("SerialNumber")]).Select(x => x).CopyToDataTable();
//Clear the dictionary
dict.Clear();
Upvotes: 0
Views: 198
Reputation: 5689
This will give you each serial number, and the Max
test. You can then join that result set back to the DataTable
to get all the max rows.
var maxTest= data.AsEnumerable()
.GroupBy(g=> g.Field<string>("SerialNumber"))
.Select(d=> new
{
SerialNumber = g.Key
Test = g.Max(g.Field<Int16>("Field"))
};
var maxRows = from d in data.AsEnumerable()
join m in maxTest
on new { S = d.Field<string>("SerialNumber"), T = d.Field<Int16>("Test") }
equals new { S = m.SerialNumber, T = m.Test }
select d;
Upvotes: 1