Reputation: 583
This LINQ query is very slow:
IEnumerable<string> iedrDataRecordIDs = dt1.AsEnumerable()
.Where(x => x.Field<string>(InputDataSet.Column_Arguments_Name) == sArgumentName
&& x.Field<string>(InputDataSet.Column_Arguments_Value) == sArgumentValue)
.Select(x => x.Field<string>(InputDataSet.Column_Arguments_RecordID));
IEnumerable<string> iedrDataRecordIDs_Filtered = dt2.AsEnumerable()
.Where(x => iedrDataRecordIDs.Contains(
x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID))
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Field)
== sDataRecordFieldField
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Value)
== sDataRecordFieldValue)
.Select(x => x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID));
IEnumerable<string> ieValue = dt2.AsEnumerable()
.Where(x => x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID)
== iedrDataRecordIDs_Filtered.FirstOrDefault()
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Field) == sFieldName)
.Select(x => x.Field<string>(InputDataSet.Column_DataRecordFields_Value));
if (!ieValue.Any()) //very slow at this point
return iedrDataRecordIDs_Filtered.FirstOrDefault();
This change speeds it up by a factor of 10 or more:
string sRecordID = dt2.AsEnumerable()
.Where(x => iedrDataRecordIDs.Contains(
x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID))
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Field)
== sDataRecordFieldField
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Value)
== sDataRecordFieldValue)
.Select(x => x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID))
.FirstOrDefault();
IEnumerable<string> ieValue = dt2.AsEnumerable()
.Where(x => x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID) == sRecordID
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Field) == sFieldName)
.Select(x => x.Field<string>(InputDataSet.Column_DataRecordFields_Value));
if (!ieValue.Any()) //very fast at this point
return sRecordID;
The only change is that I store the result directly in a new variable and build the where clause with this value instead of with a LINQ query (which should be evaluated when needed). But LINQ seems to evaluate it in a bad way here, or am I doing something wrong?
Here are some values from my data:
dt1.Rows.Count 142
dt1.Columns.Count 3
dt2.Rows.Count 159
dt2.Columns.Count 3
iedrDataRecordIDs.Count() 1
iedrDataRecordIDs_Filtered.Count() 1
ieValue.Count() 1
Upvotes: 1
Views: 356
Reputation: 28708
You're asking why
IEnumerable<string> iedrDataRecordIDs_Filtered = data;
foreach (var item in collection)
{
// do something with
iedrDataRecordIDs_Filtered.FirstOrDefault();
}
is slower than
string sRecordID = data.FirstOrDefault();
foreach (var item in collection)
{
// do something with
sRecordID;
}
Very simply, because you're re-evaluating the iedrDataRecordIDs_Filtered query every time you call FirstOrDefault() on it. It isn't a concrete object; it's a lazily evaluated enumerable, which is really just a function that returns some objects. Every time you enumerate it, the function runs again and you pay that execution cost. In your first version that call sits inside the Where predicate, so it runs once for every row of dt2.
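To make the cost visible, here is a minimal sketch that counts how often a filter predicate runs. The class and method names are hypothetical, and the 159 rows only mirror the dt2 row count from the question; the data itself is made up, with the single match placed last so that every lookup has to scan all rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the Where predicate runs when FirstOrDefault()
    // is called `lookups` times on a query over `rowCount` items.
    public static int CountPredicateCalls(int rowCount, int lookups, bool materialize)
    {
        int calls = 0;

        // Lazy query: nothing runs here. The only match is the last item.
        IEnumerable<int> query = Enumerable.Range(0, rowCount)
            .Where(x => { calls++; return x == rowCount - 1; });

        // ToList() runs the query once and stores the results in memory.
        IEnumerable<int> source = materialize ? query.ToList() : query;

        for (int i = 0; i < lookups; i++)
        {
            // On the lazy query this re-runs the filter from the start.
            source.FirstOrDefault();
        }

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountPredicateCalls(159, 159, materialize: false)); // 25281
        Console.WriteLine(CountPredicateCalls(159, 159, materialize: true));  // 159
    }
}
```

The lazy variant pays for the full scan on each of the 159 lookups (159 × 159 predicate calls), while the materialized variant pays for it once.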
If you materialize the query once with ToList():
IEnumerable<string> iedrDataRecordIDs_Filtered = dt2.AsEnumerable()...
var recordIDs = iedrDataRecordIDs_Filtered.ToList();
and then use recordIDs.FirstOrDefault() in place of iedrDataRecordIDs_Filtered.FirstOrDefault(), you'll see a huge performance increase.
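The same idea applies to the first query in the question: iedrDataRecordIDs is also lazy, so each Contains call on it re-runs the filter over dt1. Below is a rough sketch of the whole lookup with both results materialized. It uses plain tuples as hypothetical stand-ins for the DataTable rows, and it swaps in a HashSet for the ID list, which is my assumption rather than something either post uses; ToList() would work as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializedLookupDemo
{
    // Hypothetical stand-ins for the rows of dt1 (arguments) and dt2 (fields).
    public static string FindRecordId(
        IEnumerable<(string RecordId, string Name, string Value)> arguments,
        IEnumerable<(string RecordId, string Field, string Value)> fields,
        string argumentName, string argumentValue,
        string fieldName, string fieldValue)
    {
        // Run the first filter exactly once and keep the matching IDs in a set.
        var recordIds = new HashSet<string>(
            arguments.Where(a => a.Name == argumentName && a.Value == argumentValue)
                     .Select(a => a.RecordId));

        // FirstOrDefault() runs the second filter once and returns a plain
        // string (or null), so later comparisons do not re-run any query.
        return fields
            .Where(f => recordIds.Contains(f.RecordId)
                     && f.Field == fieldName
                     && f.Value == fieldValue)
            .Select(f => f.RecordId)
            .FirstOrDefault();
    }

    public static void Main()
    {
        var arguments = new[] { ("r1", "color", "red"), ("r2", "color", "blue") };
        var fields = new[] { ("r1", "size", "L"), ("r2", "size", "M") };

        Console.WriteLine(FindRecordId(arguments, fields, "color", "blue", "size", "M")); // r2
    }
}
```

With row counts as small as those in the question, the set itself matters little; the gain comes from evaluating each query once instead of once per row.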
Upvotes: 3