Reputation: 1003
Please help me optimize the code below. I have tried different methods but I am not getting a significant performance improvement. There are around 30k entries in the database and it takes around 1 minute to load locally.
var alarms = from healthIssue in _context.HealthIssues.AsNoTracking()
             join asset in _context.Assets.AsNoTracking() on healthIssue.AssetNumber equals asset.SerialNumber into joinedTable
             from data in joinedTable.DefaultIfEmpty()
             select new
             {
                 ID = healthIssue.ID,
                 AssetNumber = healthIssue.AssetNumber,
                 AlarmName = healthIssue.AlarmName,
                 Crew = data.Crew,
             };
//alarmsViewModelList count is 30k
var alarmsViewModelList = await alarms.ToListAsync();
//groupedData count = 12k
var groupedData = alarmsViewModelList.Select(c => new { c.AssetNumber, c.AlarmName }).Distinct().ToList();
// filteralarms' count = 20k
var filteralarms = alarmsViewModelList.Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown).ToList();
for (int j = 0; j < groupedData.Count; j++)
{
    var alarm = groupedData[j];
    //The line is actually slowing the code.
    var alarmlist = filteralarms.AsEnumerable()
        .Where(c => c.AlarmName == alarm.AlarmName && c.AssetNumber == alarm.AssetNumber)
        .Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        })
        .OrderByDescending(c => c.AlarmLastUpdateDateTime)
        .ToList();
    int alarmCount = alarmlist.Count;
    if (alarmCount > 1)
    {
        businessLogicFunction(alarmlist);
    }
}
Upvotes: 4
Views: 1834
Reputation: 744
If you populate a dictionary whose keys are the predicate values concatenated and whose values are the selected alarm properties, you should be able to do much better.
var alarms = from healthIssue in _context.HealthIssues.AsNoTracking()
             join asset in _context.Assets.AsNoTracking() on healthIssue.AssetNumber equals asset.SerialNumber into joinedTable
             from data in joinedTable.DefaultIfEmpty()
             select new
             {
                 ID = healthIssue.ID,
                 AssetNumber = healthIssue.AssetNumber,
                 AlarmName = healthIssue.AlarmName,
                 Crew = data.Crew,
             };
//alarmsViewModelList count is 30k
var alarmsViewModelList = await alarms.ToListAsync();
//groupedData count = 12k
var groupedData = alarmsViewModelList
    .Select(c => new { c.AssetNumber, c.AlarmName })
    .Distinct()
    .ToList();
// filteralarms' count = 20k
var filteralarms = alarmsViewModelList
    .Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown)
    .ToList();
//populate an in-memory dictionary whose key is the where-clause predicate
var alarmDict = new Dictionary<string, List<Alarm>>();
foreach (var c in filteralarms)
{
    var key = c.AlarmName + "|" + c.AssetNumber;
    if (!alarmDict.TryGetValue(key, out var list))
    {
        list = new List<Alarm>();
        alarmDict[key] = list;
    }
    list.Add(new Alarm
    {
        HealthIssueID = c.ID,
        AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
        AlarmSeverityLevel = c.AlarmSeverityLevel
    });
}
for (int j = 0; j < groupedData.Count; j++)
{
    var alarm = groupedData[j];
    //use the dictionary for faster lookups; building the dictionary is now the more expensive operation
    var key = alarm.AlarmName + "|" + alarm.AssetNumber;
    if (alarmDict.TryGetValue(key, out var matches))
    {
        var alarmlist = matches
            .OrderByDescending(c => c.AlarmLastUpdateDateTime)
            .ToList();
        int alarmCount = alarmlist.Count;
        if (alarmCount > 1)
        {
            businessLogicFunction(alarmlist);
        }
    }
}
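For what it's worth, the hand-rolled dictionary is essentially what Enumerable.ToLookup builds in a single pass. A minimal self-contained sketch; the AlarmRow record and sample values here are invented for illustration, not taken from the question's model:

```csharp
using System;
using System.Diagnostics;
using System.Linq;

class LookupDemo
{
    // Invented stand-in for the real view model.
    record AlarmRow(int ID, string AssetNumber, string AlarmName, long AlarmLastUpdatedTime);

    static void Main()
    {
        var filteralarms = new[]
        {
            new AlarmRow(1, "A1", "Overheat", 100),
            new AlarmRow(2, "A1", "Overheat", 200),
            new AlarmRow(3, "A2", "LowOil",   150),
        };

        // ToLookup groups every row in one O(n) pass; each key access
        // afterwards is a hash lookup instead of a scan of the whole list.
        var byKey = filteralarms.ToLookup(c => (c.AssetNumber, c.AlarmName));

        var group = byKey[("A1", "Overheat")]
            .OrderByDescending(c => c.AlarmLastUpdatedTime)
            .ToList();

        Debug.Assert(group.Count == 2);
        Debug.Assert(group[0].ID == 2); // most recent entry first
        Console.WriteLine("ok");
    }
}
```

A tuple key also avoids the string concatenation, and a missing key simply yields an empty sequence rather than throwing a KeyNotFoundException.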
Upvotes: 0
Reputation: 116785
You are creating two lists derived from alarmsViewModelList:
groupedData, the distinct values of { alarm.AssetNumber, alarm.AlarmName }
filteralarms, all alarms with AlarmSeverityLevel != AlarmSeverityLevel.Unknown
After creating these two lists, you loop through the first and cross-reference it with the values in the second via a linear search. This is an n-squared operation. But since the two lists were originally created from the same source data alarmsViewModelList, you can maintain a list of original objects for each grouping key by using Enumerable.GroupBy() instead of Distinct(). Doing so completely eliminates the need for the n-squared cross-reference.
In addition, since you only want to pass alarms with a known severity level to your business logic function, you can filter for them upfront, before doing the grouping. This should improve performance linearly, depending on the number of alarms that are skipped.
Thus your code should look something like:
var groupedData = alarmsViewModelList
    .Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown)
    .GroupBy(c => new { c.AssetNumber, c.AlarmName })
    .Select(g => g.Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        })
        .OrderByDescending(c => c.AlarmLastUpdateDateTime)
        .ToList())
    .Where(l => l.Count > 1); // match the original guard: only batches with more than one alarm
foreach (var alarmList in groupedData)
    businessLogicFunction(alarmList);
Notes:
By calling ToListAsync() you are fetching all the data locally. You might want to try doing the filtering or grouping on the server side rather than the client side.
Demo fiddles using synthetic data generated with Enumerable.Range() are here:
The original code with elapsed time = 4.2335498 seconds here.
I had to tweak the code somewhat just to make it compile.
The modified code with elapsed time = 0.0393871 seconds here.
Upvotes: 1
Reputation: 4099
I am going to say that your question is the wrong question. It is good to write efficient business logic for processing your data, and often an efficient algorithm can improve performance by orders of magnitude. (Sorting lists is a great example here.)
However, if you want to make this go faster, you might look at optimizing your data as you pull it from the db. Modern databases are highly tuned to filter and join data very fast, by people who think about nothing but this. Provided that you have some decent indexing on your tables / blobs / graphs / whatever, you could include some clauses in your db query to filter out records that don't need to be processed.
Pulling 30k records and sending them over the wire is going to take (on db time scales) a lot of effort. I would hope that you can grab all this in a single query, as retrieving it in multiple pulls will take even longer.
I have no data on how long your query runs for, or what the load time is to transfer and deserialize your data. I would be willing to bet money, though, that if you filter dead records in your query and, say, cut your payload in half, you will gain a huge performance boost. Don't load records that you don't need to process. Also, if you can properly filter on the metal, your LINQ will probably just turn into a list that can be processed in O(n) time. Why tune your LINQ if you can tune the data it runs on?
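To make the "filter on the metal" point concrete, a rough sketch against the question's EF query might look like the following. The entity and property names are assumed from the question's code, not confirmed, and this only moves the severity filter to the server:

```csharp
// Sketch only: push the severity filter into the SQL the database executes,
// so records with an Unknown severity never cross the wire.
// HealthIssues/Assets and the property names are assumed from the question.
var alarms = from healthIssue in _context.HealthIssues.AsNoTracking()
             where healthIssue.AlarmSeverityLevel != AlarmSeverityLevel.Unknown
             join asset in _context.Assets.AsNoTracking()
                 on healthIssue.AssetNumber equals asset.SerialNumber into joinedTable
             from data in joinedTable.DefaultIfEmpty()
             select new
             {
                 ID = healthIssue.ID,
                 AssetNumber = healthIssue.AssetNumber,
                 AlarmName = healthIssue.AlarmName,
                 Crew = data.Crew,
             };

// Roughly the 20k relevant rows are materialized instead of all 30k.
var alarmsViewModelList = await alarms.ToListAsync();
```

The grouping could potentially be pushed server-side as well, but whether EF translates a given GroupBy depends on the provider and version, so measure first.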
Good luck.
Upvotes: 0
Reputation: 1
I'm thinking the issue is that you're looping over a Where clause and ToList 12k times - that's slowing you down. What if you change out this block:
for (int j = 0; j < groupedData.Count; j++)
{
    var alarm = groupedData[j];
    //The line is actually slowing the code.
    var alarmlist = filteralarms.AsEnumerable()
        .Where(c => c.AlarmName == alarm.AlarmName && c.AssetNumber == alarm.AssetNumber)
        .Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        })
        .OrderByDescending(c => c.AlarmLastUpdateDateTime)
        .ToList();
    int alarmCount = alarmlist.Count;
    if (alarmCount > 1)
    {
        businessLogicFunction(alarmlist);
    }
}
to a join? This will significantly improve your performance. Since you're ok with using query syntax, try using a more natural-looking join like this:
var alarmlist = from g in groupedData
                join f in filteralarms on g.AssetNumber equals f.AssetNumber
                where g.AlarmName == f.AlarmName
                select new
                {
                    HealthIssueID = f.ID,
                    AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(f.AlarmLastUpdatedTime),
                    AlarmSeverityLevel = f.AlarmSeverityLevel,
                };
int alarmCount = alarmlist.Count();
if (alarmCount > 1)
{
    businessLogicFunction(alarmlist.OrderByDescending(o => o.AlarmLastUpdateDateTime).ToList());
}
Hope that helps!
Upvotes: 0
Reputation: 5042
I don't see any answers that optimize the alarmsViewModelList query itself: you already fetch 30K records from the database and then try to filter the alarms in application memory, but why not filter them on the database side?
It is worth mentioning that AsNoTracking() is useless here, because the result is an anonymous type and EF only tracks entities.
I think the OP didn't clarify these points:
- Which tables do AlarmSeverityLevel and AlarmLastUpdateDateTime come from? (I assume the first table.)
- Crew is selected but not used later; why? Can we remove it?
So the statement could look like this:
var alarms = Context.HealthIssues
    .Where(x => x.AlarmSeverityLevel != AlarmSeverityLevel.Unknown)
    .Select(x => new { HealthIssueID = x.Id, x.AssetNumber, x.AlarmName, x.AlarmSeverityLevel, x.AlarmLastUpdateDateTime })
    .GroupBy(x => new { x.AssetNumber, x.AlarmName })
    .Select(x => new
    {
        ItemsCount = x.Count(),
        Items = x.Select(s => new
        {
            AlarmSeverityLevel = s.AlarmSeverityLevel,
            AlarmLastUpdatedTime = s.AlarmLastUpdateDateTime, //to avoid a translation error; can convert later
            HealthIssueID = s.HealthIssueID
        }).OrderByDescending(o => o.AlarmLastUpdatedTime)
    }).ToList();
for (int i = 0; i < alarms.Count; i++)
{
    var alarm = alarms[i];
    //you can convert AlarmLastUpdatedTime here for alarm.Items
    if (alarm.ItemsCount > 1)
        businessLogicFunction(alarm.Items);
}
I can't run this query, so please let me know if there are any issues.
Upvotes: 0
Reputation: 155
Avoid using ToList() and AsEnumerable() unnecessarily, as these operations can be expensive. Try to use the IQueryable interface to filter and sort the data, so that the database can handle these operations for you, e.g.:
var alarms = from healthIssue in _context.HealthIssues.AsNoTracking()
             join asset in _context.Assets.AsNoTracking() on healthIssue.AssetNumber equals asset.SerialNumber into joinedTable
             from data in joinedTable.DefaultIfEmpty()
             select new
             {
                 ID = healthIssue.ID,
                 AssetNumber = healthIssue.AssetNumber,
                 AlarmName = healthIssue.AlarmName,
                 Crew = data.Crew,
             };
Instead of using a for loop to iterate over groupedData, you can use foreach, which is generally more efficient because it avoids the indexing overhead.
If possible, try to move the businessLogicFunction call so that it is only made once for each distinct alarm.AlarmName and alarm.AssetNumber combination.
// filteralarms count = 20k
var filteralarms = alarms.Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown);
// groupedData count = 12k
var groupedData = filteralarms.Select(c => new { c.AssetNumber, c.AlarmName }).Distinct();
foreach (var alarm in groupedData)
{
    // alarmlist count = ?
    var alarmlist = filteralarms
        .Where(c => c.AlarmName == alarm.AlarmName && c.AssetNumber == alarm.AssetNumber)
        .Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        })
        .OrderByDescending(c => c.AlarmLastUpdateDateTime);
    if (alarmlist.Any())
    {
        businessLogicFunction(alarmlist);
    }
}
Upvotes: 0
Reputation: 101483
You basically group data by AlarmName + AssetNumber, filter out alarms with severity level Unknown, then run the business function on the grouped batches (after minor adjustment). A more efficient approach would be something like this:
var grouped = alarmsViewModelList
    // throw away unknown, you are not using them anywhere
    .Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown)
    // group by AssetNumber + AlarmName
    .GroupBy(c => new { c.AssetNumber, c.AlarmName })
    .Select(gr => new
    {
        gr.Key.AlarmName,
        gr.Key.AssetNumber,
        // convert the batch of this group to the desired form
        Items = gr.Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        }).OrderByDescending(c => c.AlarmLastUpdateDateTime).ToList()
    });
foreach (var data in grouped)
{
    if (data.Items.Count > 1)
    {
        businessLogicFunction(data.Items);
    }
}
Upvotes: 4
Reputation: 1
I would go with these optimizations in your code. (Note that alarmsViewModelList is already an in-memory list, so the async operators don't apply; plain ToList() does.)
var groupedData = alarmsViewModelList.GroupBy(c => new { c.AssetNumber, c.AlarmName }).ToList();
var filteralarms = alarmsViewModelList.Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown).ToList();
foreach (var alarm in groupedData)
{
    var alarmlist = filteralarms
        .Where(c => c.AlarmName == alarm.Key.AlarmName && c.AssetNumber == alarm.Key.AssetNumber)
        .Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        }).OrderByDescending(c => c.AlarmLastUpdateDateTime).ToList();
}
Upvotes: 0
Reputation: 154
This is what I can do with LINQ.
//alarmsViewModelList count is 30k
var alarmsViewModelList = await alarms.ToListAsync();
//groupedData is almost 12k
var groupedData = alarmsViewModelList.Select(c => new { c.AssetNumber, c.AlarmName }).Distinct().ToList();
// filteralarms' count is almost 20k
var filteralarms = alarmsViewModelList
    .Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown)
    .OrderByDescending(c => DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime))
    .ToList();
for (int j = 0; j < groupedData.Count; j++)
{
    var alarm = groupedData[j];
    var alarmlist = filteralarms.Where(c => c.AlarmName == alarm.AlarmName && c.AssetNumber == alarm.AssetNumber);
    if (alarmlist.Count() > 1)
    {
        businessLogicFunction(alarmlist.Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        }).ToList());
    }
    filteralarms = filteralarms.Where(c => c.AlarmName != alarm.AlarmName || c.AssetNumber != alarm.AssetNumber).ToList();
}
The above code is O(2n), I think. And if you can, you can make it faster by removing the ToList() passed to businessLogicFunction, like:
businessLogicFunction(alarmlist.Select(c => new
{
    HealthIssueID = c.ID,
    AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
    AlarmSeverityLevel = c.AlarmSeverityLevel,
}));
I changed it to use an index instead of Skip, which is much faster. An even faster approach is to order both lists and walk through them together, like this:
//alarmsViewModelList count is 30k
var alarmsViewModelList = alarms.ToList();
// here groupedData looks like {(1,1),(2,1),(3,1),(4,1),(5,1),(6,1)}, because the list is ordered by AssetNumber then by AlarmName
var groupedData = alarmsViewModelList.Select(c => new { c.AssetNumber, c.AlarmName }).Distinct().OrderBy(c => c.AssetNumber).ThenBy(c => c.AlarmName).ToList();
// here filteralarms looks like {(1,1),(1,1),(1,1),(2,1),(2,1),(3,1),(3,1),(3,1),(4,1)...}
var filteralarms = alarmsViewModelList.Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown).OrderBy(c => c.AssetNumber).ThenBy(c => c.AlarmName).ToList();
int k = 0;
for (int j = 0; j < groupedData.Count; j++)
{
    var alarm = groupedData[j];
    var alarmlist = new List<Alarm>();
    for (; k < filteralarms.Count; k++)
    {
        if (filteralarms[k].AlarmName == alarm.AlarmName && filteralarms[k].AssetNumber == alarm.AssetNumber)
        {
            alarmlist.Add(filteralarms[k]);
        }
        else
        {
            break;
        }
    }
    if (alarmlist.Count > 1)
    {
        businessLogicFunction(alarmlist.Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = c.AlarmLastUpdatedTime,
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        }).OrderByDescending(c => c.AlarmLastUpdateDateTime).ToList());
    }
}
The above code is O(n), I think.
Upvotes: 3
Reputation: 26
A simple logical performance gain would be to remove resolved alarms as you go:
...
var alarm = groupedData[j];
// process the matching alarms, then remove them from filteralarms so later iterations scan a shorter list
var matchingAlarms = filteralarms.Where(c => c.AlarmName == alarm.AlarmName && c.AssetNumber == alarm.AssetNumber).ToList();
filteralarms = filteralarms.Except(matchingAlarms).ToList();
var alarmlist = matchingAlarms.Select(c => new
{
...
Upvotes: 0
Reputation: 81
Try it with AsNoTracking; if there is an OrderBy/OrderByDescending, apply it at the end.
var iq_filteralarms = alarmsViewModelList.Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown).AsNoTracking(); /* IQueryable */
foreach (var item in alarmsViewModelList.Select(c => new { c.AssetNumber, c.AlarmName }).Distinct())
{
    var iq_alarmlist = iq_filteralarms
        .Where(c => c.AlarmName == item.AlarmName && c.AssetNumber == item.AssetNumber)
        .Select(c => new { c.ID, c.AlarmLastUpdatedTime, c.AlarmSeverityLevel });
    if (iq_alarmlist.Count() > 1)
    {
        businessLogicFunction(iq_alarmlist.AsEnumerable().Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        }).OrderByDescending(c => c.AlarmLastUpdateDateTime));
    }
}
Upvotes: 0