Reputation: 2636
Morning guys.
Using C# on .NET 4, and MS Visual Studio 2010.
I have developed a duplication checker for my Windows Forms program. It works perfectly and is virtually instant on my DataGridView when there are a couple hundred records.
The problem I've noticed is that when there are 6000 records displayed it is not efficient enough at all and takes minutes.
I was wondering if anyone has some good tips to make this method a lot faster, either by improving the existing design or with a different method altogether that I've overlooked.
Your help is once again much appreciated!
Here's the code:
public void CheckForDuplicate()
{
    DataGridViewRowCollection coll = ParetoGrid.Rows;
    DataGridViewRowCollection colls = ParetoGrid.Rows;
    IList<String> listParts = new List<String>();
    int count = 0;
    foreach (DataGridViewRow item in coll)
    {
        foreach (DataGridViewRow items in colls)
        {
            count++;
            if ((items.Cells["NewPareto"].Value != null) && (items.Cells["NewPareto"].Value != DBNull.Value))
            {
                if ((items.Cells["NewPareto"].Value != DBNull.Value) && (items.Cells["NewPareto"].Value != null) && (items.Cells["NewPareto"].Value.Equals(item.Cells["NewPareto"].Value)))
                {
                    if ((items.Cells["Part"].Value != DBNull.Value) && (items.Cells["Part"].Value != null) && !(items.Cells["Part"].Value.Equals(item.Cells["Part"].Value)))
                    {
                        listParts.Add(items.Cells["Part"].Value.ToString());
                        dupi = true; //boolean toggle
                    }
                }
            }
        }
    }
    MyErrorGrid.DataSource = listParts.Select(x => new { Part = x }).ToList();
}
If you have any questions, let me know and I will do my best to answer them.
Upvotes: 0
Views: 320
Reputation: 42343
If you can, you should try to do this on the underlying data rather than on the UI objects. However, I have a hunch that you're seeding it from a set of DataRows, in which case you might not be able to do that.
I think a big part of the issue here is the repeated dereferencing of the cells by name, and the fact that you repeatedly dereference the second set of cells. So do it all up front:
var first = (from row in coll.Cast<DataGridViewRow>()
             let newpareto = row.Cells["NewPareto"].Value ?? DBNull.Value
             let part = row.Cells["Part"].Value ?? DBNull.Value
             where newpareto != DBNull.Value && part != DBNull.Value
             select new
             { newpareto = newpareto, part = part }).ToArray();
//identical - so a copy-paste job (if not using anonymous type we could refactor)
var second = (from row in colls.Cast<DataGridViewRow>()
              let newpareto = row.Cells["NewPareto"].Value ?? DBNull.Value
              let part = row.Cells["Part"].Value ?? DBNull.Value
              where newpareto != DBNull.Value && part != DBNull.Value
              select new
              { newpareto = newpareto, part = part }).ToArray();
//now produce our list of strings
var listParts = (from f in first
                 where second.Any(v => v.newpareto.Equals(f.newpareto)
                                       && !v.part.Equals(f.part))
                 select f.part.ToString()).ToList(); //if you want it as a list.
Upvotes: 1
Reputation: 61402
There is an approach that will make this much more efficient. You need to compute a hash of each item. Items with different hashes can't possibly be duplicates.
Once you have the hashes, you could either sort by hash or use a data structure with efficient keyed retrieval (like Dictionary<TKey,TValue>) to find all the duplicates.
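As a rough sketch of the keyed-retrieval idea (not tested against the asker's grid), something like the following buckets the parts by their NewPareto value in one pass, so the work grows roughly linearly with the row count instead of quadratically. The DuplicateFinder class, the FindConflictingParts method, and the choice to pass each row as a KeyValuePair of strings (NewPareto as key, Part as value) are all assumptions made for illustration; the real cell values would first have to be read out of the DataGridView and converted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original post.
public static class DuplicateFinder
{
    // rows: one pair per grid row, Key = NewPareto value, Value = Part.
    // Both are assumed to be strings here; null stands in for null/DBNull cells.
    public static List<string> FindConflictingParts(
        IEnumerable<KeyValuePair<string, string>> rows)
    {
        // Single pass: collect the distinct parts seen for each NewPareto value.
        var partsByPareto = new Dictionary<string, HashSet<string>>();
        foreach (var row in rows)
        {
            if (row.Key == null || row.Value == null)
                continue; // skip empty cells, as the original checks do

            HashSet<string> parts;
            if (!partsByPareto.TryGetValue(row.Key, out parts))
            {
                parts = new HashSet<string>();
                partsByPareto[row.Key] = parts;
            }
            parts.Add(row.Value);
        }

        // A NewPareto value shared by two or more distinct parts is a conflict.
        return partsByPareto.Values
            .Where(parts => parts.Count > 1)
            .SelectMany(parts => parts)
            .ToList();
    }
}
```

One difference from the nested loops in the question: this sketch reports each conflicting part once, whereas the original adds a part every time it meets another row with the same NewPareto. If the repeated entries matter, the buckets could hold a List<string> instead of a HashSet<string>.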
Upvotes: 1