Reputation: 7334
class infoContact
{
private string contacts_first_nameField;
private string contacts_middle_nameField;
private string contacts_last_nameField;
private Phonenumber[] phone_numbersField;
private Emailaddress[] emailField;
}
I have a List<infoContact>
The list contains almost 7000 entries, which I get from another program. About 6500 of those 7000 are duplicates, and I am looking for a way to eliminate them.
An infoContact is a duplicate if its first name, last name, email addresses, and phone numbers are the same as another's.
I thought of using a HashSet<infoContact> and overriding GetHashCode() of infoContact.
I am curious whether that is the best way to do it. If not, what is a better way?
Upvotes: 0
Views: 287
Reputation: 1537
Have your class infoContact implement IEquatable<infoContact>:
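using System;
using System.Linq;

class InfoContact : IEquatable<InfoContact> {
    string contacts_first_nameField;
    string contacts_last_nameField;
    object[] phone_numbersField;
    object[] emailField;
    // other fields
    public bool Equals(InfoContact other) {
        if (ReferenceEquals(other, null)) return false;
        // Array.Equals compares references, so compare the arrays element by element.
        // This relies on the element types defining a meaningful Equals.
        return contacts_first_nameField == other.contacts_first_nameField
            && contacts_last_nameField == other.contacts_last_nameField
            && phone_numbersField.SequenceEqual(other.phone_numbersField)
            && emailField.SequenceEqual(other.emailField);
    }
    public override bool Equals(object obj) { return Equals(obj as InfoContact); }
    // Distinct() compares hash codes before calling Equals, so equal contacts must hash alike.
    public override int GetHashCode() {
        unchecked {
            return (contacts_first_nameField?.GetHashCode() ?? 0) * 31
                 + (contacts_last_nameField?.GetHashCode() ?? 0);
        }
    }
}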
and use LINQ's Enumerable.Distinct method to filter out the duplicates:
var infoContacts = GetInfoContacts().Distinct();
Upvotes: 0
Reputation: 1898
I once wrote a class that removes duplicate items from a list; here is the key part of it:
List<string> list = new List<string>();
foreach (string line in File.ReadAllLines("somefile.txt"))
{
if (!list.Contains(line))
{
list.Add(line);
}
}
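The loop above calls List.Contains for every line, which scans the whole list each time. A minimal sketch of the same idea using a HashSet<string> to remember what has been seen; the names LineDedup and Unique are made up for this illustration:

```csharp
using System.Collections.Generic;

static class LineDedup
{
    // Keeps the first occurrence of each line, in the original order.
    // LineDedup and Unique are hypothetical names used only for this sketch.
    public static List<string> Unique(IEnumerable<string> lines)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (var line in lines)
        {
            // HashSet<T>.Add returns false when the item is already present.
            if (seen.Add(line))
            {
                result.Add(line);
            }
        }
        return result;
    }
}
```

For example, LineDedup.Unique(File.ReadAllLines("somefile.txt")) would give the same result as the loop above.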
Upvotes: 1
Reputation: 10400
First, think in terms of extracting the unique values. You could use the LINQ Distinct() method with a comparer like:
public class infoContactComparer : IEqualityComparer<infoContact>
{
public bool Equals(infoContact x, infoContact y)
{
return x.contacts_first_nameField == y.contacts_first_nameField
&& x.contacts_last_nameField == y.contacts_last_nameField
&& ...
}
public int GetHashCode(infoContact obj)
{
return obj.contacts_first_nameField.GetHashCode();
}
}
Upvotes: 0
Reputation: 21520
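The right way is to override the Equals method (together with GetHashCode)!
That way, when you add an element to a HashSet<infoContact>, an element equal to one already present will not be added. Note that a plain List<T> does not check for duplicates on Add.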
Upvotes: 0
Reputation: 126864
Two options: override GetHashCode and Equals if you control the source of infoContact and your overrides will be true for any particular use of the class.
Otherwise, define a class implementing IEqualityComparer<infoContact>, which also allows you to define proper Equals and GetHashCode methods, and then pass an instance of this into a HashSet<infoContact> constructor or into a listOfContacts.Distinct method call (using LINQ).
Note: your question seems to be based on the idea that GetHashCode should determine equality or uniqueness. It shouldn't! It's part of the tool that allows a HashSet to do its job, but it is not required to return unique values for unequal instances. The values should be well distributed, but they can ultimately overlap.
In short, two equal instances should have the same hash code, but two instances sharing the same hash code are not necessarily equal. For more on guidelines for GetHashCode, please visit this blog.
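To illustrate the point about hash codes, here is a small sketch with a deliberately bad comparer that gives every string the same hash code. The names ConstantHashComparer and HashDemo are invented for this example. The HashSet still tells the strings apart, because Equals makes the final decision; the constant hash only makes lookups slower.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: every string hashes to the same value on purpose.
sealed class ConstantHashComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    // Legal, but slow: all items land in the same bucket.
    public int GetHashCode(string obj)
    {
        return 1;
    }
}

static class HashDemo
{
    // Counts the distinct strings using the colliding comparer.
    public static int CountDistinct(IEnumerable<string> items)
    {
        var set = new HashSet<string>(items, new ConstantHashComparer());
        return set.Count;
    }
}
```

The reverse mistake is the harmful one: if two equal instances returned different hash codes, the set would likely never call Equals on them and both would be kept.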
Upvotes: 0
Reputation: 103740
You can use the Distinct extension method that takes an IEqualityComparer<T>. Just write a class that implements that interface and does the comparison, and then you can do something like this:
var filteredList = oldList.Distinct(new InfoContactComparer());
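A rough sketch of what such a comparer might look like. It uses a simplified stand-in class, Contact, with string arrays in place of Phonenumber[] and Emailaddress[]; that class, its field names, and the DedupDemo helper are assumptions made for this illustration, not the asker's real types. The arrays are compared element by element because Equals on an array only compares references.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-in for infoContact.
sealed class Contact
{
    public string FirstName = "";
    public string LastName = "";
    public string[] Phones = new string[0];
    public string[] Emails = new string[0];
}

sealed class InfoContactComparer : IEqualityComparer<Contact>
{
    public bool Equals(Contact x, Contact y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.FirstName == y.FirstName
            && x.LastName == y.LastName
            && x.Phones.SequenceEqual(y.Phones)   // element-wise and order-sensitive
            && x.Emails.SequenceEqual(y.Emails);
    }

    public int GetHashCode(Contact c)
    {
        // Built only from fields that Equals also compares,
        // so equal contacts always get equal hash codes.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (c.FirstName == null ? 0 : c.FirstName.GetHashCode());
            hash = hash * 31 + (c.LastName == null ? 0 : c.LastName.GetHashCode());
            foreach (var email in c.Emails)
            {
                hash = hash * 31 + (email == null ? 0 : email.GetHashCode());
            }
            return hash;
        }
    }
}

static class DedupDemo
{
    // Returns the contacts with duplicates removed, keeping first occurrences.
    public static List<Contact> Dedup(IEnumerable<Contact> contacts)
    {
        return contacts.Distinct(new InfoContactComparer()).ToList();
    }
}
```

This version treats the same phone numbers in a different order as a different contact; sorting the arrays before comparing would be one way to relax that, if it matters for the data.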
Upvotes: 5
Reputation: 3055
Override the Equals method so that it compares the fields you care about; then you can compare objects through Equals.
Upvotes: 1