Reputation: 679
I'm trying to write a query to find all potential duplicate records for a user. A duplicate is defined as any of the following:
I have data coming from two separate systems, from system B the externalID on my table will be populated with a GUID, from system A the externalID will be null. I'm only concerned with finding duplicates between the two systems, not duplicates within each system.
The query i have thus far:
Dim duplicates = db.People.Where(Function(c) c.CID = CID And c.DeleteFlag = False). _
Where(Function(p) p.EmailAddress <> Nothing And p.ExternalID <> Nothing) _
.GroupBy(Function(i) New With {i.EmailAddress, i.LastName, i.FirstName}) _
.Where(Function(g) g.Count() > 1).[Select](Function(g) g.Key)
Issues I am having:
Upvotes: 0
Views: 890
Reputation: 460208
Never use <>
or =
to compare with Nothing
, you 'll always get False
(similar as in sql). Nothing
isn't equal or unequal to anything else. Instead use IsNot
or Is Nothing
. Also use AndAlso
instead of And
(and OrElse
instead of Or
).
In VB.NET (as opposed to C#) you also have to specify the key-columns which are used for Equals
and GetHashCode
of the anonymous type.
Dim duplicates = db.People.Where(Function(c) c.CID = CID And c.DeleteFlag = False).
Where(Function(p) p.EmailAddress IsNot Nothing AndAlso p.ExternalID IsNot Nothing).
GroupBy(Function(i) New With {Key i.EmailAddress, Key i.LastName, Key i.FirstName}).
Where(Function(g) g.Count() > 1).
Select(Function(g) g.Key)
Upvotes: 2