Reputation: 31
I'm trying to find duplicated records in CSV file. Let me explain what I am trying to do:
Plate 1 has 2 A800 and 2 B500.
Plate 2 has 2 A800 and 2 B500 it's same as Plate 1. I want to detect this situation.
Plate 3 has 1 A800 and 1 B500.
I tried with 2 nested For Each Loop and stored them inside of 2 dimensional array but if numbers is high this approach is very slow.
What is your advice for this situation?
Upvotes: 0
Views: 127
Reputation: 278
Here's an example of using GroupBy. In your GroupBy, be certain use "Key" before every field. If you use
.GroupBy(Function(F) New With {.Plate = F.Plate,
.Part = F.Part}) _
the grouping will NOT work.
Private Class cPlate_Part
Public Plate As Integer
Public Part As String
Sub New(Plate As Integer, Part As String)
Me.Plate = Plate
Me.Part = Part
End Sub
End Class
Private Plate_Parts() As cPlate_Part =
New cPlate_Part() {New cPlate_Part(1, "A800"),
New cPlate_Part(1, "A800"),
New cPlate_Part(1, "B500"),
New cPlate_Part(1, "B500"),
New cPlate_Part(2, "B500"),
New cPlate_Part(1, "A800"),
New cPlate_Part(2, "A800"),
New cPlate_Part(2, "A800"),
New cPlate_Part(3, "A800")}
Private FindDuplicates()
Dim GroupedParts = Plate_Parts _
.GroupBy(Function(F) New With {Key .Plate = F.Plate,
Key .Part = F.Part}) _
.ToList ' If you don't want it re-evaluated every time you use it
Dim FindDuplicates = GroupedParts _
.Where(Function(F) F.Count > 1) _
.Select(Function(F) F(0))
Dim DistinctParts = GroupedParts _
.Select(Function(F) F(0))
For Each P In DistinctParts
Debug.Print($"Plate={P.Plate} Part={P.Part}")
Next
End Sub
The way GroupBy works is if you have a simple type (such as integer or string), you can simply group by that type. If it is more complex, create a New With and be sure to use keys as I have shown. The result is a iEnumerable of iGrouping, which acts mostly like an iEnumerable, but has a "Key" method.
For example
for each GroupOfParts in GroupedParts
dim Key=GroupOfParts.Key
Dim Group=GroupOfParts.ToList ' Usually Unnecessary
' Some code
next
EDIT:
Here is an alternate group by using alternate syntax
Dim GroupedParts_AlternateSyntax = From Parts In Plate_Parts
Group By Parts.Part, parts.plate Into Group
There is a subtle difference in how it works. You need to use ".Group" method for each item to get the group, or an explicit select
Dim GroupedParts_AlternateSyntax = From Parts In Plate_Parts
Group By Parts.Part, parts.plate Into Group
Select Group
Upvotes: 1