ThickAsAPlank
ThickAsAPlank

Reputation: 23

Count duplicate rows in a datatable using vb.net

I have a datatable called dtstore with 4 columns called section, department, palletnumber and uniquenumber. I am trying to make a new datatable called dtmulti which has an extra column called multi which shows the count for the number of duplicate rows...

dtstore

section | department | palletnumber | batchnumber

---------------------------------------------------

 pipes      2012          1234           21

 taps       2011          5678           345

 pipes      2012          1234           21

 taps       2011          5678           345

 taps       2011          5678           345

 plugs      2009          7643           63


dtmulti

section | department | palletnumber | batchnumber | multi

----------------------------------------------------------

 pipes      2012          1234           21           2

 taps       2011          5678           345          3

I have tried lots of approaches but my code always feels clumsy and bloated, is there an efficient way to do this?

Here is the code I am using:

Dim dupmulti = dataTable.AsEnumerable().GroupBy(Function(i) i).Where(Function(g) g.Count() = 2).Select(Function(g) g.Key)  

For Each row In dupmulti multirow("Section")  = dup("Section") 

multirow("Department") = dup("Department") 
multirow("PalletNumber") = dup("PalletNumber") 
multirow("BatchNumber") = dup("BatchNumber") 
multirow("Multi") = 2
    Next

Upvotes: 0

Views: 2562

Answers (1)

user2480047
user2480047

Reputation:

Assumptions of the code below these lines: the DataTable containing the original information is called dup. It might contain any number of duplicates and all of them can be defined by just looking at the first column.

'Creating final table from the columns in the original table
Dim multirow As DataTable = New DataTable

For Each col As DataColumn In dup.Columns
   multirow.Columns.Add(col.ColumnName, col.DataType)
Next
multirow.Columns.Add("multi", GetType(Integer))

'Looping though the groupped rows (= no duplicates) on account of the first column
For Each groups In dup.AsEnumerable().GroupBy(Function(x) x(0))

    multirow.Rows.Add()

    'Adding all the cells in the corresponding row except the last one
    For c As Integer = 0 To dup.Columns.Count - 1
        multirow(multirow.Rows.Count - 1)(c) = groups(0)(c)
    Next

    'Adding the last cell (duplicates count) 
    multirow(multirow.Rows.Count - 1)(multirow.Columns.Count - 1) = groups.Count

Next

Upvotes: 2

Related Questions