TonyW
TonyW

Reputation: 786

Remove duplicate rows from Data Table, but only if they match a list of strings

This will be a little hard for me to explain, but i'm having an issue removing duplicates (in a unique way, so reading a lot of google doc's isn't helping atm).

Let me show a picture of what i'm trying to do, in an effort to explain my issue a little better.

enter image description here Ok, I want to get every row that = textbox1.text or in this case... \joi\al\users. Once I have those rows, I want to grab the column(8).value string for that row into a list.

Example Circle 2.

Then iterate through every other row that does NOT = \joi\al\users, and grab the column(8).value. If the column(8).value is the same as any value in the previous list, remove that whole row from the datatable. With that being said, I want to ignore any row that = \joi\al\users.

Added a small datatable to give a better example of what i'm looking for per a comment I got.

Dim dt As New DataTable

Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load

    dt.Columns.Add("File Path", GetType(String))
    dt.Columns.Add("Audit", GetType(String))

    dt.Rows.Add("\\joi\al\users", "1")
    dt.Rows.Add("\\joi\al\users", "2")
    dt.Rows.Add("\\joi\al\users", "3")
    dt.Rows.Add("\\joi\al\users", "4")
    dt.Rows.Add("\\joi\al\users\otherfolder0", "1")
    dt.Rows.Add("\\joi\al\users\otherfolder1", "x")
    dt.Rows.Add("\\joi\al\users\otherfolder2", "x")
    dt.Rows.Add("\\joi\al\users\otherfolder3", "1")
    dt.Rows.Add("\\joi\al\users\otherfolder4", "2")
    dt.Rows.Add("\\joi\al\users\otherfolder5", "3")
    dt.Rows.Add("\\joi\al\users\otherfolder6", "4")

    Dim bs1 As New BindingSource
    bs1 = New BindingSource(dt, "")
    DataGridView1.DataSource = bs1

    'Rows with column(FilePath) = "\\joi\al\users" keep, and make note of column(Audit) 1,2,3,4.
    'Iterate through the rest of the rows, and remove any row with column(Audit) 1,2,3,4.

    'In this example, the only rows that would be left would be...

    'dt.Rows.Add("\\joi\al\users", "1")
    'dt.Rows.Add("\\joi\al\users", "2")
    'dt.Rows.Add("\\joi\al\users", "3")
    'dt.Rows.Add("\\joi\al\users", "4")
    'dt.Rows.Add("\\joi\al\users\otherfolder1", "x")
    'dt.Rows.Add("\\joi\al\users\otherfolder2", "x")

End Sub

Upvotes: 1

Views: 1643

Answers (1)

Given test data looking a bit like this:

enter image description here

There are only 2 "FilePath" versions, but that doesn't matter since you are excluding a specific one. Where that text comes from - text box, speech input or burning bush - doesn't matter.

There are 26 entries, 5 are exempt, all the exempts have "excelsior" or "foo" as the Audit item. First, a sort of step-by-step version; some can be combined as shown later:

Dim exempt = "This entry is exempt"

' get the exempt rows
Dim exemptRows = dtX.AsEnumerable.Where(Function(q) q.Field(Of String)(0) = exempt)

' select the Audit data in them
Dim dupeData = exemptRows.Select(Function(s) s.Field(Of String)("Audit")).ToList()

' get the rows that are NOT exempt AndAlso are contained in dupeData
Dim NonDupeRows = dtX.AsEnumerable.
            Where(Function(q) q.Field(Of String)(0) <> exempt AndAlso
                            dupeData.Contains(q.Field(Of String)("Audit")) = False).
                        ToList()
' add the exempt rows back  (dont like this - it reorders them)
NonDupeRows.AddRange(exemptRows.ToArray())
  ' create a table for your viewing pleasure
dgvF.DataSource = NonDupeRows.CopyToDataTable()

enter image description here

I havent checked every singe entry, but the exempt ones are there and there are no "excelsior" or "foo" items in the list; so it looks right. I might add an index/Id column to the original table which might expedite and would prevent them from being reordered in the result. I might also use a LookUp if there are many rows.


This is a shorter, probably faster version with many rows. One benefit is that it prevents the retained rows from being reordered:

' select the Audit data in the non exempt rows
Dim dupeData = dtX.AsEnumerable.Where(Function(z) z.Field(Of String)(0) = exempt).
            Select(Function(s) s.Field(Of String)("Audit")).
            ToLookup(Function(q) q)

' get rows which are exempt OrElse do not have dupe Audit data
Dim NonDupeTbl = dtX.AsEnumerable.
    Where(Function(z) (z.Field(Of String)(0) = exempt) OrElse
              dupeData.Contains(z.Field(Of String)("Audit")) = False).CopyToDataTable()

' create a table for your viewing pleasure
dgvF.DataSource = NonDupeTbl

Using the new sample data in the Edit to the question, the results pass:

enter image description here

Upvotes: 1

Related Questions