MainTank

Reputation: 25

How do I quickly delete duplicates between two Excel sheets with VBA?

I am using VBA and I have two sheets. One is named "Do Not Call" and has about 800,000 rows of data in column A. I want to use this data to check column I in the second sheet, named "Sheet1". If a value in column I matches, I want to delete the whole row in "Sheet1". I tailored the code I found in a similar question here, "Excel formula to Cross reference 2 sheets, remove duplicates from one sheet", and ran it, but nothing happens. I am not getting any errors, but it does not do anything.

Here is the code I am currently trying. I have no idea why it is not working:

Option Explicit
Sub CleanDupes()
Dim wsA As Worksheet
Dim wsB As Worksheet
Dim keyColA As String

Dim keyColB As String
Dim rngA As Range
Dim rngB As Range
Dim intRowCounterA As Integer
Dim intRowCounterB As Integer
Dim strValueA As String


keyColA = "A"
keyColB = "I"

intRowCounterA = 1
intRowCounterB = 1

Set wsA = Worksheets("Do Not Call")
Set wsB = Worksheets("Sheet1")

Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")

Do While Not IsEmpty(wsA.Range(keyColA & intRowCounterA).Value)
    Set rngA = wsA.Range(keyColA & intRowCounterA)
    strValueA = rngA.Value
    If Not dict.Exists(strValueA) Then
        dict.Add strValueA, 1
    End If
    intRowCounterA = intRowCounterA + 1
Loop

intRowCounterB = 1
Do While Not IsEmpty(wsB.Range(keyColB & intRowCounterB).Value)
    Set rngB = wsB.Range(keyColB & intRowCounterB)
    If dict.Exists(rngB.Value) Then
         wsB.Rows(intRowCounterB).delete
         intRowCounterB = intRowCounterB - 1
    End If
    intRowCounterB = intRowCounterB + 1
Loop
End Sub

I apologize if the above code is not in a code tag. This is my first time posting code online and I have no idea if I did it correctly.

Upvotes: 2

Views: 7350

Answers (2)

Daniel

Reputation: 13142

Because I had the time, here's a rewrite that forgoes the Dictionary and uses a worksheet function (CountIf) instead, inspired by the VLOOKUP comment. I'm not sure which would be faster.

Sub CleanDupes()
    Dim targetRange As Range, searchRange As Range
    Dim targetArray
    Dim x As Long
    'Update these 4 lines if your target and search ranges change
    Dim TargetSheetName As String: TargetSheetName = "Sheet1"
    Dim TargetSheetColumn As String: TargetSheetColumn = "I"
    Dim SearchSheetName As String: SearchSheetName = "Do Not Call"
    Dim SearchSheetColumn As String: SearchSheetColumn = "A"

    'Load target array
    With Sheets(TargetSheetName)
        Set targetRange = .Range(.Range(TargetSheetColumn & "1"), _
                .Range(TargetSheetColumn & .Rows.Count).End(xlUp))
        targetArray = targetRange
    End With
    'Get Search Range
    With Sheets(SearchSheetName)
        Set searchRange = .Range(.Range(SearchSheetColumn & "1"), _
                .Range(SearchSheetColumn & .Rows.Count).End(xlUp))
    End With
    If IsArray(targetArray) Then
        For x = UBound(targetArray) To 1 Step -1
            If Application.WorksheetFunction.CountIf(searchRange, _
                                        targetArray(x, 1)) Then
                targetRange.Cells(x).EntireRow.Delete
            End If
        Next
    Else
        If Application.WorksheetFunction.CountIf(searchRange, targetArray) Then
            targetRange.EntireRow.Delete
        End If
    End If
End Sub
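
Since it is unclear which version is faster, one way to find out is to time each on a copy of the workbook. The following is only a minimal sketch: TimeCleanDupes is a hypothetical name, and it assumes exactly one CleanDupes procedure is present in the project at a time. Timer returns the seconds elapsed since midnight, so a run that crosses midnight would report a wrong duration.

```vba
'Hypothetical timing harness; run it from a standard module.
Sub TimeCleanDupes()
    Dim startTime As Single

    startTime = Timer              'seconds since midnight
    CleanDupes                     'the routine being measured
    Debug.Print "CleanDupes took " & Format(Timer - startTime, "0.00") & " s"
End Sub
```

The result appears in the Immediate window (Ctrl+G in the VBA editor). Run it against a fresh copy of the data for each version, because the first run deletes the matching rows.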

Upvotes: 0

Daniel

Reputation: 13142

I'm embarrassed to admit that the code you shared confused me. Anyway, for practice I rewrote it using arrays instead of looping through the sheet values:

Option Explicit
Sub CleanDupes()
    Dim targetArray, searchArray
    Dim targetRange As Range
    Dim x As Long

    'Update these 4 lines if your target and search ranges change
    Dim TargetSheetName As String: TargetSheetName = "Sheet1"
    Dim TargetSheetColumn As String: TargetSheetColumn = "I"
    Dim SearchSheetName As String: SearchSheetName = "Do Not Call"
    Dim SearchSheetColumn As String: SearchSheetColumn = "A"

    'Load target array
    With Sheets(TargetSheetName)
        Set targetRange = .Range(.Range(TargetSheetColumn & "1"), _
                .Range(TargetSheetColumn & .Rows.Count).End(xlUp))
        targetArray = targetRange
    End With
    'Load Search Array
    With Sheets(SearchSheetName)
        searchArray = .Range(.Range(SearchSheetColumn & "1"), _
                .Range(SearchSheetColumn & .Rows.Count).End(xlUp))
    End With


    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")
    'Populate dictionary from search array
    If IsArray(searchArray) Then
        For x = 1 To UBound(searchArray)
            If Not dict.exists(searchArray(x, 1)) Then
                dict.Add searchArray(x, 1), 1
            End If
        Next
    Else
        If Not dict.exists(searchArray) Then
            dict.Add searchArray, 1
        End If
    End If

    'Delete rows with values found in dictionary
    If IsArray(targetArray) Then
        'Step backwards to avoid deleting the wrong rows.
        For x = UBound(targetArray) To 1 Step -1
            If dict.exists(targetArray(x, 1)) Then
                targetRange.Cells(x).EntireRow.Delete
            End If
        Next
    Else
        If dict.exists(targetArray) Then
            targetRange.EntireRow.Delete
        End If
    End If
End Sub
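
Deleting rows one at a time can be slow when many rows match. A variation that is sometimes used, and which is not part of the code above, is to collect the matching cells with Union and delete them in a single call, which also removes the need to step backwards. The sketch below is untested on data of this size: building a Union of many separate areas can itself become slow, so it may or may not help with 800,000 rows. CleanDupesBatched and To2DArray are hypothetical names, and the sheet and column names are the ones from the question.

```vba
'Hypothetical helper: returns the values of rng as a 2-D array,
'even when rng is a single cell (where .Value is not an array).
Private Function To2DArray(ByVal rng As Range) As Variant
    Dim result As Variant
    If rng.Cells.Count = 1 Then
        ReDim result(1 To 1, 1 To 1)
        result(1, 1) = rng.Value
    Else
        result = rng.Value
    End If
    To2DArray = result
End Function

Sub CleanDupesBatched()
    Dim targetRange As Range, searchRange As Range, toDelete As Range
    Dim targetArray As Variant, searchArray As Variant
    Dim dict As Object
    Dim x As Long

    With Sheets("Sheet1")
        Set targetRange = .Range(.Range("I1"), .Range("I" & .Rows.Count).End(xlUp))
    End With
    With Sheets("Do Not Call")
        Set searchRange = .Range(.Range("A1"), .Range("A" & .Rows.Count).End(xlUp))
    End With
    targetArray = To2DArray(targetRange)
    searchArray = To2DArray(searchRange)

    'Populate the dictionary from the search values.
    Set dict = CreateObject("Scripting.Dictionary")
    For x = 1 To UBound(searchArray, 1)
        If Not dict.Exists(searchArray(x, 1)) Then dict.Add searchArray(x, 1), 1
    Next

    'Collect matching cells; nothing is deleted inside the loop,
    'so iterating forwards is safe here.
    For x = 1 To UBound(targetArray, 1)
        If dict.Exists(targetArray(x, 1)) Then
            If toDelete Is Nothing Then
                Set toDelete = targetRange.Cells(x)
            Else
                Set toDelete = Union(toDelete, targetRange.Cells(x))
            End If
        End If
    Next

    'Delete all collected rows in one call.
    If Not toDelete Is Nothing Then toDelete.EntireRow.Delete
End Sub
```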

Edit: Because it bothered me, I reread the code that you provided. It confused me because it isn't written the way I'd have expected, and it fails unless you're checking string values only. I've added comments to this snippet to indicate what it's doing:

'Loops until the current cell is empty.
Do While Not IsEmpty(wsA.Range(keyColA & intRowCounterA).Value)
    'Stores the cell in a Range variable for no good reason.
    Set rngA = wsA.Range(keyColA & intRowCounterA)
    'Converts the value of the cell to a string because strValueA is a string.
    strValueA = rngA.Value
    'Checks to see if the string is in the dictionary.
    If Not dict.Exists(strValueA) Then
        'Adds the string to the dictionary.
        dict.Add strValueA, 1
    End If

Then later:

'Checks the raw cell value, not the value converted to a string.
If dict.Exists(rngB.Value) Then

This fails because the Scripting Dictionary does not consider a double to equal a string, even if they would be the same if the double were converted to a string.

There are two ways to fix the code you posted. Either change the line I just showed to this:

If dict.Exists(CStr(rngB.Value)) Then

Or you can change Dim strValueA As String to Dim strValueA, which makes it a Variant so that each key keeps the cell's original type.
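
The type mismatch described above can be reproduced in isolation. This is a minimal sketch: DictionaryKeyTypeDemo and the sample value 123 are made up for illustration, with the String key standing in for what the first loop stores and the Double standing in for what a numeric cell returns from .Value.

```vba
Sub DictionaryKeyTypeDemo()
    Dim dict As Object
    Dim cellValue As Variant

    Set dict = CreateObject("Scripting.Dictionary")
    dict.Add "123", 1        'key stored as a String
    cellValue = 123#         'a Double, like the value of a numeric cell

    Debug.Print dict.Exists(cellValue)        'False: Double key vs. String key
    Debug.Print dict.Exists(CStr(cellValue))  'True: both sides are Strings
End Sub
```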

Upvotes: 4
