Shaun Ng
Shaun Ng

Reputation: 11

VBA for deleting rows of first duplicates among many duplicates

I am looking to write a VBA with the following steps:

  1. Copy data from SheetA into SheetB
  2. Deleting rows in SheetB using a reference column in SheetC
  3. Sorting SheetB
  4. Deleting rows of only first duplicates among many duplicates

I have succeeded from steps 1-3. However when I run the code for Step4, one of the values with duplicate was not deleted. This value populated a cell somewhere in the middle of the data.

However, when I run the same formula referencing only the value having errors as mentioned above, the code works fine.

Appreciate any assistance :)

Below is the VBA that I used:

Sub ProcessDataAndRemoveDuplicates()
    Dim targetSheet As Worksheet
    Dim dataSheet As Worksheet
    Dim masterList As Range
    Dim lastRow As Long
    Dim i As Long
    Dim currentVal As Variant
    Dim dictDuplicates As Object
    Set dictDuplicates = CreateObject("Scripting.Dictionary")
    
    ' Set the reference to the target sheet
    Set targetSheet = ThisWorkbook.Sheets("Total Renom") ' Change to the actual name of the target sheet
    
    Application.ScreenUpdating = False
    
    ' Set the reference to the data sheet
    Set dataSheet = ThisWorkbook.Sheets("Active Sheet")
    Set masterList = ThisWorkbook.Sheets("Free Renom").Range("C:C") ' Change to the actual range in Free Renom
    
    ' Copy and paste values from "Total Renom" to "Active Sheet"
    targetSheet.UsedRange.Copy
    dataSheet.Range("A1").PasteSpecial Paste:=xlPasteValues
    
    ' Delete rows where corresponding value in column P appears in the list
    lastRow = dataSheet.Cells(dataSheet.Rows.Count, "P").End(xlUp).Row
    For i = lastRow To 2 Step -1
        valueToCheck = dataSheet.Cells(i, "P").Value ' Assuming column P is 16
        
        If valueToCheck <> "" And Application.WorksheetFunction.CountIf(masterList, valueToCheck) > 0 Then
            dataSheet.Rows(i).Delete
        End If
    Next i
    
    ' Sort the ActiveSheet by column P and then by column N
    With dataSheet.Sort
        .SortFields.Clear
        .SortFields.Add Key:=Range("P:P"), Order:=xlAscending
        .SortFields.Add Key:=Range("N:N"), Order:=xlAscending
        .SetRange Range("A:R") ' Change the range to match your data
        .Header = xlYes
        .Apply
    End With
    
    Application.ScreenUpdating = True ' Ensure screen updating is turned on to see the debug output
    Debug.Print "Debug Output - Start of Duplicate Removal Loop"
    
    ' Delete the first instance of duplicates based on column P
    lastRow = dataSheet.Cells(dataSheet.Rows.Count, "P").End(xlUp).Row

    For i = 2 To lastRow
    currentVal = dataSheet.Cells(i, "P").Value
    
        If currentVal <> "" Then
         If Not dictDuplicates.Exists(currentVal) Then
            dictDuplicates.Add currentVal, i
         Else
            dataSheet.Rows(dictDuplicates(currentVal)).Delete
            dictDuplicates(currentVal) = i
            End If
        End If
    Next i
    
    Application.CutCopyMode = False ' Clear the clipboard
    Application.ScreenUpdating = True
    
    ' Return to the top of the sheet
    dataSheet.Activate
    dataSheet.Cells(1, 1).Select
    
    MsgBox "Done! Data processed, duplicates removed, and returned to the top.", vbInformation
End Sub

Upvotes: 1

Views: 44

Answers (1)

Black cat
Black cat

Reputation: 6192

This checks the previous row content and if different and duplicated then delete first row of identical values.

For i = LastRow To 2 Step -1
    currentval = datasheet.Cells(i, "P").value
    
        If currentval <> "" Then
         If Not dictduplicates.Exists(currentval) Then
            dictduplicates.Add currentval, i
         ElseIf datasheet.Cells(i - 1, "P").value <> currentval Then
         datasheet.Rows(i).Delete
            'dictduplicates(currentval) = i
         End If
        End If
    Next i

Upvotes: 1

Related Questions