Aku
Aku

Reputation: 3

How to delete duplicates, save latest entry and copy data to remaining one?

My goal is to remove duplicates, save the latest entry and copy specific column data to remaining row. So I have columns from A to E. If columns from A to C are duplicate, older row will be deleted and column E information will be copied to remaining row.

Here is an example:

cell1 cell2 cell3 cell4 cell5
 1     2     2     4     tax
 1     3     3     1     extra
 1     2     2     5     
 1     2     3     1     tax 

After macro

cell1 cell2 cell3 cell4 cell5
 1     3     3     1     extra
 1     2     2     5     tax
 1     2     3     1     tax

So now, row 1 has been deleted and its 'cell5' information has been copied to row 3.

Here is the code (Deleting duplicates and saving the latest entry works, I just don't know how to save cell5 information and paste it to remaining row). !!EDITED!! (Code changed according to this post's answer by @RonRosenfeld)

Option Explicit

Sub remDup()
    Dim rRes As Range
    Dim dict As Dictionary
    Dim sKey As String
    Dim vSrc As Variant, vRes As Variant, vRow As Variant
    Dim I As Long, J As Long, v As Variant

'Also changed here rgTable into rRes
Set rRes = ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).CurrentRegion

vSrc = rRes
ReDim vRow(1 To UBound(vSrc, 2))

Set dict = New Dictionary
For I = 2 To UBound(vSrc)
    sKey = vSrc(I, 1) & "|" & vSrc(I, 2) & "|" & vSrc(I, 3)
    For J = 1 To UBound(vSrc, 2)
        vRow(J) = vSrc(I, J)
    Next J

    If Not dict.Exists(sKey) Then
    dict.Add Key:=sKey, Item:=vRow
    Else 

        vRow(UBound(vRow)) = dict(sKey)(UBound(dict(sKey)))
        dict.Remove (sKey)
        dict.Add Key:=sKey, Item:=vRow
    End If
Next I

ReDim vRes(0 To dict.Count, 1 To UBound(vSrc, 2))

For J = 1 To UBound(vSrc, 2)
    vRes(0, J) = vSrc(1, J)
Next J

I = 0
For Each v In dict.Keys
    I = I + 1
    For J = 1 To UBound(vSrc, 2)
        vRes(I, J) = dict(v)(J)
    Next J
Next v

'here I try to get the new table into sheet2
Dim rRes As Range
Set rRes = rRes.Worksheet("Sheet2").Resize(UBound(vRes, 1) + 1, UBound(vRes, 2))

With rRes
    .EntireColumn.Clear
    .Value = vRes
    .Style = "Output"
    .EntireColumn.AutoFit
End With

End Sub

Upvotes: 0

Views: 129

Answers (1)

Ron Rosenfeld
Ron Rosenfeld

Reputation: 60174

  • suggest using a dictionary object, and storing the row contents as an array.
  • then you can
    • write preceding duplicates cell5 content into that array
    • delete the previous key and add it back to re-order it
  • The algorithm assumes that the last column in the duplicates will be blank, or should always be overwritten by the first of the duplicates. If that is not the case, you will need to specify how that should be handled.

Also, I used a different way of defining rgTable and also worked in VBA arrays as it is much faster than read/write from the worksheet.

Finally, I set the results range to be an offset of rgTable, but if you have this working, and prefer, you can overwrite the original, or put it on a different worksheet.

But this should get you started.

'Set reference to microsoft scripting runtime
'   or use late binding
Option Explicit

Sub remDup()
    Dim rgTable As Range, rw As Range
    Dim dict As Dictionary
    Dim sKey As String
    Dim vSrc As Variant, vRes As Variant, vRow As Variant
    Dim I As Long, J As Long, v As Variant

'should use fully qualified references to avoid confusion
'For example:
Set rgTable = ThisWorkbook.Worksheets("sheet2").Cells(1, 1).CurrentRegion

'in VBA, much faster to work with arrays
vSrc = rgTable
ReDim vRow(1 To UBound(vSrc, 2))

Set dict = New Dictionary
For I = 2 To UBound(vSrc) 'skip header row
    sKey = vSrc(I, 1) & "|" & vSrc(I, 2) & "|" & vSrc(I, 3)
    For J = 1 To UBound(vSrc, 2)
        vRow(J) = vSrc(I, J)
    Next J

    If Not dict.Exists(sKey) Then
        dict.Add Key:=sKey, Item:=vRow
    Else 'add previous col5 to vrow
         'remove old key and re-enter to re-order

         'To overwrite a different column instead of the last column, replace
         '(in the line below) Ubound(vRwo)  and  Ubound(dict(sKey)) with the 
         'column number
        vRow(UBound(vRow)) = dict(sKey)(UBound(dict(sKey)))
        dict.Remove (sKey)
        dict.Add Key:=sKey, Item:=vRow
    End If
Next I

'create results array
ReDim vRes(0 To dict.Count, 1 To UBound(vSrc, 2))

'headers
For J = 1 To UBound(vSrc, 2)
    vRes(0, J) = vSrc(1, J)
Next J

'Data
I = 0
For Each v In dict.Keys
    I = I + 1
    For J = 1 To UBound(vSrc, 2)
        vRes(I, J) = dict(v)(J)
    Next J
Next v

'write the results
'once it works ok, can overwrite original

Dim rRes As Range
Set rRes = rgTable.Offset(columnoffset:=2 + rgTable.Columns.Count).Resize(UBound(vRes, 1) + 1, UBound(vRes, 2))

With rRes
    .EntireColumn.Clear
    .Value = vRes
    .Style = "Output"
    .EntireColumn.AutoFit
End With

End Sub

Source Data

enter image description here

Results

enter image description here

Upvotes: 1

Related Questions