Reputation: 65
I'm trying to come up with code that look in Column D for any duplicate text then deletes the entire row that the first duplicate is located in. There are blanks in between the rows so using the code .End(xl)Up
doesn't work unless you're able to target the entire column regardless of the blanks in between
the rows somehow.
I've tried two methods so far but neither have worked to my expectation.
This was my first method which doesn't work since the worksheet has an outline:
Sub test()
ActiveSheet.Range("D:D").RemoveDuplicates Columns:=1, header:=xlNo
End Sub
This was my second method that I got from another site that runs for minutes but doesn't appear to do what I'm trying to achieve.
Sub Row_Dupe_Killer_Keep_Last()
Dim lrow As Long
For lrow = Cells(Rows.Count, "D").End(xlUp).Row To 2 Step -1
If Cells(lrow, "D") = Cells(lrow, "D").Offset(-1, 0) Then
Cells(lrow, "D").Offset(-1, 0).EntireRow.Delete
End If
Next lrow
End Sub
Does anyone have any suggestions or tips? I've been working at it for a few days now with my limited skills and haven't been able to figure a way to do it...Thank you in advance for your time.
Upvotes: 1
Views: 2102
Reputation: 367
This method avoids the use of EntireRow.Delete
, which is notoriously slow. The contents are cleared and the dataset is sorted to remove gaps.
EDIT: switched to For Next to enable searching upwards from the bottom; also cleaned up sort routine generated by the macro recorder...I never seem to have that routine on-hand when I need it :).
Note: this will also not work with an outline...whatever you did to make it work for the other answer will need to be done for this one as well.
I am curious if the Clear/Sort approach works for you and if it speeds up your routine.
Option Explicit
Sub RemoveFirstDuplicate()
Dim myDataRange As Range, iCounter As Long, myDuplicate As Range, lastRow As Long
lastRow = Range("D1000000").End(xlUp).Row
Set myDataRange = Sheets("Sheet1").Range("D1:D" & lastRow)
'searching up to the second row (below the field name assumed to be in row 1)...you may need to adjust where the loop stops
For iCounter = myDataRange.Cells.Count To 2 Step -1
With myDataRange
If WorksheetFunction.CountIf(myDataRange, myDataRange.Item(iCounter)) > 1 Then
Set myDuplicate = .Find(What:=myDataRange.Item(iCounter), After:=myDataRange.Item(iCounter), SearchDirection:=xlPrevious)
Range("D" & myDuplicate.Row).ClearContents
End If
End With
Next iCounter
With ActiveWorkbook.Worksheets("Sheet1").Sort
.SortFields.Clear
.SortFields.Add Key:=myDataRange.Offset(1, 0)
.SetRange myDataRange
.Header = xlYes
.Apply
End With
End Sub
Upvotes: 1
Reputation: 176
Edit: Now ignores blanks
Edit: Modified to have the ability to change the starting row
What you might want to do is pull the data into an array and search the array for duplicates. Excel can process arrays much faster than it can go through each cell.
The following code does just that. It will leave D1 alone (such as in your example code) and will remove the entire row of any duplicates, only leaving the last item.
To deal with deleting the rows, we add all of the duplicates into a range object named rngDelete and delete all the rows at once. This will make it run much faster than deleting one by one.
Sub Row_Dupe_Killer_Keep_Last()
Dim vData As Variant
Dim rngDelete As Range
Dim lrow As Long, lrowSearch As Long
Dim lStartRow as long
'Change this to the row you wish to start with (the top row)
lStartRow = 22
'Get all of the data from the cells into a variant array
'Normally I would prefer to use usedrange, but this method is fine
'(Note: Change the 2 to 1 if you want to include the entire column including Row number 1)
vData = Range(Cells(lStartRow, "D").Address & ":" & Cells(Rows.Count, "D").End(xlUp).Address)
'Search for duplicates
'First, loop through backwards one by one
For lrow = UBound(vData) To LBound(vData) Step -1
'now loop through forwards (up to the point where we have already looked)
For lrowSearch = LBound(vData) To lrow
'Check if we have a duplicate
If Not IsError(vData(lrow, 1)) And Not IsError(vData(lrowSearch, 1)) Then
If lrow <> lrowSearch And vData(lrow, 1) = vData(lrowSearch, 1) And vData(lrow, 1) <> "" Then
'We have a duplicate! Let's add it to our "list to delete"
If rngDelete Is Nothing Then
'if rngDelete isn't set yet...
Set rngDelete = Range("D" & lrowSearch + lStartRow-1)
Else
'if we are adding to rngDelete...
Set rngDelete = Union(rngDelete, Range("D" & lrowSearch + lStartRow-1))
End If
End If
End If
Next lrowSearch
Next lrow
'Delete all of the duplicate rows
If Not rngDelete Is Nothing Then
rngDelete.EntireRow.Delete
End If
End Sub
Upvotes: 2
Reputation: 771
This should help you out.
Option Explicit
Const c_intMaxBlanks As Integer = 5
Const c_AbsoluteMaxRowsInSheet As Integer = 5000
Public Sub RunIt()
Row_Dupe_Killer_Keep_Last ActiveSheet.Range("D:D")
End Sub
Public Sub Row_Dupe_Killer_Keep_Last(rngCells As Range)
Dim iRow As Integer, iCol As Integer
Dim intBlankCnt As Integer
Dim intMaxBlanks As Integer
Dim blnIsDone As Boolean
Dim intSaveStartRow As Integer
Dim blnStartCnt As Boolean
Dim strTemp As String
Dim strCheck As String
Dim intI As Integer
Dim intJ As Integer
Dim intSaveEndRow As Integer
'First, Count the consecutive blanks
blnIsDone = False
blnStartCnt = False
intSaveStartRow = 0
intSaveEndRow = 0
intBlankCnt = 0
iRow = 1
iCol = rngCells.Column
Do While (Not blnIsDone)
'Check for blank Row using length of string
If (Len(Trim(rngCells.Cells(iRow, 1).Value)) < 1) Then
If Not blnStartCnt Then
intSaveStartRow = iRow
blnStartCnt = True
Else
If (intSaveStartRow + intBlankCnt) <> iRow Then
'restart
intSaveStartRow = iRow
intBlankCnt = 0
End If
End If
intBlankCnt = intBlankCnt + 1
Else
'restart
blnStartCnt = False
intBlankCnt = 0
End If
intSaveEndRow = iRow
If intBlankCnt >= c_intMaxBlanks Then blnIsDone = True
'Stop Loop: Maybe Infinite"
If iRow > c_AbsoluteMaxRowsInSheet Then Exit Do
iRow = iRow + 1
Loop
'Now, loop through each row in the column and check values.
For intI = intSaveEndRow To 2 Step -1
strTemp = LCase(Trim(rngCells.Cells(intI, 1).Value))
For intJ = intSaveEndRow To 2 Step -1
If intJ <> intI Then
strCheck = LCase(Trim(rngCells.Cells(intJ, 1).Value))
If strTemp = strCheck Then
'Found a dup, delete it
rngCells.Cells(intJ, 1).EntireRow.Delete
'ElseIf Len(strCheck) < 1 Then
' 'Delete the blank line
' rngCells.Cells(intJ, 1).EntireRow.Delete
End If
End If
Next intJ
Next intI
End Sub
Upvotes: 1