Reputation: 4559
I have a list containing three columns. The first column contains Names and the other two columns have numbers. The macro takes the first name(A1) and then searches down column A for another occurrence.
When it finds it, it deletes the entire row.It then goes to A2 and does the same thing agan. It works ok for about 500 entries, but using 3000 entries slows it down considerably. Is there a way to speed up this code?
Sub Button1_DeleteRow()
Dim i As Integer
Dim j As Integer
Dim Value As Variant
Dim toCompare As Variant
For i = 1 To 3000
Value = Cells(i, 1)
For j = (i + 1) To 3000
toCompare = Cells(j, 1)
If (StrComp(Value, toCompare, vbTextCompare) = 0) Then
Rows(j).EntireRow.Delete
End If
Next j
Next i
End Sub
Upvotes: 2
Views: 686
Reputation: 10679
Sorting the data on column A would then make it trivial to identify and remove the duplicates in a single pass
In response to the comment below, I'll explain why sorting is a useful technique.
By sorting column A into order, duplicate removal simply becomes a matter of comparing adjacent entries in column A. You can then either delete the duplicate rows as you find them or flag them for later deletion.
The process should actually be a lot less tedious as you only have to sort the list (and sorting, being built-in, tends to be very fast) and then do one pass (instead of 4498500) through the list deleting/flagging as you go (obviously you need a subsequent clean-up pass if you go for flagging).
On the issue of changing the order of the list, start by adding an extra column (e.g. column D) and have D2 contain the value 2 (i.e. just the row number). A quick fill-down later and every row is numbered. After sorting and deleting/flagging, restoring the original order is just a matter of re-sorting on column D which could then be deleted.
I use this method when I have to perform some operation or other on the duplicates. In other words, column A has duplicate values but the values in columns B and C are meaningful (for example, I might want to sum these values from all of the entries relating to the specific value of column A). In many cases, however, it would be easier just to use SQL to achieve the same result
Upvotes: 0
Reputation: 23520
Using Bretts technique is a good answer: but to answer your question about why does it take so long:
- Your macro is getting a value from over 4 million cells one by one. This is very slow.
- I don't see that your macro has switched off screenupdating and automatic calculation: every time a row is deleted the screen will refresh and Excel will recalculate. If you have not switched these off it is very slow.
This code should run a lot faster
Option Explicit
Sub Button1_DeleteRow()
Dim i As Long
Dim j As Long
Dim vArr As Variant
Dim iComp As Long
Dim Deletes(1 To 3000) As Boolean
Application.ScreenUpdating = False
iComp = Application.Calculation
Application.Calculation = xlCalculationManual
vArr = Range("a1:A3000")
For i = 1 To 3000
For j = (i + 1) To 3000
If (StrComp(vArr(i, 1), vArr(j, 1), vbTextCompare) = 0) Then
Deletes(j) = True
End If
Next j
Next i
For j = 3000 To 1 Step -1
If Deletes(j) Then Rows(j).EntireRow.Delete
Next j
Application.ScreenUpdating = True
Application.Calculation = iComp
End Sub
Upvotes: 2
Reputation: 38520
To supplement @brettdj's answer, if you are running Excel 2003, you can do this using AdvancedFilter
as follows:
Range("A1:A11").AdvancedFilter Action:=xlFilterInPlace, Unique:=True
Note: AdvancedFilter
assumes that the first row of your range (row A in this example) contains column headers and will not include that row in the filtering.
To do this manually: Data > Filter > Advanced Filter... > Unique records only
Upvotes: 3
Reputation: 55682
If you are running xl07/10 then you can do this with a single line with Remove Duplicates
. If you are running 03 then a solution with AutoFilter will be most efficient (I can provide this if you are on the older version)
Remove Duplicates
Manually
Code
ActiveSheet.Range("$A$1:$A$3000").EntireRow.RemoveDuplicates Columns:=1, Header:=xlNo
Upvotes: 4