Lio Djo
Lio Djo

Reputation: 129

Faster Or Best Alternative for VBA to delete rows not containing specific values?

Quick question to save everybody's time:

I have the code below that works fine but is too slow for my 30,000+ lines.

It basically deletes all the rows not containing the states TX, AR, LA and OK from column AD.

Sub DeleteStateExceptions()
    Dim iLastRow As Long
    Dim i As Long
    iLastRow = Cells(Rows.Count, "AD").End(xlUp).Row
    For i = iLastRow To 2 Step -1
        Select Case Cells(i, "AD").Value
            Case "TX"
            Case "OK"
            Case "AR"
            Case "LA"
            Case Else
                Rows(i).Delete
            End Select
    Next i
    'deletes row when cell in column AD is not TX, OK, AR or LA
End Sub

Any amendment to make it faster? Would you use a different logic?

Upvotes: 2

Views: 1159

Answers (3)

spioter
spioter

Reputation: 1870

I recommend keeping "sheet" specific and "use case specific" logic directly in cell formulas - then you can create more modular functions that can be reused.

In this scenario, if you add another column called "DeleteRow?",then populate it with a formula that returns "#DELETEROW#" when you want to delete the row else any other value* then you could have a reusable sub called "deleteRow" that takes listobject as an input and then sorts the data by column named "DeleteRow?", then filters on value "#DELETEROW#" and then deletes all filtered rows

Going forward, this approach let's you adjust the formula as needed to change which rows to delete without having to alter the vba.

* not tested but i bey if "DeleteRow?" formula returns row() when you want to keep the row, then the current sort will be preserved

Upvotes: 0

FaneDuru
FaneDuru

Reputation: 42236

Please, try the next updated code. It should be very fast:

Sub DeleteStateExceptions()
    Dim iLastRow As Long, arrMark, lastEmptyCol As Long, i As Long, boolDel As Boolean
    iLastRow = cells(rows.count, "AD").End(xlUp).Row
    lastEmptyCol = ActiveSheet.UsedRange.Column + ActiveSheet.UsedRange.Columns.count + 1
    ReDim arrMark(1 To iLastRow - 1, 1 To 1)
    For i = 2 To iLastRow
        Select Case cells(i, "AD").value
            Case "TX", "OK", "AR", "LA"
            Case Else
                boolDel = True  'to delete only if at least a row has been marked
                arrMark(i - 1, 1) = "Del"
            End Select
    Next i
    If boolDel Then
        With cells(2, lastEmptyCol).Resize(UBound(arrMark), 1)
            .value = arrMark
            .SpecialCells(xlCellTypeConstants).EntireRow.Delete
        End With
    End If
End Sub

An alternative would be to create a Union range, but in case of large ranges, creating of this one slows down the speed seriously. You can set a maximum cells limit (iterate backwards), let us say, 100, delete the rows already in the Union range and set it as Nothing.

But the above solution should be the fastest, in my opinion...

Edited:

I promised to come back and supply a solution overpassing the limitation of a specific number of arrays in a discontinuous range. I knew only about the 8192 for versions up to 2007 inclusive. It looks, such a limitation also exists in the newer versions, even if bigger. In order to test the above (much improved) way against the Union range version, I imagined the next testing way:

  1. Place a constant declaration on top of the module keeping the testing code (in the declarations area):
 Private Const arrRepeat As Long = 5000
  1. Copy the next code of a Sub building a similar environment to test the versions in a similar way, plus the sorting one:
3. Copy the improved above version, being extremely fast:
Sub DeleteStateExceptions()
    Dim iLastRow As Long, arrMark, lastEmptyCol As Long, i As Long, boolDel As Boolean
    Dim tm, arrSort
    
    buildTestingRange arrRepeat
    
    tm = Timer
    iLastRow = cells(rows.count, "AD").End(xlUp).Row
    arrSort = Evaluate("ROW(1:" & iLastRow - 1 & ")") 'create an array of necessary existing rows number
    lastEmptyCol = ActiveSheet.UsedRange.Column + ActiveSheet.UsedRange.Columns.count + 1
    cells(1, lastEmptyCol + 1).value = "InitSort"     'place a header to the initial sort column
    cells(2, lastEmptyCol + 1).Resize(UBound(arrSort), 1).value = arrSort 'drop the array content in the column
    
    ReDim arrMark(1 To iLastRow - 1, 1 To 1)
    For i = 2 To iLastRow
        Select Case cells(i, "AD").value
            Case "TX", "OK", "AR", "LA"
            Case Else
                boolDel = True
                arrMark(i - 1, 1) = "Del"
            End Select
    Next i
    If boolDel Then
        With cells(2, lastEmptyCol).Resize(UBound(arrMark), 1)
            Application.ScreenUpdating = False: Application.Calculation = xlCalculationManual 'some optimization...
            .value = arrMark            'drop the arrMark content
            'sort the area where the above array content has been dropped:
             SortByColumn Range("A1", cells(iLastRow, lastEmptyCol + 1)), Range(cells(1, lastEmptyCol), cells(iLastRow, lastEmptyCol))
             .SpecialCells(xlCellTypeConstants).EntireRow.Delete  'delete the rows containing "Del"
             'sort according to the original sheet initial sorting:
             SortByColumn Range("A1", cells(iLastRow, lastEmptyCol + 1)), Range(cells(1, lastEmptyCol + 1), cells(iLastRow, lastEmptyCol + 1)), True
             Range(cells(1, lastEmptyCol), cells(iLastRow, lastEmptyCol + 1)).Clear  'clear the helping column (the original sorting of the sheet)
            Application.ScreenUpdating = True: Application.Calculation = xlCalculationAutomatic
        End With
    End If
    Debug.Print "Markers: Delete rows in " & Round(Timer - tm, 2) & " sec (" & arrRepeat & ")"
End Sub

Sub SortByColumn(rng As Range, rngS As Range, Optional boolAscending As Boolean = False)
    rngS.cells(1).value = "LastColumn"
    ActiveSheet.Sort.SortFields.Clear
    ActiveSheet.Sort.SortFields.Add2 key:=rngS, SortOn:=xlSortOnValues, Order:=IIf(boolAscending, xlAscending, xlDescending), DataOption:= _
        xlSortNormal
    With ActiveSheet.Sort
        .SetRange rng
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
End Sub

Then copy the Union range version:

Sub DeleteStateExceptionsUnion()
    Dim iLastRow As Long, rngDel As Range, i As Long
    Dim tm
    
    buildTestingRange arrRepeat
    
    tm = Timer
    iLastRow = cells(rows.count, "AD").End(xlUp).Row
    ReDim arrMark(1 To iLastRow - 1, 1 To 1)
    For i = 2 To iLastRow
        Select Case cells(i, "AD").value
            Case "TX", "OK", "AR", "LA"
            Case Else
                If rngDel Is Nothing Then
                    Set rngDel = cells(i, "AD")
                Else
                    Set rngDel = Union(rngDel, cells(i, "AD"))
                End If
            End Select
    Next i
    Application.ScreenUpdating = False: Application.Calculation = xlCalculationManual
     If Not rngDel Is Nothing Then rngDel.EntireRow.Delete
    Application.ScreenUpdating = True: Application.Calculation = xlCalculationAutomatic
    
    Debug.Print "Union: Delete rows in " & Round(Timer - tm, 2) & " sec (" & arrRepeat & ")"
End Sub

And finally, the version using Union in batches, to avoid the code slowing down when such a range needs to be very large:

Sub DeleteStateExceptionsUnionBatch()
    Dim iLastRow As Long, rngDel As Range, i As Long
    Dim tm, batch As Long, count As Long
    
    buildTestingRange arrRepeat
    
    tm = Timer
    batch = 700
    iLastRow = cells(rows.count, "AD").End(xlUp).Row
    ReDim arrMark(1 To iLastRow - 1, 1 To 1)
    For i = iLastRow To 2 Step -1              'iterate backwards
        Select Case cells(i, "AD").value
            Case "TX", "OK", "AR", "LA"
            Case Else
                count = count + 1
                If rngDel Is Nothing Then
                    Set rngDel = cells(i, "AD")
                Else
                    Set rngDel = Union(rngDel, cells(i, "AD"))
                End If
                If count >= batch Then
                    Application.ScreenUpdating = False: Application.Calculation = xlCalculationManual
                     rngDel.EntireRow.Delete: Set rngDel = Nothing: count = 0
                    Application.ScreenUpdating = True: Application.Calculation = xlCalculationAutomatic
                End If
            End Select
    Next i
    Application.ScreenUpdating = False: Application.Calculation = xlCalculationManual
     If Not rngDel Is Nothing Then rngDel.EntireRow.Delete
    Application.ScreenUpdating = True: Application.Calculation = xlCalculationAutomatic
    Debug.Print "Union batch: Delete rows in " & Round(Timer - tm, 2) & " sec (" & arrRepeat & ") batch: " & batch
End Sub
  1. Now run each of the three versions for the same arrRepeat value. You fistly need to activate an empty sheet...

I obtained (in Immediate Window) the next running times:

Built testing range (5000 rows)
Markers: Delete rows in 0.33 sec (5000)
Built testing range (5000 rows)
Union: Delete rows in 24 sec (5000)
Built testing range (5000 rows)
Union batch: Delete rows in 18.79 sec (5000) batch: 600
Built testing range (5000 rows)
Union batch: Delete rows in 18.97 sec (5000) batch: 500
-------------------------------------------------------
Built testing range (10000 rows)
Markers: Delete rows in 0.43 sec (10000)
Built testing range (10000 rows)
Union: Delete rows in 51.23 sec (10000)
Built testing range (10000 rows)
Union batch: Delete rows in 14.57 sec (10000) batch: 500
--------------------------------------------------------
Built testing range (50000 rows)
Markers: Delete rows in 1.34 sec (50000)
Built testing range (50000 rows)
Union batch: Delete rows in 129.36 sec (50000) batch: 500
Built testing range (50000 rows)
Union batch: Delete rows in 125.47 sec (50000) batch: 600
Built testing range (50000 rows)

I tried Union range version but I had to close Excel after about 15 minutes...

Upvotes: 5

VBasic2008
VBasic2008

Reputation: 54807

Delete Not-Criteria Rows

  • When the criteria column is not sorted, it may take 'forever' to delete hundreds or even tens of thousands of rows.
  • The following will insert and populate two columns, an integer sequence column, and the match column.
  • After the data is sorted by the match column, the now contiguous range of error values will be used to quickly delete the undesired rows.
  • The integer sequence column will be used to finally sort the data to regain the initial order.
Sub DeleteNotCriteriaRowsTEST()
    
    Const CriteriaList As String = "TX,OK,AR,LA"
    Const FirstCellAddress As String = "AD2"
    
    Dim ws As Worksheet: Set ws = ActiveSheet ' improve!
    Dim FirstCell As Range: Set FirstCell = ws.Range(FirstCellAddress)
    
    DeleteNotCriteriaRows FirstCell, CriteriaList

End Sub

Sub DeleteNotCriteriaRows( _
        ByVal FirstCell As Range, _
        ByVal CriteriaList As String, _
        Optional ByVal CriteriaDelimiter As String = ",")
    Const ProcName As String = "DeleteNotCriteriaRows"
    Dim NothingToDelete As Boolean
    On Error GoTo ClearError
    
    Dim Criteria() As String: Criteria = Split(CriteriaList, CriteriaDelimiter)
    
    Dim ws As Worksheet
    Dim rgColumn As Range
    Dim rCount As Long
    
    With FirstCell.Cells(1)
        Set ws = .Worksheet
        Dim lCell As Range: Set lCell = .Resize(ws.Rows.Count _
            - .Row + 1).Find("*", , xlFormulas, , , xlPrevious)
        rCount = lCell.Row - .Row + 1
        Set rgColumn = .Resize(rCount)
    End With
    
    Dim rgTotal As Range
    Set rgTotal = Intersect(ws.UsedRange, rgColumn.EntireRow)
    
    Application.ScreenUpdating = False
    
    Dim rgInsert As Range
    Set rgInsert = rgColumn.Cells(1).Offset(, 1).Resize(, 2).EntireColumn
    rgInsert.Insert xlShiftToRight, xlFormatFromLeftOrAbove
    
    Dim rgIntegerSequence As Range: Set rgIntegerSequence = rgColumn.Offset(, 1)
    With rgIntegerSequence
        .NumberFormat = "0"
        .Formula = "=ROW()"
        .Value = .Value
    End With
    
    Dim rgMatch As Range: Set rgMatch = rgColumn.Offset(, 2)
    With rgMatch
        .NumberFormat = "General"
        .Value = Application.Match(rgColumn, Criteria, 0)
    End With
        
    rgTotal.Sort rgMatch, xlAscending, , , , , , xlNo
    
    Dim rgDelete As Range
    
    On Error Resume Next
        Set rgDelete = Intersect(ws.UsedRange, _
            rgMatch.SpecialCells(xlCellTypeConstants, xlErrors).EntireRow)
    On Error GoTo ClearError
        
    If rgDelete Is Nothing Then
        NothingToDelete = True
    Else
        rgDelete.Delete xlShiftUp
    End If
        
    rgTotal.Sort rgIntegerSequence, xlAscending, , , , , , xlNo
        
    rgInsert.Offset(, -2).Delete xlShiftToLeft

SafeExit:
    Application.ScreenUpdating = True
    
    If NothingToDelete Then
        MsgBox "Nothing deleted.", vbExclamation, ProcName
    Else
        MsgBox "Rows deleted.", vbInformation, ProcName
    End If

    Exit Sub
ClearError:
    Debug.Print "'" & ProcName & "' Run-time error '" _
        & Err.Number & "':" & vbLf & "    " & Err.Description
    NothingToDelete = True
    Resume SafeExit
End Sub

Upvotes: 1

Related Questions