SIO
SIO

Reputation: 404

Subtracting ranges in VBA (Excel)

What I'm trying to do

I'm trying to write a function to subtract Excel ranges. It should take two input parameters: range A and range B. It should return a range object consisting of cells that are part of range A and are not part of range B (as in set subtraction)

What I've tried

I've seen some examples on the web that use a temporary worksheet to do this (fast, but might introduce some issues with protected workbooks and such) and some other examples that go cell by cell through the first range checking for intersections with the second one (extremely slow).

After some thinking I've come up with this code {1}, which works faster, but still is slow. Subtracting from a range representing the whole worksheet takes from 1 to 5 minutes depending on how complex the second range is.

When I looked over that code trying to find ways to make it faster I saw a possibility for applying the divide-and-conquer paradigm, which I did {2}. But that had made my code slower instead. I'm not much of a CS guy, so I might have done something wrong or this algorithm simply is not the one the divide-and-conquer should be used on, I don't know.

I have also tried rewriting it using mostly recursion, but that took forever to finish or (more often) had thrown Out of Stack Space errors. I didn't save the code.

The only (marginally) successful improvement I've been able to do is adding a flip switch {3} and going first through rows, then (in the next call) through columns instead of going through both in the same call, but the effect was not as good as I've hoped. Now I see that even though we don't go through all rows in the first call, in the second call we still loop through the same amount of rows we would in the first one, only these rows are a little bit shorter :)

I would appreciate any help in improving or rewriting this function, thank you!

The solution, based on the accepted answer by Dick Kusleika

Dick Kusleika, thank you very much for providing your answer! I think I'll use it with some modifications I've made:

With these modifications the code runs very fast on the most of common cases. As it's been pointed out, it will still be slow with checkerboard-style huge range which I agree is unavoidable.

I think this code still has a room for improvement and I'll update this post in case I modify it.

Improvement possibilities:

{0} Solution code

BUGGY! See reply from SteveS below

Update 2023-07-07

After posting this code to StackOverflow I started using it and hitting its limitations. Fixed version has become significantly more complex and yet it is not perfect, large merged cells may trip it up. Here is the code I've been using for the past 9 years: https://gist.github.com/sio/7423c651d7367de886bb256c4de5e45e

Original buggy version below:

Public Function SubtractRanges(rFirst As Range, rSecond As Range) As Range
'
' Returns a range of cells that are part of rFirst, but not part of rSecond
' (as in set subtraction)
'
' This function handles big input ranges really well!
'
' The reason for having a separate recursive function is
' handling multi-area rFirst range
'
    Dim rInter As Range
    Dim rReturn As Range
    Dim rArea As Range

    Set rInter = Intersect(rFirst, rSecond)
    Set mrBuild = Nothing

    If rInter Is Nothing Then 'no overlap
        Set rReturn = rFirst
    ElseIf rInter.Address = rFirst.Address Then 'total overlap
        Set rReturn = Nothing
    Else 'partial overlap
        For Each rArea In rFirst.Areas
            Set mrBuild = BuildRange(rArea, rInter) 'recursive
        Next rArea
        Set rReturn = mrBuild
    End If

    Set SubtractRanges = rReturn
End Function


Private Function BuildRange(rArea As Range, rInter As Range, _
Optional mrBuild As Range = Nothing) As Range
'
' Recursive function for SubtractRanges()
'
' Subtracts rInter from rArea and adds the result to mrBuild
'
    Dim rLeft As Range, rRight As Range
    Dim rTop As Range, rBottom As Range
    Dim rInterSub As Range
    Dim GoByColumns As Boolean
    
    Set rInterSub = Intersect(rArea, rInter)
    If rInterSub Is Nothing Then 'no overlap
        If mrBuild Is Nothing Then
            Set mrBuild = rArea
        Else
            Set mrBuild = Union(mrBuild, rArea)
        End If
    ElseIf Not rInterSub.Address = rArea.Address Then 'some overlap
        If Not rArea.Cells.CountLarge = 1 Then 'just in case there is only one cell for some impossible reason
            
            ' Decide whether to go by columns or by rows
            ' (helps when subtracting whole rows/columns)
            If Not rInterSub.Columns.Count = rArea.Columns.Count And _
            ((Not rInterSub.Cells.CountLarge = 1 And _
            (rInterSub.Rows.Count > rInterSub.Columns.Count _
            And rArea.Columns.Count > 1) Or (rInterSub.Rows.Count = 1 _
            And Not rArea.Columns.Count = 1)) Or _
            (rInterSub.Cells.CountLarge = 1 _
            And rArea.Columns.Count > rArea.Rows.Count)) Then
                    GoByColumns = True
            Else
                    GoByColumns = False
            End If
            
            If Not GoByColumns Then
                Set rTop = rArea.Resize(rArea.Rows.Count \ 2) 'split the range top to bottom
                Set rBottom = rArea.Resize(rArea.Rows.Count - rTop.Rows.Count).Offset(rTop.Rows.Count)
                Set mrBuild = BuildRange(rTop, rInterSub, mrBuild) 'rerun it
                Set mrBuild = BuildRange(rBottom, rInterSub, mrBuild)
            Else
                Set rLeft = rArea.Resize(, rArea.Columns.Count \ 2) 'split the range left to right
                Set rRight = rArea.Resize(, rArea.Columns.Count - rLeft.Columns.Count).Offset(, rLeft.Columns.Count)
                Set mrBuild = BuildRange(rLeft, rInterSub, mrBuild) 'rerun it
                Set mrBuild = BuildRange(rRight, rInterSub, mrBuild)
            End If
        End If
    End If
    
    Set BuildRange = mrBuild
End Function

Other code mentioned in the question

{1} Initial code (go row by row, column by column)

Function SubtractRanges(RangeA, RangeB) As Range
'
' Returns a range of cells that are part of RangeA, but not part of RangeB
'
' This function handles big RangeA pretty well (took less than a minute
' on my computer with RangeA = ActiveSheet.Cells)
'
    Dim CommonArea As Range
    Dim Result As Range
   
    Set CommonArea = Intersect(RangeA, RangeB)
    If CommonArea Is Nothing Then
        Set Result = RangeA
    ElseIf CommonArea.Address = RangeA.Address Then
        Set Result = Nothing
    Else
        'a routine to deal with A LOT of cells in RangeA
        'go column by column, then row by row
        Dim GoodCells As Range
        Dim UnworkedCells As Range
       
        For Each Area In RangeA.Areas
            For Each Row In Area.Rows
                Set RowCommonArea = Intersect(Row, CommonArea)
                If Not RowCommonArea Is Nothing Then
                    If Not RowCommonArea.Address = Row.Address Then
                        Set UnworkedCells = AddRanges(UnworkedCells, Row)
                    End If
                Else
                    Set GoodCells = AddRanges(GoodCells, Row)
                End If
            Next Row
           
            For Each Column In Area.Columns
                Set ColumnCommonArea = Intersect(Column, CommonArea)
                If Not ColumnCommonArea Is Nothing Then
                    If Not ColumnCommonArea.Address = Column.Address Then
                        Set UnworkedCells = AddRanges(UnworkedCells, Column)
                    End If
                Else
                    Set GoodCells = AddRanges(GoodCells, Column)
                End If
            Next Column
        Next Area
       
        If Not UnworkedCells Is Nothing Then
            For Each Area In UnworkedCells
                Set GoodCells = AddRanges(GoodCells, SubtractRanges(Area, CommonArea))
            Next Area
        End If
       
        Set Result = GoodCells
    End If
   
    Set SubtractRanges = Result
End Function

{2} Divide and conquer

Function SubtractRanges(RangeA, RangeB) As Range
'
' Returns a range of cells that are part of RangeA, but not part of RangeB
'
    Dim CommonArea As Range
    Dim Result As Range
   
    Set CommonArea = Intersect(RangeA, RangeB)
    If CommonArea Is Nothing Then
        Set Result = RangeA
    ElseIf CommonArea.Address = RangeA.Address Then
        Set Result = Nothing
    Else
        'a routine to deal with A LOT of cells in RangeA
        'go column by column, then row by row
        Dim GoodCells As Range
        Dim UnworkedCells As Range
       
        For Each Area In RangeA.Areas

            RowsNumber = Area.Rows.Count
            If RowsNumber > 1 Then
                Set RowsLeft = Range(Area.Rows(1), Area.Rows(RowsNumber / 2))
                Set RowsRight = Range(Area.Rows(RowsNumber / 2 + 1), Area.Rows(RowsNumber))
            Else
                Set RowsLeft = Area
                Set RowsRight = CommonArea.Cells(1, 1) 'the next best thing to Nothing - will end its cycle rather fast and won't throw an error with For Each statement
            End If
            For Each Row In Array(RowsLeft, RowsRight)
                Set RowCommonArea = Intersect(Row, CommonArea)
                If Not RowCommonArea Is Nothing Then
                    If Not RowCommonArea.Address = Row.Address Then
                        Set UnworkedCells = AddRanges(UnworkedCells, Row)
                    End If
                Else
                    Set GoodCells = AddRanges(GoodCells, Row)
                End If
            Next Row
           
            ColumnsNumber = Area.Columns.Count
            If ColumnsNumber > 1 Then
                Set ColumnsLeft = Range(Area.Columns(1), Area.Columns(ColumnsNumber / 2))
                Set ColumnsRight = Range(Area.Columns(ColumnsNumber / 2 + 1), Area.Columns(ColumnsNumber))
            Else
                Set ColumnsLeft = Area
                Set ColumnsRight = CommonArea.Cells(1, 1)
            End If
            For Each Column In Array(ColumnsLeft, ColumnsRight)
                Set ColumnCommonArea = Intersect(Column, CommonArea)
                If Not ColumnCommonArea Is Nothing Then
                    If Not ColumnCommonArea.Address = Column.Address Then
                        Set UnworkedCells = AddRanges(UnworkedCells, Column)
                    End If
                Else
                    Set GoodCells = AddRanges(GoodCells, Column)
                End If
            Next Column
        Next Area
       
        If Not UnworkedCells Is Nothing Then
            For Each Area In UnworkedCells
                Set GoodCells = AddRanges(GoodCells, SubtractRanges(Area, CommonArea))
            Next Area
        End If
       
        Set Result = GoodCells
    End If
   
    Set SubtractRanges = Result
End Function

{3} Initial code + flip switch (row by row OR column by column in turns)

Function SubtractRanges(RangeA, RangeB, Optional Flip As Boolean = False) As Range
'
' Returns a range of cells that are part of RangeA, but not part of RangeB
'
' This function handles big RangeA pretty well (took less than a minute
' on my computer with RangeA = ActiveSheet.Cells)
'
    Dim CommonArea As Range
    Dim Result As Range
   
    Set CommonArea = Intersect(RangeA, RangeB)
    If CommonArea Is Nothing Then
        Set Result = RangeA
    ElseIf CommonArea.Address = RangeA.Address Then
        Set Result = Nothing
    Else
        'a routine to deal with A LOT of cells in RangeA
        'go column by column, then row by row
        Dim GoodCells As Range
        Dim UnworkedCells As Range
       
        For Each Area In RangeA.Areas
            If Flip Then
                For Each Row In Area.Rows
                    Set RowCommonArea = Intersect(Row, CommonArea)
                    If Not RowCommonArea Is Nothing Then
                        If Not RowCommonArea.Address = Row.Address Then
                            Set UnworkedCells = AddRanges(UnworkedCells, Row)
                        End If
                    Else
                        Set GoodCells = AddRanges(GoodCells, Row)
                    End If
                Next Row
            Else
                For Each Column In Area.Columns
                    Set ColumnCommonArea = Intersect(Column, CommonArea)
                    If Not ColumnCommonArea Is Nothing Then
                        If Not ColumnCommonArea.Address = Column.Address Then
                            Set UnworkedCells = AddRanges(UnworkedCells, Column)
                        End If
                    Else
                        Set GoodCells = AddRanges(GoodCells, Column)
                    End If
                Next Column
            End If
        Next Area
       
        If Not UnworkedCells Is Nothing Then
            For Each Area In UnworkedCells
                Set GoodCells = AddRanges(GoodCells, SubtractRanges(Area, CommonArea, Not Flip))
            Next Area
        End If
       
        Set Result = GoodCells
    End If
   
    Set SubtractRanges = Result
End Function

A little helper function mentioned here and there:

Function AddRanges(RangeA, RangeB)
'
' The same as Union built-in but handles empty ranges fine.
'
    If Not RangeA Is Nothing And Not RangeB Is Nothing Then
        Set AddRanges = Union(RangeA, RangeB)
    ElseIf RangeA Is Nothing And RangeB Is Nothing Then
        Set AddRanges = Nothing
    Else
        If RangeA Is Nothing Then
            Set AddRanges = RangeB
        Else
            Set AddRanges = RangeA
        End If
    End If
End Function

Upvotes: 10

Views: 24847

Answers (4)

Dutch Gemini
Dutch Gemini

Reputation: 79

I've recently written a [rather fast] function in VBA that I named UnionExclusive() which returns the Union between 2 ranges of cells –with multiple areas allowed for each range– with the exclusion of the range of cells they have in common. It practically uses only Application.Union() and Application.Intersect() and does not loop single cells.

[Edit] Note: the code does not [yet] capture situations where the second range intersects multiple times with the first range, as with Application.Intersect(r1, r2).AreasCount > 1 so you better check before calling this function.

Function UnionExclusive(ByRef r1 As Excel.Range, r2 As Excel.Range) As Excel.Range
'
' This function returns the range of cells that is the Union of both ranges with the
' exclusion of the ranges or cells that they have in common.
'
On Error Resume Next
    Dim rngWholeArea      As Excel.Range
    Dim rngIndividualArea As Excel.Range
    Dim rngIntersection   As Excel.Range
    Dim rngIntersectArea  As Excel.Range
    Dim rngUnion          As Excel.Range
    Dim rngSection        As Excel.Range
    Dim rngResultingRange As Excel.Range
    Dim lngWholeTop       As Long
    Dim lngWholeLeft      As Long
    Dim lngWholeBottom    As Long
    Dim lngWholeRight     As Long
    Dim arrIntersection   As Variant
    Dim arrWholeArea      As Variant
'
' Must be on same sheet, return only first range.
'
    If Not r1.Parent Is r2.Parent Then Set UnionExclusive = r1: Exit Function
'
' No overlapping cells, return the union.
'
    If Application.Intersect(r1, r2) Is Nothing Then Set UnionExclusive = Application.Union(r1, r2): Exit Function
'
' Range to subtract must be contiguous. If the second range has multiple areas, loop through all the individual areas.
'
    If (r2.Areas.Count > 1) _
    Then
        Set rngResultingRange = r1
        For Each rngIndividualArea In r2.Areas
            Set rngResultingRange = UnionExclusive(rngResultingRange, rngIndividualArea)
        Next rngIndividualArea
        Set UnionExclusive = rngResultingRange
        Exit Function
    End If
'
' Get the overall size of the Union() since Rows/Columns "Count" is based on the first area only.
'
    Set rngUnion = Application.Union(r1, r2)
    For Each rngIndividualArea In rngUnion.Areas
        If (lngWholeTop = 0) Then lngWholeTop = rngIndividualArea.Row Else lngWholeTop = Application.WorksheetFunction.Min(lngWholeTop, rngIndividualArea.Row)
        If (lngWholeLeft = 0) Then lngWholeLeft = rngIndividualArea.Column Else lngWholeLeft = Application.WorksheetFunction.Min(lngWholeLeft, rngIndividualArea.Column)
        If (lngWholeBottom = 0) Then lngWholeBottom = (rngIndividualArea.Row + rngIndividualArea.Rows.Count - 1) Else lngWholeBottom = Application.WorksheetFunction.Max(lngWholeBottom, (rngIndividualArea.Row + rngIndividualArea.Rows.Count - 1))
        If (lngWholeRight = 0) Then lngWholeRight = (rngIndividualArea.Column + rngIndividualArea.Columns.Count - 1) Else lngWholeRight = Application.WorksheetFunction.Max(lngWholeRight, (rngIndividualArea.Column + rngIndividualArea.Columns.Count - 1))
    Next rngIndividualArea
    arrWholeArea = Array(lngWholeTop, lngWholeLeft, lngWholeBottom, lngWholeRight)
'
' Get the entire area covered by the various areas.
'
    Set rngWholeArea = rngUnion.Parent.Range(rngUnion.Parent.Cells(lngWholeTop, lngWholeLeft), rngUnion.Parent.Cells(lngWholeBottom, lngWholeRight))
'
' Get intersection, this is or are the area(s) to remove.
'
    Set rngIntersection = Application.Intersect(r1, r2)
    For Each rngIntersectArea In rngIntersection.Areas
        arrIntersection = Array(rngIntersectArea.Row, _
                                rngIntersectArea.Column, _
                                rngIntersectArea.Row + rngIntersectArea.Rows.Count - 1, _
                                rngIntersectArea.Column + rngIntersectArea.Columns.Count - 1)
'
' Get the difference. This is the whole area above, left, below and right of the intersection.
' Identify if there is anything above the intersection.
'
        Set rngSection = Nothing
        If (arrWholeArea(0) < arrIntersection(0)) _
        Then Set rngSection = Application.Intersect(rngWholeArea.Parent.Range(rngWholeArea.Parent.Cells(arrWholeArea(0), arrWholeArea(1)), _
                                                                              rngWholeArea.Parent.Cells(arrIntersection(0) - 1, arrWholeArea(3))), _
                                                    rngUnion)
        If Not rngSection Is Nothing _
        Then
            If rngResultingRange Is Nothing _
            Then Set rngResultingRange = rngSection _
            Else Set rngResultingRange = Application.Union(rngResultingRange, rngSection)
        End If
'
' Identify if there is anything left of the intersection.
'
        Set rngSection = Nothing
        If arrWholeArea(1) < arrIntersection(1) _
        Then Set rngSection = Application.Intersect(rngWholeArea.Parent.Range(rngWholeArea.Parent.Cells(arrWholeArea(0), arrWholeArea(1)), _
                                                                              rngWholeArea.Parent.Cells(arrWholeArea(2), arrIntersection(1) - 1)), _
                                                    rngUnion)
        If Not rngSection Is Nothing _
        Then
            If rngResultingRange Is Nothing _
            Then Set rngResultingRange = rngSection _
            Else Set rngResultingRange = Application.Union(rngResultingRange, rngSection)
        End If
'
' Identify if there is anything right of the intersection.
'
        Set rngSection = Nothing
        If arrWholeArea(3) > arrIntersection(3) _
        Then Set rngSection = Application.Intersect(rngWholeArea.Parent.Range(rngWholeArea.Parent.Cells(arrWholeArea(0), arrIntersection(3) + 1), _
                                                                              rngWholeArea.Parent.Cells(arrWholeArea(2), arrWholeArea(3))), _
                                                    rngUnion)
        If Not rngSection Is Nothing _
        Then
            If rngResultingRange Is Nothing _
            Then Set rngResultingRange = rngSection _
            Else Set rngResultingRange = Application.Union(rngResultingRange, rngSection)
        End If
'
' Identify if there is anything below the intersection.
'
        Set rngSection = Nothing
        If arrWholeArea(2) > arrIntersection(2) _
        Then Set rngSection = Application.Intersect(rngWholeArea.Parent.Range(rngWholeArea.Parent.Cells(arrIntersection(2) + 1, arrWholeArea(1)), _
                                                                              rngWholeArea.Parent.Cells(arrWholeArea(2), arrWholeArea(3))), _
                                                    rngUnion)
        If Not rngSection Is Nothing _
        Then
            If rngResultingRange Is Nothing _
            Then Set rngResultingRange = rngSection _
            Else Set rngResultingRange = Application.Union(rngResultingRange, rngSection)
        End If
        Set rngUnion = rngResultingRange
        Set rngResultingRange = Nothing
    Next rngIntersectArea
'
' Return the result. This is the area "around" the intersection.
'
    Set UnionExclusive = rngUnion
End Function

With a little hacking one can modify the code to exclude any area outside the first range passed as a parameter. For me the need was getting everything but the cells in common, i.e. the opposite of a Union.

This is a little test that uses colour marking to show the effect:

Sub Test()
Dim r As Excel.Range

ActiveSheet.Cells.Clear

Set r = UnionExclusive([A2:C10], [B1:B15])
r.Interior.ColorIndex = 6

Set r = UnionExclusive([F2:H11], [G4:H5,G8:H9,J10:J11,F14:J14])
r.Interior.ColorIndex = 7

Set r = UnionExclusive([F17:J26], [G17:G21,G24:G26,I17:I26,J19:J20])
r.Interior.ColorIndex = 43

The entire story can be found here: https://dutchgemini.wordpress.com/2020/02/28/obtain-a-union-exclusive-range-from-excel-via-vba/

Enjoy.

Upvotes: 0

newman
newman

Reputation: 1

Although iterative and not recursive, here's my solution. The function returns the rangeA subtracted by rangeB

public Function SubtractRange(rangeA Range, rangeB as Range) as Range
'rangeA is a range to subtract from
'rangeB is the range we want to subtract

 Dim existingRange As Range
  Dim resultRange As Range
  Set existingRange = rangeA
  Set resultRange = Nothing
  Dim c As Range
  For Each c In existingRange
  If Intersect(c, rangeB) Is Nothing Then
    If resultRange Is Nothing Then
      Set resultRange = c
    Else
      Set resultRange = Union(c, resultRange)
    End If
  End If
  Next c
  Set SubtractRange = resultRange
End Sub

Upvotes: 0

Qbik
Qbik

Reputation: 6157

My solution is shorter but I don't know if it is optimal one:

Sub RangeSubtraction()

    Dim firstRange As Range
    Dim secondRange As Range
    Dim rIntersect As Range
    Dim rOutput As Range
    Dim x As Range

    Set firstRange = Range("A1:B10")
    Set secondRange = Range("A5:B10")

    Set rIntersect = Intersect(firstRange, secondRange)

    For Each x In firstRange
        If Intersect(rIntersect, x) Is Nothing Then
            If rOutput Is Nothing Then 'ugly 'if-else' but needed, can't use Union(Nothing, Range("A1")) etc.
                Set rOutput = x
            Else
                Set rOutput = Application.Union(rOutput, x)
            End If
        End If
    Next x

    Msgbox rOutput.Address

End Sub

Upvotes: 1

Dick Kusleika
Dick Kusleika

Reputation: 33175

Your divide and conquer seems like a good way to go. You need to introduce some recursion and should be reasonably fast

Private mrBuild As Range

Public Function SubtractRanges(rFirst As Range, rSecond As Range) As Range

    Dim rInter As Range
    Dim rReturn As Range
    Dim rArea As Range

    Set rInter = Intersect(rFirst, rSecond)
    Set mrBuild = Nothing

    If rInter Is Nothing Then 'No overlap
        Set rReturn = rFirst
    ElseIf rInter.Address = rFirst.Address Then 'total overlap
        Set rReturn = Nothing
    Else 'partial overlap
        For Each rArea In rFirst.Areas
            BuildRange rArea, rInter
        Next rArea
        Set rReturn = mrBuild
    End If

    Set SubtractRanges = rReturn

End Function

Sub BuildRange(rArea As Range, rInter As Range)

    Dim rLeft As Range, rRight As Range
    Dim rTop As Range, rBottom As Range

    If Intersect(rArea, rInter) Is Nothing Then 'no overlap
        If mrBuild Is Nothing Then
            Set mrBuild = rArea
        Else
            Set mrBuild = Union(mrBuild, rArea)
        End If
    Else 'some overlap
        If rArea.Columns.Count = 1 Then 'we've exhausted columns, so split on rows
            If rArea.Rows.Count > 1 Then 'if one cell left, don't do anything
                Set rTop = rArea.Resize(rArea.Rows.Count \ 2) 'split the range top to bottom
                Set rBottom = rArea.Resize(rArea.Rows.Count - rTop.Rows.Count).Offset(rTop.Rows.Count)
                BuildRange rTop, rInter 'rerun it
                BuildRange rBottom, rInter
            End If
        Else
            Set rLeft = rArea.Resize(, rArea.Columns.Count \ 2) 'split the range left to right
            Set rRight = rArea.Resize(, rArea.Columns.Count - rLeft.Columns.Count).Offset(, rLeft.Columns.Count)
            BuildRange rLeft, rInter 'rerun it
            BuildRange rRight, rInter
        End If
    End If

End Sub

These aren't particularly huge ranges, but they all ran fast

?subtractranges(rangE("A1"),range("a10")).Address
$A$1
?subtractranges(range("a1"),range("a1")) is nothing
True
?subtractranges(range("$B$3,$B$6,$C$8:$W$39"),range("a1:C10")).Address
$C$11:$C$39,$D$8:$W$39
?subtractranges(range("a1:C10"),range("$B$3,$B$6,$C$8:$W$39")).Address
$A$1:$A$10,$B$1:$B$2,$B$4:$B$5,$B$7:$B$10,$C$1:$C$7

Upvotes: 6

Related Questions