Reputation: 404
I'm trying to write a function to subtract Excel ranges. It should take two input parameters: range A and range B. It should return a range object consisting of cells that are part of range A and are not part of range B (as in set subtraction)
I've seen some examples on the web that use a temporary worksheet to do this (fast, but might introduce some issues with protected workbooks and such) and some other examples that go cell by cell through the first range checking for intersections with the second one (extremely slow).
After some thinking I've come up with this code {1}, which works faster, but still is slow. Subtracting from a range representing the whole worksheet takes from 1 to 5 minutes depending on how complex the second range is.
When I looked over that code trying to find ways to make it faster I saw a possibility for applying the divide-and-conquer paradigm, which I did {2}. But that had made my code slower instead. I'm not much of a CS guy, so I might have done something wrong or this algorithm simply is not the one the divide-and-conquer should be used on, I don't know.
I have also tried rewriting it using mostly recursion, but that took forever to finish or (more often) had thrown Out of Stack Space errors. I didn't save the code.
The only (marginally) successful improvement I've been able to do is adding a flip switch {3} and going first through rows, then (in the next call) through columns instead of going through both in the same call, but the effect was not as good as I've hoped. Now I see that even though we don't go through all rows in the first call, in the second call we still loop through the same amount of rows we would in the first one, only these rows are a little bit shorter :)
I would appreciate any help in improving or rewriting this function, thank you!
Dick Kusleika, thank you very much for providing your answer! I think I'll use it with some modifications I've made:
With these modifications the code runs very fast on the most of common cases. As it's been pointed out, it will still be slow with checkerboard-style huge range which I agree is unavoidable.
I think this code still has a room for improvement and I'll update this post in case I modify it.
Improvement possibilities:
{0} Solution code
BUGGY! See reply from SteveS below
Update 2023-07-07
After posting this code to StackOverflow I started using it and hitting its limitations. Fixed version has become significantly more complex and yet it is not perfect, large merged cells may trip it up. Here is the code I've been using for the past 9 years: https://gist.github.com/sio/7423c651d7367de886bb256c4de5e45e
Original buggy version below:
Public Function SubtractRanges(rFirst As Range, rSecond As Range) As Range
'
' Returns a range of cells that are part of rFirst, but not part of rSecond
' (as in set subtraction)
'
' This function handles big input ranges really well!
'
' The reason for having a separate recursive function is
' handling multi-area rFirst range
'
Dim rInter As Range
Dim rReturn As Range
Dim rArea As Range
Set rInter = Intersect(rFirst, rSecond)
Set mrBuild = Nothing
If rInter Is Nothing Then 'no overlap
Set rReturn = rFirst
ElseIf rInter.Address = rFirst.Address Then 'total overlap
Set rReturn = Nothing
Else 'partial overlap
For Each rArea In rFirst.Areas
Set mrBuild = BuildRange(rArea, rInter) 'recursive
Next rArea
Set rReturn = mrBuild
End If
Set SubtractRanges = rReturn
End Function
Private Function BuildRange(rArea As Range, rInter As Range, _
Optional mrBuild As Range = Nothing) As Range
'
' Recursive function for SubtractRanges()
'
' Subtracts rInter from rArea and adds the result to mrBuild
'
Dim rLeft As Range, rRight As Range
Dim rTop As Range, rBottom As Range
Dim rInterSub As Range
Dim GoByColumns As Boolean
Set rInterSub = Intersect(rArea, rInter)
If rInterSub Is Nothing Then 'no overlap
If mrBuild Is Nothing Then
Set mrBuild = rArea
Else
Set mrBuild = Union(mrBuild, rArea)
End If
ElseIf Not rInterSub.Address = rArea.Address Then 'some overlap
If Not rArea.Cells.CountLarge = 1 Then 'just in case there is only one cell for some impossible reason
' Decide whether to go by columns or by rows
' (helps when subtracting whole rows/columns)
If Not rInterSub.Columns.Count = rArea.Columns.Count And _
((Not rInterSub.Cells.CountLarge = 1 And _
(rInterSub.Rows.Count > rInterSub.Columns.Count _
And rArea.Columns.Count > 1) Or (rInterSub.Rows.Count = 1 _
And Not rArea.Columns.Count = 1)) Or _
(rInterSub.Cells.CountLarge = 1 _
And rArea.Columns.Count > rArea.Rows.Count)) Then
GoByColumns = True
Else
GoByColumns = False
End If
If Not GoByColumns Then
Set rTop = rArea.Resize(rArea.Rows.Count \ 2) 'split the range top to bottom
Set rBottom = rArea.Resize(rArea.Rows.Count - rTop.Rows.Count).Offset(rTop.Rows.Count)
Set mrBuild = BuildRange(rTop, rInterSub, mrBuild) 'rerun it
Set mrBuild = BuildRange(rBottom, rInterSub, mrBuild)
Else
Set rLeft = rArea.Resize(, rArea.Columns.Count \ 2) 'split the range left to right
Set rRight = rArea.Resize(, rArea.Columns.Count - rLeft.Columns.Count).Offset(, rLeft.Columns.Count)
Set mrBuild = BuildRange(rLeft, rInterSub, mrBuild) 'rerun it
Set mrBuild = BuildRange(rRight, rInterSub, mrBuild)
End If
End If
End If
Set BuildRange = mrBuild
End Function
{1} Initial code (go row by row, column by column)
Function SubtractRanges(RangeA, RangeB) As Range
'
' Returns a range of cells that are part of RangeA, but not part of RangeB
'
' This function handles big RangeA pretty well (took less than a minute
' on my computer with RangeA = ActiveSheet.Cells)
'
Dim CommonArea As Range
Dim Result As Range
Set CommonArea = Intersect(RangeA, RangeB)
If CommonArea Is Nothing Then
Set Result = RangeA
ElseIf CommonArea.Address = RangeA.Address Then
Set Result = Nothing
Else
'a routine to deal with A LOT of cells in RangeA
'go column by column, then row by row
Dim GoodCells As Range
Dim UnworkedCells As Range
For Each Area In RangeA.Areas
For Each Row In Area.Rows
Set RowCommonArea = Intersect(Row, CommonArea)
If Not RowCommonArea Is Nothing Then
If Not RowCommonArea.Address = Row.Address Then
Set UnworkedCells = AddRanges(UnworkedCells, Row)
End If
Else
Set GoodCells = AddRanges(GoodCells, Row)
End If
Next Row
For Each Column In Area.Columns
Set ColumnCommonArea = Intersect(Column, CommonArea)
If Not ColumnCommonArea Is Nothing Then
If Not ColumnCommonArea.Address = Column.Address Then
Set UnworkedCells = AddRanges(UnworkedCells, Column)
End If
Else
Set GoodCells = AddRanges(GoodCells, Column)
End If
Next Column
Next Area
If Not UnworkedCells Is Nothing Then
For Each Area In UnworkedCells
Set GoodCells = AddRanges(GoodCells, SubtractRanges(Area, CommonArea))
Next Area
End If
Set Result = GoodCells
End If
Set SubtractRanges = Result
End Function
{2} Divide and conquer
Function SubtractRanges(RangeA, RangeB) As Range
'
' Returns a range of cells that are part of RangeA, but not part of RangeB
'
Dim CommonArea As Range
Dim Result As Range
Set CommonArea = Intersect(RangeA, RangeB)
If CommonArea Is Nothing Then
Set Result = RangeA
ElseIf CommonArea.Address = RangeA.Address Then
Set Result = Nothing
Else
'a routine to deal with A LOT of cells in RangeA
'go column by column, then row by row
Dim GoodCells As Range
Dim UnworkedCells As Range
For Each Area In RangeA.Areas
RowsNumber = Area.Rows.Count
If RowsNumber > 1 Then
Set RowsLeft = Range(Area.Rows(1), Area.Rows(RowsNumber / 2))
Set RowsRight = Range(Area.Rows(RowsNumber / 2 + 1), Area.Rows(RowsNumber))
Else
Set RowsLeft = Area
Set RowsRight = CommonArea.Cells(1, 1) 'the next best thing to Nothing - will end its cycle rather fast and won't throw an error with For Each statement
End If
For Each Row In Array(RowsLeft, RowsRight)
Set RowCommonArea = Intersect(Row, CommonArea)
If Not RowCommonArea Is Nothing Then
If Not RowCommonArea.Address = Row.Address Then
Set UnworkedCells = AddRanges(UnworkedCells, Row)
End If
Else
Set GoodCells = AddRanges(GoodCells, Row)
End If
Next Row
ColumnsNumber = Area.Columns.Count
If ColumnsNumber > 1 Then
Set ColumnsLeft = Range(Area.Columns(1), Area.Columns(ColumnsNumber / 2))
Set ColumnsRight = Range(Area.Columns(ColumnsNumber / 2 + 1), Area.Columns(ColumnsNumber))
Else
Set ColumnsLeft = Area
Set ColumnsRight = CommonArea.Cells(1, 1)
End If
For Each Column In Array(ColumnsLeft, ColumnsRight)
Set ColumnCommonArea = Intersect(Column, CommonArea)
If Not ColumnCommonArea Is Nothing Then
If Not ColumnCommonArea.Address = Column.Address Then
Set UnworkedCells = AddRanges(UnworkedCells, Column)
End If
Else
Set GoodCells = AddRanges(GoodCells, Column)
End If
Next Column
Next Area
If Not UnworkedCells Is Nothing Then
For Each Area In UnworkedCells
Set GoodCells = AddRanges(GoodCells, SubtractRanges(Area, CommonArea))
Next Area
End If
Set Result = GoodCells
End If
Set SubtractRanges = Result
End Function
{3} Initial code + flip switch (row by row OR column by column in turns)
Function SubtractRanges(RangeA, RangeB, Optional Flip As Boolean = False) As Range
'
' Returns a range of cells that are part of RangeA, but not part of RangeB
'
' This function handles big RangeA pretty well (took less than a minute
' on my computer with RangeA = ActiveSheet.Cells)
'
Dim CommonArea As Range
Dim Result As Range
Set CommonArea = Intersect(RangeA, RangeB)
If CommonArea Is Nothing Then
Set Result = RangeA
ElseIf CommonArea.Address = RangeA.Address Then
Set Result = Nothing
Else
'a routine to deal with A LOT of cells in RangeA
'go column by column, then row by row
Dim GoodCells As Range
Dim UnworkedCells As Range
For Each Area In RangeA.Areas
If Flip Then
For Each Row In Area.Rows
Set RowCommonArea = Intersect(Row, CommonArea)
If Not RowCommonArea Is Nothing Then
If Not RowCommonArea.Address = Row.Address Then
Set UnworkedCells = AddRanges(UnworkedCells, Row)
End If
Else
Set GoodCells = AddRanges(GoodCells, Row)
End If
Next Row
Else
For Each Column In Area.Columns
Set ColumnCommonArea = Intersect(Column, CommonArea)
If Not ColumnCommonArea Is Nothing Then
If Not ColumnCommonArea.Address = Column.Address Then
Set UnworkedCells = AddRanges(UnworkedCells, Column)
End If
Else
Set GoodCells = AddRanges(GoodCells, Column)
End If
Next Column
End If
Next Area
If Not UnworkedCells Is Nothing Then
For Each Area In UnworkedCells
Set GoodCells = AddRanges(GoodCells, SubtractRanges(Area, CommonArea, Not Flip))
Next Area
End If
Set Result = GoodCells
End If
Set SubtractRanges = Result
End Function
A little helper function mentioned here and there:
Function AddRanges(RangeA, RangeB)
'
' The same as Union built-in but handles empty ranges fine.
'
If Not RangeA Is Nothing And Not RangeB Is Nothing Then
Set AddRanges = Union(RangeA, RangeB)
ElseIf RangeA Is Nothing And RangeB Is Nothing Then
Set AddRanges = Nothing
Else
If RangeA Is Nothing Then
Set AddRanges = RangeB
Else
Set AddRanges = RangeA
End If
End If
End Function
Upvotes: 10
Views: 24847
Reputation: 79
I've recently written a [rather fast] function in VBA that I named UnionExclusive()
which returns the Union between 2 ranges of cells –with multiple areas allowed for each range– with the exclusion of the range of cells they have in common. It practically uses only Application.Union()
and Application.Intersect()
and does not loop single cells.
[Edit] Note: the code does not [yet] capture situations where the second range intersects multiple times with the first range, as with Application.Intersect(r1, r2).AreasCount > 1
so you better check before calling this function.
Function UnionExclusive(ByRef r1 As Excel.Range, r2 As Excel.Range) As Excel.Range
'
' This function returns the range of cells that is the Union of both ranges with the
' exclusion of the ranges or cells that they have in common.
'
On Error Resume Next
Dim rngWholeArea As Excel.Range
Dim rngIndividualArea As Excel.Range
Dim rngIntersection As Excel.Range
Dim rngIntersectArea As Excel.Range
Dim rngUnion As Excel.Range
Dim rngSection As Excel.Range
Dim rngResultingRange As Excel.Range
Dim lngWholeTop As Long
Dim lngWholeLeft As Long
Dim lngWholeBottom As Long
Dim lngWholeRight As Long
Dim arrIntersection As Variant
Dim arrWholeArea As Variant
'
' Must be on same sheet, return only first range.
'
If Not r1.Parent Is r2.Parent Then Set UnionExclusive = r1: Exit Function
'
' No overlapping cells, return the union.
'
If Application.Intersect(r1, r2) Is Nothing Then Set UnionExclusive = Application.Union(r1, r2): Exit Function
'
' Range to subtract must be contiguous. If the second range has multiple areas, loop through all the individual areas.
'
If (r2.Areas.Count > 1) _
Then
Set rngResultingRange = r1
For Each rngIndividualArea In r2.Areas
Set rngResultingRange = UnionExclusive(rngResultingRange, rngIndividualArea)
Next rngIndividualArea
Set UnionExclusive = rngResultingRange
Exit Function
End If
'
' Get the overall size of the Union() since Rows/Columns "Count" is based on the first area only.
'
Set rngUnion = Application.Union(r1, r2)
For Each rngIndividualArea In rngUnion.Areas
If (lngWholeTop = 0) Then lngWholeTop = rngIndividualArea.Row Else lngWholeTop = Application.WorksheetFunction.Min(lngWholeTop, rngIndividualArea.Row)
If (lngWholeLeft = 0) Then lngWholeLeft = rngIndividualArea.Column Else lngWholeLeft = Application.WorksheetFunction.Min(lngWholeLeft, rngIndividualArea.Column)
If (lngWholeBottom = 0) Then lngWholeBottom = (rngIndividualArea.Row + rngIndividualArea.Rows.Count - 1) Else lngWholeBottom = Application.WorksheetFunction.Max(lngWholeBottom, (rngIndividualArea.Row + rngIndividualArea.Rows.Count - 1))
If (lngWholeRight = 0) Then lngWholeRight = (rngIndividualArea.Column + rngIndividualArea.Columns.Count - 1) Else lngWholeRight = Application.WorksheetFunction.Max(lngWholeRight, (rngIndividualArea.Column + rngIndividualArea.Columns.Count - 1))
Next rngIndividualArea
arrWholeArea = Array(lngWholeTop, lngWholeLeft, lngWholeBottom, lngWholeRight)
'
' Get the entire area covered by the various areas.
'
Set rngWholeArea = rngUnion.Parent.Range(rngUnion.Parent.Cells(lngWholeTop, lngWholeLeft), rngUnion.Parent.Cells(lngWholeBottom, lngWholeRight))
'
' Get intersection, this is or are the area(s) to remove.
'
Set rngIntersection = Application.Intersect(r1, r2)
For Each rngIntersectArea In rngIntersection.Areas
arrIntersection = Array(rngIntersectArea.Row, _
rngIntersectArea.Column, _
rngIntersectArea.Row + rngIntersectArea.Rows.Count - 1, _
rngIntersectArea.Column + rngIntersectArea.Columns.Count - 1)
'
' Get the difference. This is the whole area above, left, below and right of the intersection.
' Identify if there is anything above the intersection.
'
Set rngSection = Nothing
If (arrWholeArea(0) < arrIntersection(0)) _
Then Set rngSection = Application.Intersect(rngWholeArea.Parent.Range(rngWholeArea.Parent.Cells(arrWholeArea(0), arrWholeArea(1)), _
rngWholeArea.Parent.Cells(arrIntersection(0) - 1, arrWholeArea(3))), _
rngUnion)
If Not rngSection Is Nothing _
Then
If rngResultingRange Is Nothing _
Then Set rngResultingRange = rngSection _
Else Set rngResultingRange = Application.Union(rngResultingRange, rngSection)
End If
'
' Identify if there is anything left of the intersection.
'
Set rngSection = Nothing
If arrWholeArea(1) < arrIntersection(1) _
Then Set rngSection = Application.Intersect(rngWholeArea.Parent.Range(rngWholeArea.Parent.Cells(arrWholeArea(0), arrWholeArea(1)), _
rngWholeArea.Parent.Cells(arrWholeArea(2), arrIntersection(1) - 1)), _
rngUnion)
If Not rngSection Is Nothing _
Then
If rngResultingRange Is Nothing _
Then Set rngResultingRange = rngSection _
Else Set rngResultingRange = Application.Union(rngResultingRange, rngSection)
End If
'
' Identify if there is anything right of the intersection.
'
Set rngSection = Nothing
If arrWholeArea(3) > arrIntersection(3) _
Then Set rngSection = Application.Intersect(rngWholeArea.Parent.Range(rngWholeArea.Parent.Cells(arrWholeArea(0), arrIntersection(3) + 1), _
rngWholeArea.Parent.Cells(arrWholeArea(2), arrWholeArea(3))), _
rngUnion)
If Not rngSection Is Nothing _
Then
If rngResultingRange Is Nothing _
Then Set rngResultingRange = rngSection _
Else Set rngResultingRange = Application.Union(rngResultingRange, rngSection)
End If
'
' Identify if there is anything below the intersection.
'
Set rngSection = Nothing
If arrWholeArea(2) > arrIntersection(2) _
Then Set rngSection = Application.Intersect(rngWholeArea.Parent.Range(rngWholeArea.Parent.Cells(arrIntersection(2) + 1, arrWholeArea(1)), _
rngWholeArea.Parent.Cells(arrWholeArea(2), arrWholeArea(3))), _
rngUnion)
If Not rngSection Is Nothing _
Then
If rngResultingRange Is Nothing _
Then Set rngResultingRange = rngSection _
Else Set rngResultingRange = Application.Union(rngResultingRange, rngSection)
End If
Set rngUnion = rngResultingRange
Set rngResultingRange = Nothing
Next rngIntersectArea
'
' Return the result. This is the area "around" the intersection.
'
Set UnionExclusive = rngUnion
End Function
With a little hacking one can modify the code to exclude any area outside the first range passed as a parameter. For me the need was getting everything but the cells in common, i.e. the opposite of a Union.
This is a little test that uses colour marking to show the effect:
Sub Test()
Dim r As Excel.Range
ActiveSheet.Cells.Clear
Set r = UnionExclusive([A2:C10], [B1:B15])
r.Interior.ColorIndex = 6
Set r = UnionExclusive([F2:H11], [G4:H5,G8:H9,J10:J11,F14:J14])
r.Interior.ColorIndex = 7
Set r = UnionExclusive([F17:J26], [G17:G21,G24:G26,I17:I26,J19:J20])
r.Interior.ColorIndex = 43
The entire story can be found here: https://dutchgemini.wordpress.com/2020/02/28/obtain-a-union-exclusive-range-from-excel-via-vba/
Enjoy.
Upvotes: 0
Reputation: 1
Although iterative and not recursive, here's my solution.
The function returns the rangeA
subtracted by rangeB
public Function SubtractRange(rangeA Range, rangeB as Range) as Range
'rangeA is a range to subtract from
'rangeB is the range we want to subtract
Dim existingRange As Range
Dim resultRange As Range
Set existingRange = rangeA
Set resultRange = Nothing
Dim c As Range
For Each c In existingRange
If Intersect(c, rangeB) Is Nothing Then
If resultRange Is Nothing Then
Set resultRange = c
Else
Set resultRange = Union(c, resultRange)
End If
End If
Next c
Set SubtractRange = resultRange
End Sub
Upvotes: 0
Reputation: 6157
My solution is shorter but I don't know if it is optimal one:
Sub RangeSubtraction()
Dim firstRange As Range
Dim secondRange As Range
Dim rIntersect As Range
Dim rOutput As Range
Dim x As Range
Set firstRange = Range("A1:B10")
Set secondRange = Range("A5:B10")
Set rIntersect = Intersect(firstRange, secondRange)
For Each x In firstRange
If Intersect(rIntersect, x) Is Nothing Then
If rOutput Is Nothing Then 'ugly 'if-else' but needed, can't use Union(Nothing, Range("A1")) etc.
Set rOutput = x
Else
Set rOutput = Application.Union(rOutput, x)
End If
End If
Next x
Msgbox rOutput.Address
End Sub
Upvotes: 1
Reputation: 33175
Your divide and conquer seems like a good way to go. You need to introduce some recursion and should be reasonably fast
Private mrBuild As Range
Public Function SubtractRanges(rFirst As Range, rSecond As Range) As Range
Dim rInter As Range
Dim rReturn As Range
Dim rArea As Range
Set rInter = Intersect(rFirst, rSecond)
Set mrBuild = Nothing
If rInter Is Nothing Then 'No overlap
Set rReturn = rFirst
ElseIf rInter.Address = rFirst.Address Then 'total overlap
Set rReturn = Nothing
Else 'partial overlap
For Each rArea In rFirst.Areas
BuildRange rArea, rInter
Next rArea
Set rReturn = mrBuild
End If
Set SubtractRanges = rReturn
End Function
Sub BuildRange(rArea As Range, rInter As Range)
Dim rLeft As Range, rRight As Range
Dim rTop As Range, rBottom As Range
If Intersect(rArea, rInter) Is Nothing Then 'no overlap
If mrBuild Is Nothing Then
Set mrBuild = rArea
Else
Set mrBuild = Union(mrBuild, rArea)
End If
Else 'some overlap
If rArea.Columns.Count = 1 Then 'we've exhausted columns, so split on rows
If rArea.Rows.Count > 1 Then 'if one cell left, don't do anything
Set rTop = rArea.Resize(rArea.Rows.Count \ 2) 'split the range top to bottom
Set rBottom = rArea.Resize(rArea.Rows.Count - rTop.Rows.Count).Offset(rTop.Rows.Count)
BuildRange rTop, rInter 'rerun it
BuildRange rBottom, rInter
End If
Else
Set rLeft = rArea.Resize(, rArea.Columns.Count \ 2) 'split the range left to right
Set rRight = rArea.Resize(, rArea.Columns.Count - rLeft.Columns.Count).Offset(, rLeft.Columns.Count)
BuildRange rLeft, rInter 'rerun it
BuildRange rRight, rInter
End If
End If
End Sub
These aren't particularly huge ranges, but they all ran fast
?subtractranges(rangE("A1"),range("a10")).Address
$A$1
?subtractranges(range("a1"),range("a1")) is nothing
True
?subtractranges(range("$B$3,$B$6,$C$8:$W$39"),range("a1:C10")).Address
$C$11:$C$39,$D$8:$W$39
?subtractranges(range("a1:C10"),range("$B$3,$B$6,$C$8:$W$39")).Address
$A$1:$A$10,$B$1:$B$2,$B$4:$B$5,$B$7:$B$10,$C$1:$C$7
Upvotes: 6