Reputation: 1725
I'm looking to create a function that operates very closely to filter
, except that it keeps the non-matching results as well, and maintains sort order. For instance, say you wanted to filter out numbers divisible by 3 in an array and still keep the list of numbers that aren't divisible by 3.
filter
With filter
, you only get the list of numbers divisible by 3 and the original list stays unchanged. You can then filter the original list again with the opposite predicate, but this is an unnecessary second pass. The code looks like this:
let numbers = [1,2,3,4,5,6,7,8,9,10]
let divisibleBy3 = numbers.filter { $0 % 3 == 0 } // [3,6,9]
let theRest = numbers.filter { $0 % 3 != 0 } // [1,2,4,5,7,8,10]
It's true that this is pretty readable, but the fact that it does 2 passes seems inefficient to me, especially if the predicate were more complicated. That's twice as many checks as are actually needed.
separate
function in Collection
extensionMy next attempt was extending Collection
and making a function I called separate
. This function would take the collection and go through the elements one at a time and add them to either the matching list or the non-matching list. The code looks like this:
extension Collection {
func separate(predicate: (Generator.Element) -> Bool) -> (matching: [Generator.Element], notMatching: [Generator.Element]) {
var groups: ([Generator.Element],[Generator.Element]) = ([],[])
for element in self {
if predicate(element) {
groups.0.append(element)
} else {
groups.1.append(element)
}
}
return groups
}
}
I can then use the function like this:
let numbers = [1,2,3,4,5,6,7,8,9,10]
let groups = numbers.separate { $0 % 3 == 0 }
let matching = groups.matching // [3,6,9]
let notMatching = groups.notMatching // [1,2,4,5,7,8,10]
This is also pretty clean, but the only thing I don't like is that I'm using a tuple as a return type. Maybe others will disagree, but I'd prefer returning the same type as self
for chaining. But technically, you can just grab either .matching
or .notMatching
, which is the same type as self
, and you can chain off of either of them.
removeIf
function in Array
extensionMy issue with separate
returning a tuple led me to try to make a function that would modify self
by removing the matches as it found them and adding them to a new list, and returning the matches list at the end. The returned list is your matches, and the array is trimmed of those values. Order is preserved in both arrays. The code looks like this:
extension Array {
mutating func removeIf(predicate: (Element) -> Bool) -> [Element] {
var removedCount: Int = 0
var removed: [Element] = []
for (index,element) in self.enumerated() {
if predicate(element) {
removed.append(self.remove(at: index-removedCount))
removedCount += 1
}
}
return removed
}
}
And it is used like this:
var numbers = [1,2,3,4,5,6,7,8,9,10]
let divisibleBy3 = numbers.removeIf { $0 % 3 == 0 }
// divisibleBy3: [3,6,9]
// numbers: [1,2,4,5,7,8,10]
This function had to be implemented in an extension of Array
, because the concept of removing an element at a particular index doesn't apply to regular Collections
(an Array
is defined as public struct Array<Element> : RandomAccessCollection, MutableCollection
, and it directly defines the remove(at:)
function, rather than getting it from inheritance or a protocol).
I'm a big fan of code reuse, and after coming up with Option 3, I realized I could probably reuse the separate
function from Option 2. I came up with this:
extension Array {
mutating func removeIf(predicate: (Element) -> Bool) -> [Element] {
let groups = self.separate(predicate: predicate)
self = groups.notMatching
return groups.matching
}
}
And it's used just like in Option 3.
I was concerned with performance, so I ran each Option through XCTest's measure
with 1000 iterations. These were the results:
Option 1: 9 ms
Option 2: 7 ms
Option 3: 10 ms
Option 4: 8 ms
I knew about partition
, but I wasn't going to consider it because it didn't preserve the sort order. negaipro's answer is essentially partition
, but it got me thinking. The idea with partition
is to swap elements that match with the pivot point, thus ensuring that everything on one side of the end pivot point will all match the predicate and the other side doesn't. I took that idea and changed the action to "move to the end". So matches are removed from their spot and added to the end.
extension Array {
mutating func swapIfModified(predicate: (Element) -> Bool) -> Int {
var matchCount = 0
var index = 0
while index < (count-matchCount) {
if predicate(self[index]) {
append(remove(at: index))
matchCount += 1
} else {
index += 1
}
}
return count-matchCount
}
}
Under my initial tests using an array with 10 numbers, it was comparable to the other Options. But I was concerned with the performance of the line append(remove(at: index))
. So I tried all the Options again with arrays going from 1 to 1000, and this option was definitely the slowest.
There isn't a big performance difference between these options. And since Option 4 was faster than Option 3 and reuses the code from Option 2, I'm inclined to throw out Option 3. So I'm leaning towards using plain old filter
when I don't care about the unfiltered results (and, likewise, for when I don't care about the filtered results, since it's just using the opposite predicate), and then using either separate
or removeIf
when I care about keeping both the filtered and unfiltered results.
So, am I missing something built into Swift that does this already? Is there a better way to accomplish this? Is my extension syntax missing anything (anything that could make it apply this concept to more areas, for example)?
Upvotes: 42
Views: 15488
Reputation: 875
There is a new Swift Algorithms open-source for sequence and collection algorithms, along with their related types.
You can use the stable partition from there
Methods for performing a stable partition on mutable collections, and for finding the partitioning index in an already partitioned collection.
The standard library’s existing partition(by:)
method, which re-orders the
elements in a collection into two partitions based on a given predicate, doesn’t
guarantee stability for either partition. That is, the order of the elements in
each partition doesn’t necessarily match their relative order in the original
collection. These new methods expand on the existing partition(by:)
by
providing stability for one or both partitions.
// existing partition(by:) - unstable ordering
var numbers = [10, 20, 30, 40, 50, 60, 70, 80]
let p1 = numbers.partition(by: { $0.isMultiple(of: 20) })
// p1 == 4
// numbers == [10, 70, 30, 50, 40, 60, 20, 80]
// new stablePartition(by:) - keeps the relative order of both partitions
numbers = [10, 20, 30, 40, 50, 60, 70, 80]
let p2 = numbers.stablePartition(by: { $0.isMultiple(of: 20) })
// p2 == 4
// numbers == [10, 30, 50, 70, 20, 40, 60, 80]
Since partitioning is frequently used in divide-and-conquer algorithms, we also include a variant that accepts a range parameter to avoid copying when mutating slices, as well as a range-based variant of the existing standard library partition.
The partitioningIndex(where:)
method returns the index of the start of the
second partition when called on an already partitioned collection.
let numbers = [10, 30, 50, 70, 20, 40, 60]
let p = numbers.partitioningIndex(where: { $0.isMultiple(of: 20) })
// numbers[..<p] == [10, 30, 50, 70]
// numbers[p...] = [20, 40, 60]
Upvotes: 5
Reputation:
Technically, this is not guaranteed to preserve order, but it does.
Dictionary(grouping: numbers) { $0.isMultiple(of: 3) }
https://github.com/apple/swift/blob/master/stdlib/public/core/NativeDictionary.swift
Upvotes: 9
Reputation: 15778
Swift 4 Solution
It reorders the origin array and returns start index of subarray satisfies the predicate.
In this example it returns 7.
0..<7 elemets aren't divisible by 3 and 7..n-1 elements are divisible by 3.
var numbers = [1,2,3,4,5,6,7,8,9,10]
let partition = numbers.partition(by: { $0 % 3 == 0 })
let divisibleBy3 = Array(numbers[..<partition]) //[3,6,9]
let theRest = Array(numbers[partition...]) //[1,2,4,5,7,8,10]
Upvotes: 7
Reputation: 48175
In WWDC 2018 session Embracing Algorithm they mention the function stablePartition
, you can take a look here https://github.com/apple/swift/blob/master/test/Prototypes/Algorithms.swift
extension Collection where Self : MutableCollectionAlgorithms {
@discardableResult
mutating func stablePartition(
isSuffixElement: (Element) throws -> Bool
) rethrows -> Index {
return try stablePartition(
count: count, isSuffixElement: isSuffixElement)
}
/// Moves all elements satisfying `isSuffixElement` into a suffix of the collection,
/// preserving their relative order, returning the start of the resulting suffix.
///
/// - Complexity: O(n) where n is the number of elements.
/// - Precondition: `n == self.count`
fileprivate mutating func stablePartition(
count n: Int, isSuffixElement: (Element) throws-> Bool
) rethrows -> Index {
if n == 0 { return startIndex }
if n == 1 {
return try isSuffixElement(self[startIndex]) ? startIndex : endIndex
}
let h = n / 2, i = index(startIndex, offsetBy: h)
let j = try self[..<i].stablePartition(
count: h, isSuffixElement: isSuffixElement)
let k = try self[i...].stablePartition(
count: n - h, isSuffixElement: isSuffixElement)
return self[j..<k].rotate(shiftingToStart: i)
}
}
Upvotes: 2
Reputation: 52221
For fewer elements this may be fastest.
extension Array {
func stablePartition(by condition: (Element) -> Bool) -> ([Element], [Element]) {
var matching = [Element]()
var nonMatching = [Element]()
for element in self {
if condition(element) {
matching.append(element)
} else {
nonMatching.append(element)
}
}
return (matching, nonMatching)
}
}
let numbers = [1,2,3,4,5,6,7,8,9,10]
let (divisibleBy3, theRest) = numbers.stablePartition { $0 % 3 == 0 }
print("divisible by 3: \(divisibleBy3), the rest: \(theRest)")
// divisible by 3: [3, 6, 9], the rest: [1, 2, 4, 5, 7, 8, 10]
For many elements this may be faster, because of fewer allocations. I have not measured performance.
extension Array {
public func stablePartition(by condition: (Element) throws -> Bool) rethrows -> ([Element], [Element]) {
var indexes = Set<Int>()
for (index, element) in self.enumerated() {
if try condition(element) {
indexes.insert(index)
}
}
var matching = [Element]()
matching.reserveCapacity(indexes.count)
var nonMatching = [Element]()
nonMatching.reserveCapacity(self.count - indexes.count)
for (index, element) in self.enumerated() {
if indexes.contains(index) {
matching.append(element)
} else {
nonMatching.append(element)
}
}
return (matching, nonMatching)
}
}
Upvotes: 2
Reputation: 481
// swap and return pivot
extension Array
{
// return pivot
mutating func swapIf(predicate: (Element) -> Bool) -> Int
{
var pivot = 0
for i in 0..<self.count
{
if predicate( self[i] )
{
if i > 0
{
swap(&self[i], &self[pivot])
}
pivot += 1
}
}
return pivot
}
}
This is my code and the concept is.. reduce memory usage.
I checked that 'swapIf' is 4-times faster than 'removeIf'.
Upvotes: 4
Reputation: 11539
let objects: [Int] = Array(1..<11)
let split = objects.reduce(([Int](), [Int]())) { (value, object) -> ([Int], [Int]) in
var value = value
if object % 2 == 0 {
value.1.append(object)
} else {
value.0.append(object)
}
return value
}
Upvotes: 6