barndog
barndog

Reputation: 7183

Swift Array diff

Given two arrays, where one is the old set of values and the other is the new values, I want to find the "diff" of those two arrays such updates to the original array can be represented as:

enum CollectionChange<T: SequenceType> {
    case Initial(T)
    case Update(T, deletions: [Int], insertions: [Int], modifications: [Int]) 
}

I'm trying to build a simpler version of this where the changes object is built based on object equality, instead of indexes as RAC-MutableCollectionProperty is (for which the code is here and what might be the most complicated bit of code I've seen in a while; no documentation doesn't help).

Also important for this project is the ability to be able to observe changes to an array at any level of granularity. For example, a one-dimensional array, restricting T to Equatable, is a relatively easy use case. You can, as RAC-MutableCollectionProperty build up some sort of table that describes the changes, checking for equality on the objects. However once you get down to using two-dimensional arrays and deeper it gets a bit trickier because not only do you have to diff the elements at the lowest level but also describe section-level removals. In practice, no more than 2D arrays is really ever necessary but it'd be nice to have a solution that works regardless of the array depth. I'm not necessarily looking for a solution (although that'd be fantastic), really just any pointers and high level solutions on how to approach this problem.

One way I've thought of to observe multiple array levels is to write a diffing function that works on single dimensional arrays and construct a property such that:

let property: MutableCollectionProperty<MutableCollectionProperty<Int>>

where the property would check if its generic type is of it's own type. I'd have to change the changes description to something closer to

enum Changes<T> {
    case Initial(T)
    case Update(T, deletions: [NSIndexPath], insertions: [NSIndexPath], modifications: [NSIndexPath])
}

or maybe something like

enum Changes<T> {
    case Initial(T)
    case UpdateSections(sections: [T], deletions:[Int], insertions: [Int], modifications: [Int])
    case UpdateIndexes(T, deletions: [Int], insertions: [Int], modifications: [Int])
}

These are just my preliminary thoughts though, I'm open to any solution or suggestion.

BOUNTY EDIT:

The bounty will be awarded to someone who can provide a solution that given the following parameters:

a change set can be generated where a change set describes:

Changes only have to be described at the lowest level of the array (no need to worry about insertion & removal of higher segments, although you'd really earn the 300 rep with that) but change indexes must indicate the nested index path.

For example, if the array is a 3d array and an object at array[0][5][2] was deleted, the resulting index change should be an array [0, 5, 2]. That array describes a single deletion and all the deletions would be of type [[Int]].

Edit:

I'm removing the requirement of the arrays being of any depth. Let's say that they're simply 1d arrays.

Upvotes: 15

Views: 6984

Answers (3)

Robin
Robin

Reputation: 1

If you only need to compute the difference between two arrays, here's an alternative implementation based on shawkinaw answer:

typealias Insertions = [Int]
typealias Deletions = [Int]
typealias ChangeSet = (Insertions, Deletions)

func Diff<T: Equatable>(objects: [T], originalObjects: [T]) -> ChangeSet {
    guard objects.count > 0 && originalObjects.count > 0 else { return ChangeSet([], []) }

    let insertedObjects = objects.filter({ !originalObjects.contains($0) })
    let insertionIndicies = insertedObjects.compactMap({ objects.index(of: $0) })

    let deletedObjects = originalObjects.filter({ !objects.contains($0) })
    let deletionIndicies = deletedObjects.compactMap({ originalObjects.index(of: $0) })

    return ChangeSet(insertionIndicies, deletionIndicies)
}

The insertionIndicies is an array of type Int. Each Int in the array refers to the indicies where the originalObjects array need to insert items from the objects array.

The deletionIndicies is an array of type Int. Each Int in the array refers to the indicies where the originalObjects array has to delete items.

Upvotes: 0

shawkinaw
shawkinaw

Reputation: 3180

I'm not sure this meets all your bounty requirements, but I'll post some code I use for computing arrays differences:

func arrayInsertionDeletionAndNoopIndexes<T: Equatable>(objects: [T], originalObjects: [T]) -> ([Int], [Int], [Int]) {
    let insertions = objects.filter({ !originalObjects.contains($0) }).map({ objects.index(of: $0)! })
    let noops = originalObjects.filter({ objects.contains($0) }).map({ originalObjects.index(of: $0)! })
    let deletions = originalObjects.filter({ !objects.contains($0) }).map({ originalObjects.index(of: $0)! })

    return (insertions, deletions, noops)
}

func arrayInsertionDeletionAndNoopIndexPaths<T: Equatable>(objects: [T], originalObjects: [T], section: Int = 0) -> ([IndexPath], [IndexPath], [IndexPath]) {
    let (insertions, deletions, noops) = arrayInsertionDeletionAndNoopIndexes(objects: objects, originalObjects: originalObjects)

    let insertionIndexPaths = insertions.map({ IndexPath(row: $0, section: section) })
    let deletionIndexPaths = deletions.map({ IndexPath(row: $0, section: section) })
    let noopIndexPaths = noops.map({ IndexPath(row: $0, section: section) })

    return (insertionIndexPaths, deletionIndexPaths, noopIndexPaths)
}

My specific use case is for computing differences to update a UITableView, for which purpose I also have the following:

extension UITableView {

    func insertAndDeleteCellsForObjects<T: Equatable>(objects: [T], originalObjects: [T], section: Int = 0) {
        let (insertions, deletions, _) = arrayInsertionDeletionAndNoopIndexPaths(objects: objects, originalObjects: originalObjects, section: section)

        if insertions.count > 0 || deletions.count > 0 {
            beginUpdates()
            insertRows(at: insertions, with: .automatic)
            deleteRows(at: deletions, with: .automatic)
            endUpdates()
        }
    }

}

Upvotes: 12

bzz
bzz

Reputation: 5596

As of Swift 2.2, this is impossible. You give the following requirements:

  • both arrays of type T: Equatable
  • both arrays can be of any depth

But the ability to make a constrained extension conform to a new protocol is only planned for Swift 3.0, so right now you can't make extension Array where Element: Array<Equatable> conform to Equatable protocol. This means that only 1d arrays can be of of type T: Equatable.

EDIT:

Basically what you need to do is to write an algorithm that solves Longest common subsequence problem. For 1d arrays you can use Dwifft library which solves the problem in the following way:

public extension Array where Element: Equatable {
    public func diff(other: [Element]) -> Diff<Element> {
        let table = MemoizedSequenceComparison.buildTable(self, other, self.count, other.count)
        return Array.diffFromIndices(table, self, other, self.count, other.count)
    }

    private static func diffFromIndices(table: [[Int]], _ x: [Element], _ y: [Element], _ i: Int, _ j: Int) -> Diff<Element> {
        if i == 0 && j == 0 {
            return Diff<Element>(results: [])
        } else if i == 0 {
            return diffFromIndices(table, x, y, i, j-1) + DiffStep.Insert(j-1, y[j-1])
        } else if j == 0 {
            return diffFromIndices(table, x, y, i - 1, j) + DiffStep.Delete(i-1, x[i-1])
        } else if table[i][j] == table[i][j-1] {
            return diffFromIndices(table, x, y, i, j-1) + DiffStep.Insert(j-1, y[j-1])
        } else if table[i][j] == table[i-1][j] {
            return diffFromIndices(table, x, y, i - 1, j) + DiffStep.Delete(i-1, x[i-1])
        } else {
            return diffFromIndices(table, x, y, i-1, j-1)
        }
    }
}

internal struct MemoizedSequenceComparison<T: Equatable> {
    static func buildTable(x: [T], _ y: [T], _ n: Int, _ m: Int) -> [[Int]] {
        var table = Array(count: n + 1, repeatedValue: Array(count: m + 1, repeatedValue: 0))
        for i in 0...n {
            for j in 0...m {
                if (i == 0 || j == 0) {
                    table[i][j] = 0
                }
                else if x[i-1] == y[j-1] {
                    table[i][j] = table[i-1][j-1] + 1
                } else {
                    table[i][j] = max(table[i-1][j], table[i][j-1])
                }
            }
        }
        return table
    }
}

Upvotes: 7

Related Questions