Noah Wilder
Noah Wilder

Reputation: 1574

Custom Index for Linked List in Swift

Custom Index Type for Linked List

Swift 5.0, Xcode 10.3

I recently implemented a doubly linked list type in Swift. When I set out to make it, my goal was to give users most of the same ease of use as using an Array, but with the algorithmic complexities associated with doubly linked lists. With this goal in mind, I decided one of the primary ways I would achieve this would be making the Node type an implementation detail; out of sight and out of mind for the user. I also decided that LinkedList must be implemented as a struct so to provide support proper immutability/mutability.

Deciding on the semantics for the LinkedList type and its private Node type was quite tricky though. This was mainly due to LinkedList being a struct and Node being a class. Because of this, whenever a copy of a LinkedList value was made, mutating the copied linked list would also mutate the initial variable. For example, under the described circumstances, this would occur:

let list1 = LinkedList([1, 2, 3])
var list2 = list1
list2.append(4) // Mutates both list1 and list2
print(list1)
// Prints "[1, 2, 3, 4]"

This was, for obvious reasons, unintended and required me to put thought into the behaviour and semantics associated with making a copy of a LinkedList and mutation thereof. To combat this, in the only two mutating defined on LinkedList that were made accessible to the user, I checked whether or not the head node of the list was known to be uniquely referenced or not. If it was, the mutation would go under way as normal. But if it wasn't, a function would create a copy of all the nodes in list before mutating them. This would prevent mutating operations on an instance of LinkedList from affecting any other instance. This check before mutation effectively implemented copy-on-write semantics for LinkedList's nodes. With this, the previous example perform as expected:

let list1 = LinkedList([1, 2, 3])
var list2 = list1
list2.append(4) // Nodes are copied 
print(list1)
// Prints "[1, 2, 3]"
print(list2)
// Prints "[1, 2, 3, 4]"

This seemed to provide a fix to the problem of shared references to nodes and their mutation therein quite nicely. Unfortunately, to my chagrin, I then realized that I had not tied up all of my loose ends. This is where I am currently stuck. To this point in time, my custom index type, LinkedList.Index, has been defined as following:

extension LinkedList: Collection {

    //... 

    public struct Index: Comparable {
        fileprivate weak var node: Node?
        fileprivate var offset: Int

        fileprivate init(node: Node?, offset: Int) {
            self.node = node
            self.offset = offset
        }

        public static func ==(lhs: Index, rhs: Index) -> Bool {
            return lhs.offset == rhs.offset
        }

        public static func <(lhs: Index, rhs: Index) -> Bool {
            return lhs.offset < rhs.offset
        }
    }
}

Let me break down some of the decisions I made while constructing this... Firstly, the offset property was put in to allow for comparison with other indices and to provide the ability to check if an index was within the bounds of a list. Secondly, the node property was essential to giving each Index actual meaning and usefulness. This meant that both index(after:) and index(before:) could rely on the node property's next and previous properties to provide their respective desired indices in O(1) time. To me, this in it of itself is seems to be a requirement for an index type of a linked list implementation. Presently, I do not think there is any way to circumvent this requirement of associating each index with its respective node. In testing, I also came across a bug wherein the list's node were being copied too frequently and also not being deallocated by ARC. I realized this was because of a strong reference being held to nodes by Index. To combat that, I made node a weak reference.

Before I state the problem I am having, here is my current code for LinkedList:


public struct LinkedList<Element> {

    private var headNode: Node?
    private var tailNode: Node?
    public private(set) var count: Int = 0

    public init() { }

}

//MARK: - LinkedList Node
extension LinkedList {

    fileprivate class Node {
        public var value: Element
        public var next: Node?
        public weak var previous: Node?

        public init(value: Element) {
            self.value = value
        }
    }

}

//MARK: - Initializers
public extension LinkedList {

    private init(_ nodeChain: NodeChain?) {
        guard let chain = nodeChain else {
            return
        }
        headNode = chain.head
        tailNode = chain.tail
        count = chain.count
    }

    init<S>(_ sequence: S) where S: Sequence, S.Element == Element {
        if let linkedList = sequence as? LinkedList<Element> {
            self = linkedList
        } else {
            self = LinkedList(NodeChain(of: sequence))
        }
    }

}

//MARK: NodeChain
extension LinkedList {
    private struct NodeChain {
        let head: Node!
        let tail: Node!
        private(set) var count: Int

        // Creates a chain of nodes from a sequence. Returns `nil` if the sequence is empty.
        init?<S>(of sequence: S) where S: Sequence, S.Element == Element {
            var iterator = sequence.makeIterator()

            guard let firstValue = iterator.next() else {
                return nil
            }

            var currentNode = Node(value: firstValue)
            head = currentNode
            count = 1

            while let nextElement = iterator.next() {
                let nextNode = Node(value: nextElement)
                currentNode.next = nextNode
                nextNode.previous = currentNode
                currentNode = nextNode
                count += 1
            }
            tail = currentNode
        }

    }
}

//MARK: - Copy Nodes
extension LinkedList {

    private mutating func copyNodes(settingNodeAt index: Index, to value: Element) {

        var currentIndex = startIndex
        var currentNode = Node(value: currentIndex == index ? value : currentIndex.node!.value)
        let newHeadNode = currentNode
        currentIndex = self.index(after: currentIndex)

        while currentIndex < endIndex {
            let nextNode = Node(value: currentIndex == index ? value : currentIndex.node!.value)
            currentNode.next = nextNode
            nextNode.previous = currentNode
            currentNode = nextNode
            currentIndex = self.index(after: currentIndex)
        }
        headNode = newHeadNode
        tailNode = currentNode
    }

    @discardableResult
    private mutating func copyNodes(removing range: Range<Index>) -> Range<Index> {

        var currentIndex = startIndex

        while range.contains(currentIndex) {
            currentIndex = index(after: currentIndex)
        }

        guard let headValue = currentIndex.node?.value else {
            self = LinkedList()
            return endIndex..<endIndex
        }

        var currentNode = Node(value: headValue)
        let newHeadNode = currentNode
        var newCount = 1

        var removedRange: Range<Index> = Index(node: currentNode, offset: 0)..<Index(node: currentNode, offset: 0)
        currentIndex = index(after: currentIndex)

        while currentIndex < endIndex {
            guard !range.contains(currentIndex) else {
                currentIndex = index(after: currentIndex)
                continue
            }

            let nextNode = Node(value: currentIndex.node!.value)
            if currentIndex == range.upperBound {
                removedRange = Index(node: nextNode, offset: newCount)..<Index(node: nextNode, offset: newCount)
            }
            currentNode.next = nextNode
            nextNode.previous = currentNode
            currentNode = nextNode
            newCount += 1
            currentIndex = index(after: currentIndex)

        }
        if currentIndex == range.upperBound {
            removedRange = Index(node: nil, offset: newCount)..<Index(node: nil, offset: newCount)
        }
        headNode = newHeadNode
        tailNode = currentNode
        count = newCount
        return removedRange
    }

}


//MARK: - Computed Properties
public extension LinkedList {
    var head: Element? {
        return headNode?.value
    }

    var tail: Element? {
        return tailNode?.value
    }
}

//MARK: - Sequence Conformance
extension LinkedList: Sequence {

    public typealias Element = Element

    public __consuming func makeIterator() -> Iterator {
        return Iterator(node: headNode)
    }

    public struct Iterator: IteratorProtocol {

        private var currentNode: Node?

        fileprivate init(node: Node?) {
            currentNode = node
        }

        public mutating func next() -> Element? {
            guard let node = currentNode else {
                return nil
            }
            currentNode = node.next
            return node.value
        }

    }
}

//MARK: - Collection Conformance
extension LinkedList: Collection {

    public var startIndex: Index {
        return Index(node: headNode, offset: 0)
    }

    public var endIndex: Index {
        return Index(node: nil, offset: count)
    }

    public var first: Element? {
        return head
    }

    public var isEmpty: Bool {
        return count == 0
    }

    public func index(after i: Index) -> Index {
        precondition(i.offset != endIndex.offset, "LinkedList index is out of bounds")
        return Index(node: i.node?.next, offset: i.offset + 1)
    }

    public struct Index: Comparable {
        fileprivate weak var node: Node?
        fileprivate var offset: Int

        fileprivate init(node: Node?, offset: Int) {
            self.node = node
            self.offset = offset
        }

        public static func ==(lhs: Index, rhs: Index) -> Bool {
            return lhs.offset == rhs.offset
        }

        public static func <(lhs: Index, rhs: Index) -> Bool {
            return lhs.offset < rhs.offset
        }
    }

}


//MARK: - MutableCollection Conformance
extension LinkedList: MutableCollection {

    public subscript(position: Index) -> Element {
        get {
            precondition(position.offset != endIndex.offset, "Index out of range")
            guard let node = position.node else {
                preconditionFailure("LinkedList index is invalid")
            }
            return node.value
        }
        set {
            precondition(position.offset != endIndex.offset, "Index out of range")

            // Copy-on-write semantics for nodes
            if !isKnownUniquelyReferenced(&headNode) {
                copyNodes(settingNodeAt: position, to: newValue)
            } else {
                position.node?.value = newValue
            }
        }
    }
}



//MARK: LinkedList Specific Operations
public extension LinkedList {

    mutating func prepend(_ newElement: Element) {
        replaceSubrange(startIndex..<startIndex, with: CollectionOfOne(newElement))
    }

    mutating func prepend<S>(contentsOf newElements: __owned S) where S: Sequence, S.Element == Element {
        replaceSubrange(startIndex..<startIndex, with: newElements)
    }

    @discardableResult
    mutating func popFirst() -> Element? {
        if isEmpty {
            return nil
        }
        return removeFirst()
    }

    @discardableResult
    mutating func popLast() -> Element? {
        guard isEmpty else {
            return nil
        }
        return removeLast()
    }
}

//MARK: - BidirectionalCollection Conformance
extension LinkedList: BidirectionalCollection {
    public var last: Element? {
        return tail
    }

    public func index(before i: Index) -> Index {
        precondition(i.offset != startIndex.offset, "LinkedList index is out of bounds")
        if i.offset == count {
            return Index(node: tailNode, offset: i.offset - 1)
        }
        return Index(node: i.node?.previous, offset: i.offset - 1)
    }

}

//MARK: - RangeReplaceableCollection Conformance
extension LinkedList: RangeReplaceableCollection {

    public mutating func append<S>(contentsOf newElements: __owned S) where S: Sequence, Element == S.Element {
        replaceSubrange(endIndex..<endIndex, with: newElements)
    }

    public mutating func replaceSubrange<S, R>(_ subrange: R, with newElements: __owned S) where S: Sequence, R: RangeExpression, Element == S.Element, Index == R.Bound {

        var range = subrange.relative(to: indices)

        precondition(range.lowerBound >= startIndex && range.upperBound <= endIndex, "Subrange bounds are out of range")

        // If range covers all elements and the new elements are a LinkedList then set references to it
        if range.lowerBound == startIndex, range.upperBound == endIndex, let linkedList = newElements as? LinkedList {
            self = linkedList
            return
        }

        var newElementsCount = 0

        // There are no new elements, so range indicates deletion
        guard let nodeChain = NodeChain(of: newElements) else {

            // If there is nothing in the removal range
            // This also covers the case that the linked list is empty because this is the only possible range
            guard range.lowerBound != range.upperBound else {
                return
            }

            // Deletion range spans all elements
            if range.lowerBound == startIndex && range.upperBound == endIndex {
                headNode = nil
                tailNode = nil
                count = 0
                return
            }

            // Copy-on-write semantics for nodes and remove elements in range
            guard isKnownUniquelyReferenced(&headNode) else {
                copyNodes(removing: range)
                return
            }

            // Update count after mutation to preserve startIndex and endIndex validity
            defer {
                count = count - (range.upperBound.offset - range.lowerBound.offset)
            }

            // Move head up if deletion starts at start index
            if range.lowerBound == startIndex {
                // Can force unwrap node since the upperBound is not the end index
                headNode = range.upperBound.node!
                headNode!.previous = nil

                // Move tail back if deletion ends at end index
            } else if range.upperBound == endIndex {
                // Can force unwrap since lowerBound index must have an associated element
                tailNode = range.lowerBound.node!.previous
                tailNode!.next = nil

                // Deletion range is in the middle of the linked list
            } else {
                // Can force unwrap all bound nodes since they both must have elements
                range.upperBound.node!.previous = range.lowerBound.node!.previous
                range.lowerBound.node!.previous!.next = range.upperBound.node!
            }

            return
        }

        // Obtain the count of the new elements from the node chain composed from them
        newElementsCount = nodeChain.count

        // Replace entire content of list with new elements
        if range.lowerBound == startIndex && range.upperBound == endIndex {
            headNode = nodeChain.head
            tailNode = nodeChain.tail
            count = nodeChain.count
            return
        }

        // Copy-on-write semantics for nodes before mutation
        if !isKnownUniquelyReferenced(&headNode) {
            range = copyNodes(removing: range)
        }

        // Update count after mutation to preserve startIndex and endIndex validity
        defer {
            count += nodeChain.count - (range.upperBound.offset - range.lowerBound.offset)
        }

        // Prepending new elements
        guard range.upperBound != startIndex else {
            headNode?.previous = nodeChain.tail
            nodeChain.tail.next = headNode
            headNode = nodeChain.head
            return
        }

        // Appending new elements
        guard range.lowerBound != endIndex else {
            tailNode?.next = nodeChain.head
            nodeChain.head.previous = tailNode
            tailNode = nodeChain.tail
            return
        }

        if range.lowerBound == startIndex {
            headNode = nodeChain.head
        }
        if range.upperBound == endIndex {
            tailNode = nodeChain.tail
        }

        range.lowerBound.node!.previous!.next = nodeChain.head
        range.upperBound.node!.previous = nodeChain.tail
    }

}

//MARK: - ExpressibleByArrayLiteral Conformance
extension LinkedList: ExpressibleByArrayLiteral {
    public typealias ArrayLiteralElement = Element

    public init(arrayLiteral elements: ArrayLiteralElement...) {
        self.init(elements)
    }
}

//MARK: - CustomStringConvertible Conformance
extension LinkedList: CustomStringConvertible {
    public var description: String {
        return "[" + lazy.map { "\($0)" }.joined(separator: ", ") + "]"
    }
}

Note: if/when I do update my code, the up-to-date version can be found here.


My Problem

The current problem I am experiencing is with Index instances. When an index is provided either to a method, as is the case currently, the method has no way of knowing and checking whether or not that index/node belongs to that specific LinkedList instance. That allows for bugs such as this:

let immutableList: LinkedList = [1, 2, 3, 4]
var mutableList: LinkedList = [5, 6, 7, 8]

let immutableIndex = immutableList.index(after: immutableList.startIndex)

mutableList[immutableIndex] = 0

print("Immutable List:", immutableList)
print("Mutable List:", mutableList)

// Prints:
//   Immutable List: [1, 0, 3, 4]
//   Mutable List: [5, 6, 7, 8]

All methods that deal with Index must have a way to confirm that the index they are dealing with contains a node owned by the current LinkedList instance, though I have no idea how I would be able to do this.

Furthermore, indices' offset invalidate after mutation to their node's parent list, thus causing absurd situations like this wherein the indices are said to be equal:

var list: LinkedList = [1, 2, 3, 4]
let idx1 = list.index(list.startIndex, offsetBy: 2) // Index to node with value of 3 and offset of 2

list.remove(at: list.index(before: idx1))
print(list)
// Prints: "[1, 3, 4]"

let idx2 = list.index(before: list.endIndex) // Index to node with value of 4 and offset of 2
print(idx1 == idx2) 
// Prints: "true"

print(Array(list[idx1...idx2]))
// Prints: "[3]"

In the previous example, because, upon mutation of a LinkedList, instances of its indices' offsets are not updated, though they still have a weak reference to their associated node, many unintended consequences and incorrect behaviour can arise.

When creating the Index type initially, I was hard pressed to find similar examples to mine online and as a result I am not totally sure whether things like startIndex and endIndex and how index(before:) and index(after:) handle them are the optimal/proper way. I am looking for input on how I can go about fixing all of these problems brought on by LinkedList.Index and properly implement it. Any and all input is appreciated!

Upvotes: 4

Views: 900

Answers (1)

Martin R
Martin R

Reputation: 539975

Let's address the second problem first:

Furthermore, indices' offset invalidate after mutation to their node's parent list, thus causing absurd situations ...

That is to be expected with all collections. From Collections:

Saved indices may become invalid as a result of mutating operations.

Using an invalidated index is undefined behavior, and anything can happen: An unexpected result, a fatal error, ... Here is a simple example for Swift strings:

var s = "a🇩🇪bcd"
let i = s.firstIndex(of: "🇩🇪")!
s.remove(at: s.startIndex) // invalidates `i`
s.remove(at: i)
print(s) // \360cd

Now the first (main?) problem:

... the method has no way of knowing and checking whether or not that index/node belongs to that specific LinkedList instance.

Quoting from Collections again:

You can pass only valid indices to collection operations. You can find a complete set of a collection’s valid indices by starting with the collection’s startIndex property and finding every successor up to, and including, the endIndex property. All other values of the Index type, such as the startIndex property of a different collection, are invalid indices for this collection.

In your case

mutableList[immutableIndex] = 0

immutableIndex is not a valid index for mutableList, so this is again undefined behavior. A user of your library can not expect that this does anything sensible.

A possible way to protect against this misuse could be to store in the LinkedList.Index a (weak) pointer to the head node of the linked list, and verify that owner in the accessor methods (subscripts).

Upvotes: 2

Related Questions