Reputation: 1574
I recently implemented a doubly linked list type in Swift. When I set out to make it, my goal was to give users most of the same ease of use as using an Array
, but with the algorithmic complexities associated with doubly linked lists. With this goal in mind, I decided one of the primary ways I would achieve this would be making the Node
type an implementation detail; out of sight and out of mind for the user. I also decided that LinkedList
must be implemented as a struct
so to provide support proper immutability/mutability.
Deciding on the semantics for the LinkedList
type and its private Node
type was quite tricky though. This was mainly due to LinkedList
being a struct
and Node
being a class
. Because of this, whenever a copy of a LinkedList
value was made, mutating the copied linked list would also mutate the initial variable. For example, under the described circumstances, this would occur:
let list1 = LinkedList([1, 2, 3])
var list2 = list1
list2.append(4) // Mutates both list1 and list2
print(list1)
// Prints "[1, 2, 3, 4]"
This was, for obvious reasons, unintended and required me to put thought into the behaviour and semantics associated with making a copy of a LinkedList
and mutation thereof. To combat this, in the only two mutating defined on LinkedList
that were made accessible to the user, I checked whether or not the head node of the list was known to be uniquely referenced or not. If it was, the mutation would go under way as normal. But if it wasn't, a function would create a copy of all the nodes in list before mutating them. This would prevent mutating operations on an instance of LinkedList
from affecting any other instance. This check before mutation effectively implemented copy-on-write semantics for LinkedList
's nodes. With this, the previous example perform as expected:
let list1 = LinkedList([1, 2, 3])
var list2 = list1
list2.append(4) // Nodes are copied
print(list1)
// Prints "[1, 2, 3]"
print(list2)
// Prints "[1, 2, 3, 4]"
This seemed to provide a fix to the problem of shared references to nodes and their mutation therein quite nicely. Unfortunately, to my chagrin, I then realized that I had not tied up all of my loose ends. This is where I am currently stuck. To this point in time, my custom index type, LinkedList.Index
, has been defined as following:
extension LinkedList: Collection {
//...
public struct Index: Comparable {
fileprivate weak var node: Node?
fileprivate var offset: Int
fileprivate init(node: Node?, offset: Int) {
self.node = node
self.offset = offset
}
public static func ==(lhs: Index, rhs: Index) -> Bool {
return lhs.offset == rhs.offset
}
public static func <(lhs: Index, rhs: Index) -> Bool {
return lhs.offset < rhs.offset
}
}
}
Let me break down some of the decisions I made while constructing this... Firstly, the offset
property was put in to allow for comparison with other indices and to provide the ability to check if an index was within the bounds of a list. Secondly, the node
property was essential to giving each Index
actual meaning and usefulness. This meant that both index(after:)
and index(before:)
could rely on the node
property's next
and previous
properties to provide their respective desired indices in O(1) time. To me, this in it of itself is seems to be a requirement for an index type of a linked list implementation. Presently, I do not think there is any way to circumvent this requirement of associating each index with its respective node. In testing, I also came across a bug wherein the list's node were being copied too frequently and also not being deallocated by ARC. I realized this was because of a strong reference being held to nodes by Index
. To combat that, I made node
a weak reference.
Before I state the problem I am having, here is my current code for LinkedList
:
public struct LinkedList<Element> {
private var headNode: Node?
private var tailNode: Node?
public private(set) var count: Int = 0
public init() { }
}
//MARK: - LinkedList Node
extension LinkedList {
fileprivate class Node {
public var value: Element
public var next: Node?
public weak var previous: Node?
public init(value: Element) {
self.value = value
}
}
}
//MARK: - Initializers
public extension LinkedList {
private init(_ nodeChain: NodeChain?) {
guard let chain = nodeChain else {
return
}
headNode = chain.head
tailNode = chain.tail
count = chain.count
}
init<S>(_ sequence: S) where S: Sequence, S.Element == Element {
if let linkedList = sequence as? LinkedList<Element> {
self = linkedList
} else {
self = LinkedList(NodeChain(of: sequence))
}
}
}
//MARK: NodeChain
extension LinkedList {
private struct NodeChain {
let head: Node!
let tail: Node!
private(set) var count: Int
// Creates a chain of nodes from a sequence. Returns `nil` if the sequence is empty.
init?<S>(of sequence: S) where S: Sequence, S.Element == Element {
var iterator = sequence.makeIterator()
guard let firstValue = iterator.next() else {
return nil
}
var currentNode = Node(value: firstValue)
head = currentNode
count = 1
while let nextElement = iterator.next() {
let nextNode = Node(value: nextElement)
currentNode.next = nextNode
nextNode.previous = currentNode
currentNode = nextNode
count += 1
}
tail = currentNode
}
}
}
//MARK: - Copy Nodes
extension LinkedList {
private mutating func copyNodes(settingNodeAt index: Index, to value: Element) {
var currentIndex = startIndex
var currentNode = Node(value: currentIndex == index ? value : currentIndex.node!.value)
let newHeadNode = currentNode
currentIndex = self.index(after: currentIndex)
while currentIndex < endIndex {
let nextNode = Node(value: currentIndex == index ? value : currentIndex.node!.value)
currentNode.next = nextNode
nextNode.previous = currentNode
currentNode = nextNode
currentIndex = self.index(after: currentIndex)
}
headNode = newHeadNode
tailNode = currentNode
}
@discardableResult
private mutating func copyNodes(removing range: Range<Index>) -> Range<Index> {
var currentIndex = startIndex
while range.contains(currentIndex) {
currentIndex = index(after: currentIndex)
}
guard let headValue = currentIndex.node?.value else {
self = LinkedList()
return endIndex..<endIndex
}
var currentNode = Node(value: headValue)
let newHeadNode = currentNode
var newCount = 1
var removedRange: Range<Index> = Index(node: currentNode, offset: 0)..<Index(node: currentNode, offset: 0)
currentIndex = index(after: currentIndex)
while currentIndex < endIndex {
guard !range.contains(currentIndex) else {
currentIndex = index(after: currentIndex)
continue
}
let nextNode = Node(value: currentIndex.node!.value)
if currentIndex == range.upperBound {
removedRange = Index(node: nextNode, offset: newCount)..<Index(node: nextNode, offset: newCount)
}
currentNode.next = nextNode
nextNode.previous = currentNode
currentNode = nextNode
newCount += 1
currentIndex = index(after: currentIndex)
}
if currentIndex == range.upperBound {
removedRange = Index(node: nil, offset: newCount)..<Index(node: nil, offset: newCount)
}
headNode = newHeadNode
tailNode = currentNode
count = newCount
return removedRange
}
}
//MARK: - Computed Properties
public extension LinkedList {
var head: Element? {
return headNode?.value
}
var tail: Element? {
return tailNode?.value
}
}
//MARK: - Sequence Conformance
extension LinkedList: Sequence {
public typealias Element = Element
public __consuming func makeIterator() -> Iterator {
return Iterator(node: headNode)
}
public struct Iterator: IteratorProtocol {
private var currentNode: Node?
fileprivate init(node: Node?) {
currentNode = node
}
public mutating func next() -> Element? {
guard let node = currentNode else {
return nil
}
currentNode = node.next
return node.value
}
}
}
//MARK: - Collection Conformance
extension LinkedList: Collection {
public var startIndex: Index {
return Index(node: headNode, offset: 0)
}
public var endIndex: Index {
return Index(node: nil, offset: count)
}
public var first: Element? {
return head
}
public var isEmpty: Bool {
return count == 0
}
public func index(after i: Index) -> Index {
precondition(i.offset != endIndex.offset, "LinkedList index is out of bounds")
return Index(node: i.node?.next, offset: i.offset + 1)
}
public struct Index: Comparable {
fileprivate weak var node: Node?
fileprivate var offset: Int
fileprivate init(node: Node?, offset: Int) {
self.node = node
self.offset = offset
}
public static func ==(lhs: Index, rhs: Index) -> Bool {
return lhs.offset == rhs.offset
}
public static func <(lhs: Index, rhs: Index) -> Bool {
return lhs.offset < rhs.offset
}
}
}
//MARK: - MutableCollection Conformance
extension LinkedList: MutableCollection {
public subscript(position: Index) -> Element {
get {
precondition(position.offset != endIndex.offset, "Index out of range")
guard let node = position.node else {
preconditionFailure("LinkedList index is invalid")
}
return node.value
}
set {
precondition(position.offset != endIndex.offset, "Index out of range")
// Copy-on-write semantics for nodes
if !isKnownUniquelyReferenced(&headNode) {
copyNodes(settingNodeAt: position, to: newValue)
} else {
position.node?.value = newValue
}
}
}
}
//MARK: LinkedList Specific Operations
public extension LinkedList {
mutating func prepend(_ newElement: Element) {
replaceSubrange(startIndex..<startIndex, with: CollectionOfOne(newElement))
}
mutating func prepend<S>(contentsOf newElements: __owned S) where S: Sequence, S.Element == Element {
replaceSubrange(startIndex..<startIndex, with: newElements)
}
@discardableResult
mutating func popFirst() -> Element? {
if isEmpty {
return nil
}
return removeFirst()
}
@discardableResult
mutating func popLast() -> Element? {
guard isEmpty else {
return nil
}
return removeLast()
}
}
//MARK: - BidirectionalCollection Conformance
extension LinkedList: BidirectionalCollection {
public var last: Element? {
return tail
}
public func index(before i: Index) -> Index {
precondition(i.offset != startIndex.offset, "LinkedList index is out of bounds")
if i.offset == count {
return Index(node: tailNode, offset: i.offset - 1)
}
return Index(node: i.node?.previous, offset: i.offset - 1)
}
}
//MARK: - RangeReplaceableCollection Conformance
extension LinkedList: RangeReplaceableCollection {
public mutating func append<S>(contentsOf newElements: __owned S) where S: Sequence, Element == S.Element {
replaceSubrange(endIndex..<endIndex, with: newElements)
}
public mutating func replaceSubrange<S, R>(_ subrange: R, with newElements: __owned S) where S: Sequence, R: RangeExpression, Element == S.Element, Index == R.Bound {
var range = subrange.relative(to: indices)
precondition(range.lowerBound >= startIndex && range.upperBound <= endIndex, "Subrange bounds are out of range")
// If range covers all elements and the new elements are a LinkedList then set references to it
if range.lowerBound == startIndex, range.upperBound == endIndex, let linkedList = newElements as? LinkedList {
self = linkedList
return
}
var newElementsCount = 0
// There are no new elements, so range indicates deletion
guard let nodeChain = NodeChain(of: newElements) else {
// If there is nothing in the removal range
// This also covers the case that the linked list is empty because this is the only possible range
guard range.lowerBound != range.upperBound else {
return
}
// Deletion range spans all elements
if range.lowerBound == startIndex && range.upperBound == endIndex {
headNode = nil
tailNode = nil
count = 0
return
}
// Copy-on-write semantics for nodes and remove elements in range
guard isKnownUniquelyReferenced(&headNode) else {
copyNodes(removing: range)
return
}
// Update count after mutation to preserve startIndex and endIndex validity
defer {
count = count - (range.upperBound.offset - range.lowerBound.offset)
}
// Move head up if deletion starts at start index
if range.lowerBound == startIndex {
// Can force unwrap node since the upperBound is not the end index
headNode = range.upperBound.node!
headNode!.previous = nil
// Move tail back if deletion ends at end index
} else if range.upperBound == endIndex {
// Can force unwrap since lowerBound index must have an associated element
tailNode = range.lowerBound.node!.previous
tailNode!.next = nil
// Deletion range is in the middle of the linked list
} else {
// Can force unwrap all bound nodes since they both must have elements
range.upperBound.node!.previous = range.lowerBound.node!.previous
range.lowerBound.node!.previous!.next = range.upperBound.node!
}
return
}
// Obtain the count of the new elements from the node chain composed from them
newElementsCount = nodeChain.count
// Replace entire content of list with new elements
if range.lowerBound == startIndex && range.upperBound == endIndex {
headNode = nodeChain.head
tailNode = nodeChain.tail
count = nodeChain.count
return
}
// Copy-on-write semantics for nodes before mutation
if !isKnownUniquelyReferenced(&headNode) {
range = copyNodes(removing: range)
}
// Update count after mutation to preserve startIndex and endIndex validity
defer {
count += nodeChain.count - (range.upperBound.offset - range.lowerBound.offset)
}
// Prepending new elements
guard range.upperBound != startIndex else {
headNode?.previous = nodeChain.tail
nodeChain.tail.next = headNode
headNode = nodeChain.head
return
}
// Appending new elements
guard range.lowerBound != endIndex else {
tailNode?.next = nodeChain.head
nodeChain.head.previous = tailNode
tailNode = nodeChain.tail
return
}
if range.lowerBound == startIndex {
headNode = nodeChain.head
}
if range.upperBound == endIndex {
tailNode = nodeChain.tail
}
range.lowerBound.node!.previous!.next = nodeChain.head
range.upperBound.node!.previous = nodeChain.tail
}
}
//MARK: - ExpressibleByArrayLiteral Conformance
extension LinkedList: ExpressibleByArrayLiteral {
public typealias ArrayLiteralElement = Element
public init(arrayLiteral elements: ArrayLiteralElement...) {
self.init(elements)
}
}
//MARK: - CustomStringConvertible Conformance
extension LinkedList: CustomStringConvertible {
public var description: String {
return "[" + lazy.map { "\($0)" }.joined(separator: ", ") + "]"
}
}
Note: if/when I do update my code, the up-to-date version can be found here.
The current problem I am experiencing is with Index
instances. When an index is provided either to a method, as is the case currently, the method has no way of knowing and checking whether or not that index/node belongs to that specific LinkedList
instance. That allows for bugs such as this:
let immutableList: LinkedList = [1, 2, 3, 4]
var mutableList: LinkedList = [5, 6, 7, 8]
let immutableIndex = immutableList.index(after: immutableList.startIndex)
mutableList[immutableIndex] = 0
print("Immutable List:", immutableList)
print("Mutable List:", mutableList)
// Prints:
// Immutable List: [1, 0, 3, 4]
// Mutable List: [5, 6, 7, 8]
All methods that deal with Index
must have a way to confirm that the index they are dealing with contains a node owned by the current LinkedList
instance, though I have no idea how I would be able to do this.
Furthermore, indices' offset invalidate after mutation to their node's parent list, thus causing absurd situations like this wherein the indices are said to be equal:
var list: LinkedList = [1, 2, 3, 4]
let idx1 = list.index(list.startIndex, offsetBy: 2) // Index to node with value of 3 and offset of 2
list.remove(at: list.index(before: idx1))
print(list)
// Prints: "[1, 3, 4]"
let idx2 = list.index(before: list.endIndex) // Index to node with value of 4 and offset of 2
print(idx1 == idx2)
// Prints: "true"
print(Array(list[idx1...idx2]))
// Prints: "[3]"
In the previous example, because, upon mutation of a LinkedList
, instances of its indices' offsets are not updated, though they still have a weak reference to their associated node, many unintended consequences and incorrect behaviour can arise.
When creating the Index
type initially, I was hard pressed to find similar examples to mine online and as a result I am not totally sure whether things like startIndex
and endIndex
and how index(before:)
and index(after:)
handle them are the optimal/proper way. I am looking for input on how I can go about fixing all of these problems brought on by LinkedList.Index
and properly implement it. Any and all input is appreciated!
Upvotes: 4
Views: 900
Reputation: 539975
Let's address the second problem first:
Furthermore, indices' offset invalidate after mutation to their node's parent list, thus causing absurd situations ...
That is to be expected with all collections. From Collections:
Saved indices may become invalid as a result of mutating operations.
Using an invalidated index is undefined behavior, and anything can happen: An unexpected result, a fatal error, ... Here is a simple example for Swift strings:
var s = "a🇩🇪bcd"
let i = s.firstIndex(of: "🇩🇪")!
s.remove(at: s.startIndex) // invalidates `i`
s.remove(at: i)
print(s) // \360cd
Now the first (main?) problem:
... the method has no way of knowing and checking whether or not that index/node belongs to that specific LinkedList instance.
Quoting from Collections again:
You can pass only valid indices to collection operations. You can find a complete set of a collection’s valid indices by starting with the collection’s startIndex property and finding every successor up to, and including, the endIndex property. All other values of the Index type, such as the startIndex property of a different collection, are invalid indices for this collection.
In your case
mutableList[immutableIndex] = 0
immutableIndex
is not a valid index for mutableList
, so this is again undefined behavior. A user of your library can not expect that this does anything sensible.
A possible way to protect against this misuse could be to store in the LinkedList.Index
a (weak) pointer to the head node of the linked list, and verify that owner in the accessor methods (subscripts).
Upvotes: 2