NFC.cool
NFC.cool

Reputation: 3303

Recursive function in Swift crashes when called with a high amount of input

I have comments that are coming from an API and those comments are either a root comment or a child comment. Those comments are coming unordered from that API and I have to sort them on my side.

A comment looks like this:

struct Comment: Codable, Equatable {
    var depth = 0
    var id: Int = 0
    var parent: Int?
    var content: String = ""
    var created: Int = 0
    var up: Int = 0
    var down: Int = 0
    var confidence: Double = 0
    var name: String = ""
    var mark: Int = 0

    enum CodingKeys: String, CodingKey {
        case id = "id"
        case parent = "parent"
        case content = "content"
        case created = "created"
        case up = "up"
        case down = "down"
        case confidence = "confidence"
        case name = "name"
        case mark = "mark"
    }

    init(from decoder: Decoder) throws {
        let values = try decoder.container(keyedBy: CodingKeys.self)
        id = try values.decode(Int.self, forKey: .id)
        parent = try values.decodeIfPresent(Int.self, forKey: .parent)
        content = try values.decode(String.self, forKey: .content)
        created = try values.decode(Int.self, forKey: .created)
        up = try values.decode(Int.self, forKey: .up)
        down = try values.decode(Int.self, forKey: .down)
        confidence = try values.decode(Double.self, forKey: .confidence)
        name = try values.decode(String.self, forKey: .name)
        mark = try values.decode(Int.self, forKey: .mark)
    }

    init(with message: String, name: String, depth: Int) {
        self.depth = depth
        self.up = 1
        self.content = message
        self.name = name
    }
}

So I started to make a tree structure and I put those comments into nodes:

fileprivate class Node<T> {
    var value: T
    weak var parent: Node?
    var children: [Node] = []

    init(value: T) {
        self.value = value
    }

    func add(child: Node) {
        children.append(child)
        child.parent = self
    }
}

extension Node where T == Comment {

    func search(id: Int) -> Node? {
        if id == self.value.id {
            return self
        }
        for child in children {
            if let found = child.search(id: id) {
                return found
            }
        }
        return nil
    }

    var description: String {
        "Id: \(value.id), parent: \(value.parent ?? -1)"
    }
}

extension Node where T: Equatable {
    func search(value: T) -> Node? {
        if value == self.value {
            return self
        }
        for child in children {
            if let found = child.search(value: value) {
                return found
            }
        }
        return nil
    }
}

So the incoming comments are splitted in to two groups like this:

    private func split(_ comments: [Comment]) {
        let parentNodes = comments.filter { $0.parent == 0 }.map { Node(value: $0) }
        let childNodes = comments.filter { $0.parent != 0 }.map { Node(value: $0) }
        self.sortComments(parentNodes: parentNodes, childNodes: childNodes)
    }

And then I sort them comments with this recursive function:

    private func sortComments(parentNodes: [Node<Comment>], childNodes: [Node<Comment>]) {

        let parentNodes = parentNodes
        var childNodes = childNodes

        if let firstChild = childNodes.first {

            let parentId = firstChild.value.parent!
            if let parentNode = parentNodes.first(where: { $0.value.id == parentId }) {
                firstChild.value.depth = parentNode.value.depth + 1
                parentNode.add(child: firstChild)
                childNodes.removeFirst()
                self.sortComments(parentNodes: parentNodes, childNodes: childNodes)
            } else {
                //Comment is child of child
                //Search children for parent
                //Search also parentNodes, they may have a child already that is the parent
                //of the current child we are looking at

                parentNodes.forEach {
                    if let foundNode = $0.search(id: parentId) {
                        firstChild.value.depth = foundNode.value.depth + 1
                        foundNode.add(child: firstChild)
                        childNodes.removeFirst()
                        self.sortComments(parentNodes: parentNodes, childNodes: childNodes)
                    }
                }

                childNodes.forEach {
                    if let foundNode = $0.search(id: parentId) {
                        firstChild.value.depth = foundNode.value.depth + 1
                        foundNode.add(child: firstChild)
                        childNodes.removeFirst()
                        self.sortComments(parentNodes: parentNodes, childNodes: childNodes)
                    }
                }
            }
        } else {
            let sortedNodes = parentNodes.sorted { $0.value.confidence > $1.value.confidence }
            self.convertCommentNodesToArray(nodes: sortedNodes, currentArray: [])
        }
    }

And finally I convert the nodes to an array so that my TableView can display the data:

    private func convertCommentNodesToArray(nodes: [Node<Comment>], currentArray: [Comment]) {
        var nodes = nodes
        var commentsArray = currentArray

        if let firstNode = nodes.first {
            commentsArray.append(firstNode.value)

            if firstNode.children.count > 0 {
                let remainingNodes = nodes.dropFirst()
                let sortedChildren = firstNode.children.sorted { $0.value.confidence > $1.value.confidence }
                convertCommentNodesToArray(nodes: sortedChildren + remainingNodes, currentArray: commentsArray)
            } else {
                nodes.removeFirst()
                convertCommentNodesToArray(nodes: nodes, currentArray: commentsArray)
            }
        } else {
            self.comments.value = commentsArray
        }
    }

This code works pretty fine with a fairly large amount of comments. But at some point when I'm facing thousands of comments this function crashes because of a stackoverflow crash.

So my question is: How can I make this function better in performance so that it doesn't crash? I'm happy to hear your suggestions :)

Upvotes: 2

Views: 728

Answers (1)

Enricoza
Enricoza

Reputation: 1142

There are several points in your last function that could use some optimizations in regard to a Stack Overflow crash:

Iterative vs Recursive Algorithms

Before even mentioning memory optimization, just looking at those two kinds of algorithms:

  • An iterative approach may look uglier and harder to read, but it imposes almost no limitation on the memory size (you basically have has much memory as the device offers you).
  • A recursive approach, instead, is often cleaner but is limited by the stack size (which is always much smaller than the whole memory size).

Before you dive into an amazing recursive approach you should really pay attention to the size of the input you are going to handle. If it's gonna handle more than some thousand cycles then you should really stick with an iterative approach.

Despite every memory optimizations you may use, if you call a recursive function too many times you'll get a stack overflow. So this is the first point to take in consideration.


Value types vs Reference types

But the amount of cycles is not the only thing that can cause a stack overflow. It can also happen if you put too much variables in the stack. That can be resolved if you think about what is stored where.

Value types will always be allocated in the stack, while reference types generally go in the heap. (This is not always true due to Stack Promotion but I guess you can ignore it for now).

So, for starters, you could think to implement your Comment struct as a class instead. Or at least you could try to avoid duplicating the whole array on each cycle by assigning it back to another local variable that (once again) is going in the stack.

var commentsArray = currentArray

With the above line, for example, you are not only duplicating the array (which is a valueType) but you are also duplicating every single comment in the array (since Comments are value types). And you are duplicating them on each cycle, making the size of the stack grow exponentially.

To avoid duplicating the comments, you can simply use a class instead (this way you'll only duplicate their reference), but if you want to avoid duplicating the array too you can just pass it as an inout parameter.

The same could be said about the node array (but at least Node is a class).


Finally, here is what I would do in your case (but all the above may apply to other cases):

  • Make Comment a class
  • Use an iterative algorithm (that doesn't even need to pass the array around, so no need to bother with inout parameters)

The simplest implementation of an iterative algorithm in your case would be something along the lines of:

    private func convertCommentNodesToArray(nodes: [Node<Comment>]) {
        var nodes = nodes
        var commentsArray = [Comment]()
        while(!nodes.isEmpty) {
           // get the first node
           // insert all children of that node to the nodes array in the position you prefer
           // then take that node comment and add it to the array
           // finally remove that node from the array
        }
        self.comments.value = commentsArray
    }

Upvotes: 4

Related Questions