Saman Ray

Reputation: 115

Most efficient way to count the largest value in Swift

I have written the following program, which calculates the largest value (counter) returned by the function collatzSequence(value: Int). However, whenever I run it through countLongestCollatzSequence(), Xcode always crashes. What can I do to make this code better (or run at all)?

Thank you.

func collatzSequence(value: Int) -> Int {
    var currentNumber = value
    var counter = 0
    while currentNumber != 1 {
        if currentNumber % 2 == 0 {
            currentNumber /= 2
            counter += 1
        } else {
            // 3n + 1 is always even for odd n, so take both steps at once
            currentNumber = (currentNumber * 3 + 1) / 2
            counter += 2
        }
    }
    return counter
}


func countLongestCollatzSequence() {
    var currentValue: Int = 0
    var largestValue: Int = 0
    for x in (50000...1000000).reversed() {
        currentValue = collatzSequence(value: x)
        if largestValue < currentValue {
            largestValue = currentValue
        }
    }
}

countLongestCollatzSequence()

PS: For clarity, the first function computes the length of the sequence for a given starting value, and the second function finds out which starting value produces the longest sequence.
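For reference, as written countLongestCollatzSequence() computes largestValue but never reports it. A minimal variant that returns the result over a small range looks like this (the function name and range parameter are my additions, purely for illustration):

```swift
// Same step-counting logic as in the question: the odd case uses the
// shortcut that 3n + 1 is always even, advancing the counter by 2.
func collatzSequence(value: Int) -> Int {
    var currentNumber = value
    var counter = 0
    while currentNumber != 1 {
        if currentNumber % 2 == 0 {
            currentNumber /= 2
            counter += 1
        } else {
            currentNumber = (currentNumber * 3 + 1) / 2
            counter += 2
        }
    }
    return counter
}

// Variant that returns the longest length instead of discarding it.
func longestCollatzLength(in range: ClosedRange<Int>) -> Int {
    var largest = 0
    for x in range {
        largest = max(largest, collatzSequence(value: x))
    }
    return largest
}

print(longestCollatzLength(in: 2...10)) // 9 takes 19 steps, the longest in this range
```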

Upvotes: 2

Views: 158

Answers (2)

Rob

Reputation: 437872

You're doing this in a playground, which has a lot of overhead (e.g., keeping track of values as you go along) and, for reasons that aren't entirely obvious, causes serious problems for the playground.

If you put this code in a stand-alone app, it runs fairly quickly. And if you run it as a release build (i.e., with optimizations turned on), it runs even more quickly (0.3 seconds on my MacBook Pro).

Bottom line: playgrounds aren't well suited for maximal efficiency, so if you want to run this most efficiently, just put the code in a stand-alone app and I think you'll be satisfied with the performance of your existing algorithm.
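To see the difference yourself, a rough wall-clock timing sketch like the one below can be compiled with `swiftc -O` so optimizations match a release build (the helper name and the smaller range are my assumptions, so it finishes quickly on any machine):

```swift
import Foundation

// Same step counting as the question's algorithm, under a name of my choosing.
func collatzSteps(_ value: Int) -> Int {
    var n = value, steps = 0
    while n != 1 {
        if n % 2 == 0 {
            n /= 2
            steps += 1
        } else {
            n = (3 * n + 1) / 2
            steps += 2
        }
    }
    return steps
}

let start = Date()
var longest = 0
for x in 2...10_000 {  // smaller range than the question's, for a quick run
    longest = max(longest, collatzSteps(x))
}
let elapsed = Date().timeIntervalSince(start)
print("longest:", longest, "in", elapsed, "seconds")
```

Comparing the printed time between a debug build, a release build, and a playground makes the overhead concrete.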


Re Alexander's memoized approach, this is discussed in the WWDC 2014 video Advanced Swift.

Upvotes: 1

Alexander

Reputation: 63321

As Rob mentioned, you'll want to compile this into a standalone program with compiler optimizations enabled. You should never benchmark in the playground.

I took the liberty of writing a memoized version of this algorithm, and it's really damn fast.

Firstly, it's important to be able to generate sequences lazily, rather than always being stuck having to compute an entire sequence. Consider the case of n = 8. It produces the sequence [8, 4, 2, 1], which has length 4. Suppose I wanted to calculate the sequence for n = 5 right after. I would get [5, 16, 8, 4, 2, 1], which has length 6. Notice that the last 4 elements are the same. As it turns out, if I were generating this sequence piece by piece, I could produce [5, 16], reach 8, and then know that I can stop expanding the sequence. Instead I can just return the number of elements I have so far (2) plus the length of the sequence for n = 8 (4). This allows me to short-circuit the vast majority of the calculation.

Here is my implementation of CollatzSequence:

struct CollatzSequence: Sequence {
    let n: Int

    init(_ n: Int) { self.n = n }

    public func makeIterator() -> CollatzIterator {
        return CollatzIterator(n)
    }
}

struct CollatzIterator: IteratorProtocol {
    var n: Int
    var done = false

    init(_ n: Int) { self.n = n }

    mutating func next() -> Int? {
        guard n > 1 else {
            guard !done else { return nil }
            done = true;
            return 1
        }

        defer {
            n = (n % 2 == 0)
                ? n/2 // even case
                : 3*n + 1 // odd case
        }

        return n
    }
}
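As a quick sanity check against the walk-through above, the sequence type can be exercised directly (the two structs are reproduced here so the snippet stands alone):

```swift
struct CollatzSequence: Sequence {
    let n: Int

    init(_ n: Int) { self.n = n }

    func makeIterator() -> CollatzIterator {
        return CollatzIterator(n)
    }
}

struct CollatzIterator: IteratorProtocol {
    var n: Int
    var done = false

    init(_ n: Int) { self.n = n }

    mutating func next() -> Int? {
        guard n > 1 else {
            guard !done else { return nil }
            done = true
            return 1
        }
        // Advance to the next term after yielding the current one.
        defer { n = (n % 2 == 0) ? n / 2 : 3 * n + 1 }
        return n
    }
}

print(Array(CollatzSequence(5))) // [5, 16, 8, 4, 2, 1]
print(Array(CollatzSequence(8))) // [8, 4, 2, 1]
```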

Next I made a calculator that computes the length of any given CollatzSequence. Importantly, it keeps a history of inputs and their lengths in a dictionary I called memoizedCounts. This dictionary is initialized with the base case: n = 1 produces the sequence [1], which has length 1.

struct CollatzSequenceLengthCalculator {
    var memoizedCounts = [1: 1]

    var previousElements = [Int]() // shared instance to save cost of reallocation

    mutating func lengthOfSequence(startingAt n: Int) -> Int {
//      print("====== n: \(n)")
        if let existingCount = memoizedCounts[n] {
//          print("Cache hit: counts[\(n)] is \(existingCount)")
            return existingCount
        }

        previousElements.removeAll(keepingCapacity: true)

        for i in CollatzSequence(n) {
            guard let existingCount = memoizedCounts[i] else {
//              print("No existing count for \(i)")
                previousElements.append(i)
                continue
            }

//          print("Cache hit: counts[\(i)] is \(existingCount).")
//          print("Going back to fill in these previous cache misses: \(previousElements)")

            for (offset, element) in previousElements.reversed().enumerated() {
                let newCount = offset + existingCount + 1
//              print("\tSetting counts[\(element)] to \(newCount)")
                memoizedCounts[element] = newCount
            }

            return existingCount + previousElements.count
        }

        fatalError("This should never happen")
    }
}
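The same caching idea can be sketched more compactly with recursion (a simplified variant of the answer's approach, not a drop-in replacement: it fills the cache one call at a time, and deep recursion could overflow the stack for very large inputs):

```swift
// length(n) = number of elements in the Collatz sequence starting at n,
// with results cached in a dictionary keyed by starting value.
var cache: [Int: Int] = [1: 1] // base case: the sequence [1] has length 1

func length(_ n: Int) -> Int {
    if let cached = cache[n] { return cached }
    let next = n % 2 == 0 ? n / 2 : 3 * n + 1
    let result = length(next) + 1
    cache[n] = result
    return result
}

print(length(8)) // [8, 4, 2, 1] → 4
print(length(5)) // [5, 16, 8, 4, 2, 1] → 6, reusing the cached length of 8
```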

I also made a benchmark to compare how many sequence elements were computed in this optimized algorithm vs. the naive approach:

extension CollatzSequenceLengthCalculator {
    mutating func debug_lengthOfSequence(startingAt n: Int)
        -> (result: Int, numIterations: Int) {
//      print("====== n: \(n)")
        if let existingCount = memoizedCounts[n] { // redundant here, to speed up the common base case
//          print("Cache hit: counts[\(n)] is \(existingCount)")
            return (result: existingCount, numIterations: 0)
        }

        previousElements.removeAll(keepingCapacity: true)

        for i in CollatzSequence(n) {
            guard let existingCount = memoizedCounts[i] else {
//              print("No existing count for \(i)")
                previousElements.append(i)
                continue
            }

//          print("Cache hit: counts[\(i)] is \(existingCount).")
//          print("Going back to fill in these previous cache misses: \(previousElements)")

            for (offset, element) in previousElements.reversed().enumerated() {
                let newCount = offset + existingCount + 1
//              print("\tSetting counts[\(element)] to \(newCount)")
                memoizedCounts[element] = newCount
            }

            return (result: existingCount + previousElements.count, numIterations: previousElements.count)
        }

        fatalError("This should never happen")
    }
}

let input = 1...10000
var debugCalc = CollatzSequenceLengthCalculator()
let debug: [(input: Int, result: Int, numIterations: Int)] = input.map {
    let (result, numIterations) = debugCalc.debug_lengthOfSequence(startingAt: $0)
    return (input: $0, result: result, numIterations: numIterations)
}

//debug.forEach{ print($0) }

print("Total unoptimized iterations: \( debug.map{ $0.result }.reduce(0, +))")
print("Total optimized iterations: \( debug.map{ $0.numIterations }.reduce(0, +))")

I found that, calculating all sequences for n in 1...100_000_000, the naive algorithm would have to compute 18_023_493_583 sequence elements, whereas my optimized algorithm needed only 217_289_746. That's a decrease of 98.8%!

Upvotes: 1
