King Kira

Reputation: 61

Haskell length runtime: O(1) or O(n)?

I was working on a Haskell assignment and trying to think of ways to make my code faster. For example, my factors function below finds the number of divisors of some integer.

factors :: Int -> Int
factors x = length [n | n <- [1..x], mod x n == 0]

However, it occurred to me that I could make my code faster by avoiding the use of "length".

factors :: Int -> Int
factors x = go 1
  where
    go :: Int -> Int
    go i
      | i == x        = 1
      | mod x i == 0  = 1 + go (i + 1)
      | otherwise     = go (i + 1)

I was wondering if Haskell's length function is O(n) like strlen() in C or O(1) like String.length() in Java.

Also, is there a better or more efficient way of writing my code?

Upvotes: 4

Views: 3465

Answers (4)

jberryman

Reputation: 16645

In my estimation, contrary to the accepted answer, you can in fact infer the complexity of length (and many other functions) just by looking at the definition of [a]:

Prelude> :info []
data [] a = [] | a : [a]    -- Defined in ‘GHC.Types’

Lists are inductively defined; you can see from that definition (which is almost just regular Haskell) that at the top level a list is either the constructor [] or (:). Clearly length must recurse n times over this structure and so has to be O(n).

It's very important to be able to reason at least intuitively in this way, in particular about lists, which are ubiquitous. E.g., quick: what's the complexity of (!!)?
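Concretely, here are hand-rolled versions of length and (!!) that follow that definition directly (illustrative only; GHC's real implementations are more optimized but share the same asymptotics):

-- Walks the spine one (:) constructor at a time: O(n) by construction.
len :: [a] -> Int
len []       = 0
len (_ : xs) = 1 + len xs

-- Indexing has the same recursive shape, so xs !! k costs O(k):
at :: [a] -> Int -> a
at (x : _)  0 = x
at (_ : xs) k = at xs (k - 1)
at []       _ = error "index out of range"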

If you want to do a deep dive into formally reasoning about time complexity in the presence of laziness, then you'll need to pick up "Purely Functional Data Structures" by Okasaki.

Upvotes: 4

willeM_ Van Onsem

Reputation: 476659

From a theoretical perspective, we cannot know whether length is Θ(n); we know that it is O(n), but it is technically possible that Haskell implements it faster for known lists.

A Haskell compiler is free to implement lists however it wants. Nevertheless, it does not matter here: in that case, generating the list in the first place would still take Θ(n).

Note that even if the compiler used a more dedicated data structure, Haskell is lazy, so your list comprehension does not result in a complete list, but rather in a computation that can generate the list lazily.
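You can see this laziness at work: a consumer that needs only part of the list forces only that part. For example (firstDivisorAbove1 is a name made up here for illustration):

-- head forces just enough of the comprehension to produce one element,
-- so for any even x this does O(1) work rather than O(x).
firstDivisorAbove1 :: Int -> Int
firstDivisorAbove1 x = head [n | n <- [2 .. x], mod x n == 0]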

Finally, if we were to evaluate the list comprehension eagerly, it would again require O(n) to generate the list in the first place. So even if obtaining the length were very fast, generating the list would still require O(n) as a lower bound. Regardless of the efficiency of length, the algorithm will scale linearly with the input.

Your own implementation again takes O(n) (and, to be honest, is not very safe). Nevertheless, you can easily speed up counting the divisors to O(sqrt n):

factors :: Int -> Int
factors x = go 1
  where
    go :: Int -> Int
    go i
      | i2 > x        = 0                -- past sqrt x: nothing left to count
      | i2 == x       = 1                -- i == sqrt x: i is its own co-factor
      | mod x i == 0  = 2 + go (i + 1)   -- count both i and its co-factor div x i
      | otherwise     = go (i + 1)
      where
        i2 = i * i

Here we enumerate i from 1 up to sqrt(x). Each time we find a factor a, we know that there is a co-factor b = x/a. As long as a is not equal to sqrt(x), the two are different. If a is equal to sqrt(x), then a equals b, so we count it only once.
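A quick sanity check in GHCi:

Prelude> factors 12   -- divisors: 1, 2, 3, 4, 6, 12
6
Prelude> factors 16   -- divisors: 1, 2, 4, 8, 16
5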

That being said, there are definitely faster ways to do it. This is a well-researched topic that has yielded more efficient algorithms. I'm not suggesting that the above is the fastest, but it is definitely a huge improvement in terms of time complexity.

Upvotes: 0

chi

Reputation: 116139

Also, is there a better or more efficient way of writing my code?

Integer factorization is one of the most famous problems. A lot of algorithms have surely been proposed for it, even if I am not expert enough to make a recommendation (CS.SE is around the corner, and can help with that, if needed). None of those proposals runs in polynomial time, but that doesn't stop them from being faster than the trivial approach.

Even without looking at the literature, a few simple optimizations can be found.

  • The original code scans the whole list [1..x], but this is not needed. We could stop at sqrt x, since every divisor larger than sqrt x is paired with one below it.

  • Even more: after we find a divisor m, we could divide x by m (as many times as possible) and recurse on this new number. E.g. if x = 1000, after we try m = 2 we compute 1000 -> 500 -> 250 -> 125, and then find the new divisors (larger than 2) in 125. Note how much smaller this made the number.

I will leave implementing these strategies in Haskell as an exercise :-P (one rough sketch follows below, if you want to compare).
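For comparison, here is one possible sketch combining both ideas; numDivisors and divideOut are names invented here, and this is just one interpretation of the strategy, not a canonical implementation. It relies on the fact that if x = p1^e1 * ... * pk^ek, then x has (e1 + 1) * ... * (ek + 1) divisors.

-- Count divisors by dividing out each factor found below sqrt x.
-- Composite candidates never divide the remainder, because all of their
-- prime factors have already been divided out.
numDivisors :: Int -> Int
numDivisors = go 2
  where
    go :: Int -> Int -> Int
    go p x
      | p * p > x    = if x > 1 then 2 else 1   -- one leftover prime factor, or none
      | mod x p == 0 = let (e, x') = divideOut p x
                       in (e + 1) * go (p + 1) x'
      | otherwise    = go (p + 1) x

    -- Divide out p as often as possible; return its exponent and the rest.
    divideOut :: Int -> Int -> (Int, Int)
    divideOut p x
      | mod x p == 0 = let (e, x') = divideOut p (div x p) in (e + 1, x')
      | otherwise    = (0, x)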

Upvotes: 2

mschmidt

Reputation: 2790

Building the list with the list comprehension already takes O(n). Therefore there is not much overhead in using length, which has complexity O(n) in the worst case.
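If you nevertheless want to count matches without materializing a named list, a strict left fold expresses the same computation (a sketch; with GHC's list fusion the original comprehension may well compile down to a similar loop anyway):

import Data.List (foldl')

-- Count divisors with an explicit strict accumulator instead of length.
factors :: Int -> Int
factors x = foldl' step 0 [1 .. x]
  where
    step acc n
      | mod x n == 0 = acc + 1
      | otherwise    = acc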

Upvotes: 1
