Reputation: 95
This is an example from Richard Bird's Thinking Functionally with Haskell (p. 158). Could someone explain me the reasoning behind 1,2,3 and 4?
EDIT. I understand the first 3 equations formed from the first definition. For 1., why is a summation? How is it tie to T(++)(n, m) = Θ(n)? For the second definition, I understand the first 2 statements. The third one (2.), why k+n? For 3. and 4. I am completely lost
Consider first the following two definitions of concat:
concat xss= foldr (++) [] xss
concat' xss = foldl (++) [] xss
The two definitions are equivalent provided xss is a finite list. Suppose xss is a list of length m of lists all of length n. Then the first definition gives
T(concat)(m, n) = T(foldr (++) [])(m, n),
T(foldr (++) [])(0, n) = Θ(1),
T(foldr (++) [])(m+1, n) = T(++)(n, mn) + T(foldr (++) [])(m, n).
The estimate T(++)(n, mn) arises because a list of length n is concatenated with a list of length mn. Since T(++)(n, m) = Θ(n), we obtain
1. T(foldr (++) [])(m,n) = Σ_{k=0}^{m} Θ(n) = Θ(mn)
For the second definition of concat we have
T(concat')(m, n) = T(foldl (++))(0, m, n),
T(foldl (++))(k, 0, n) = O(1),
2. T(foldl (++))(k, m+1, n) = T(++)(k, n) + T(foldl (++))(k+n, m, n).
The additional argument k refers to the length of the accumulated list in the second argument of foldl. This time we obtain
3. T(foldl (++)) (k,m,n) = Σ_{j=0}{m-1} Θ(k+jn) = Θ(k+m^2n)
Hence
4. T(concat')(m, n) = Θ(m_2 n)
Upvotes: 1
Views: 164
Reputation: 116139
For 1., why is a summation? How is it tie to
T(++)(n, m) = Θ(n)
?
The summation arises from the inductive definition. Consider this independent definition
g(0,n) = something
g(m+1,n) = f(m,n) + g(m,n)
By induction, we get
g(m,m)
= f(m-1,n) + g(m-1,n)
= f(m-1,n) + f(m-2,n) + g(m-2,n)
= f(m-1,n) + f(m-2,n) + f(m-3,n) + g(m-3,n)
= f(m-1,n) + f(m-2,n) + f(m-3,n) + ... + f(m-m,n) + g(m-m,n)
= f(m-1,n) + f(m-2,n) + f(m-3,n) + ... + f(m-m,n) + something
So, the result is a summation of f(n,x)
where x
varies, plus a final term something
.
In the original definition, g(m,n)
is T(foldr (++) [])(m, n)
, while
f(m,n) = T(++)(n, mn)
. The final term does not matter, asymptotically.
For the second definition, I understand the first 2 statements. The third one (2.), why k+n?
When computing foldl (++) x ys
, x
is assumed to be a list having length k
, and ys
to be a list-of-lists having length m+1
, whose elements have length n
.
Now, the recursive foldl
equation is:
foldl (++) x (y:ys) = foldl (++) (x++y) ys
The cost of the recursive call comprises (A) the cost of temp = x++y
, which is T(++)(k, n)
, and (B) the cost of foldl (++) temp ys
. For (B), the length of temp
is now length x + length y = k+n
, while ys
has length m
(n
is unaffected).
So, we get
T(foldl (++))(k, m+1, n)
= A + B
= T(++)(k, n) + T(foldl (++))(k+n, m, n).
For 3. and 4. I am completely lost
In 3. we sum, as we did for 1., the term T(++)(k, n)
with k
varying. Note that k
increases by n
every time we recurse, so we get
T(++)(k, n) + T(++)(k+n, n) + T(++)(k+n+n, n) + ...
= Σ_{j=0}{m-1} T(++)(k+jn, n)
= Σ_{j=0}{m-1} Θ(k+jn)
= Θ(k+m^2 n)
In 4. we choose k=0
since in the definition of concat'
we initially pick x=[]
whose length is 0
. So, only Θ(m^2 n)
survives.
Upvotes: 1