find the maximal difference between all succeeding elements

Question

I am trying to solve a training exercise I found at codeingame.com

The question is the following: You have a list of numbers and want to find minimal value for the differences between v_small - v_big , with the condition that v_big > v_small and v_big is before v_small in the list. In addition the maximal time for this question is 1 second, and maximal memory usage is 512MB.

For small lists a naive algorithm sufficient:

---------------------------------- try1.hs -------------------------------------
import Control.Applicative ((<$>))

main :: IO ()
main = do _ <- getLine
          v <- g . f . map read . take 1000 . words <$> getLine --or equivalently
--        v <- g . h . map read . take 1000 . words <$> getLine 
          print v

f :: [Int] -> [Int]
f [] =   []
f xx@(x:xs) = (minimum $ map (\y -> y-x) xx) : (f xs)

g :: [Int] -> Int
g [] = 0
g xs = minimum xs

h :: [Int] -> [Int]
h [] = []
h (x:xs) = (foldr (\y' y -> min (y'-x) y) 0 xs): (h xs)

But I think both functions f and h generate n*(n+1)/2 many elements, where n is the length of the list. Which takes ages for the last list has 99999 many elements.

the next try was finding local maxima and minima and compare only maxima with minima - which should reduce the cost of the algorithm to #maxima*#minima

---------------------------------- try2.hs -------------------------------------
import Control.Applicative ((<$>))
-- import Control.Arrow ((&&&))

data Extremum = Max Int | Min Int deriving (Show)


main :: IO ()
main = do _ <- getLine
          e <- getExtremes
          print e

getExtremes :: IO Int
getExtremes = minimum . concat . myMap f . headextr .
                                         map read . take 1000 .words <$> getLine

myMap :: (a -> [a] -> [b]) -> [a] -> [[b]]
myMap _ [] = []
myMap g xx@(x:xs) = (g x) xx : myMap g xs

f :: Extremum -> [Extremum] -> [Int]
f (Max y) (Max _:xs) = f (Max y) xs
f (Max y) (Min x:xs) = (min 0 (x-y)): f (Max y) xs
f _ _ = []

headextr :: [Int] -> [Extremum]
headextr xx@(x:y:_) | x > y = Max x : extremes xx
                    | x < y = Min x : extremes xx
headextr xx = extremes xx


extremes :: [Int] -> [Extremum]
extremes [] = []
extremes [x] = [Max x, Min x]
extremes [x,y]      | x > y          =       Min y:[]
                    | otherwise      =       Max y:[]
extremes (x:y:z:xs) | x > y && y < z = Min y:extremes (y:z:xs)
                    | x < y && y > z = Max y:extremes (y:z:xs)
                    | otherwise      =       extremes (y:z:xs)

But still the desired time of 1 second is not met. I reduced the input with take 1000 for profiling, but as I am not a professional programmer I was not able to make any use of it, the only information I found out - which quite obvious - in the first versions f/h are the costly functions and in the second version f too is the culprit.

the input file for this exercise can be found at http://www.codingame.com/ide/fileservlet?id=372552140039 - (if this link is not working it can be found at www.codingame.com -> training -> Stock exchange losses -> Test_5_input.txt/Test_5_output.txt)

So how to speed up this algorithm or is there another algorithm that is faster?

Daniel Wagner · Accepted Answer

Your first two solutions are slow because for each element in the list, they do a computation that visits all successive elements. This is O(n^2). I haven't entirely grokked your second solution, but it seems to be something like this: filter out only local extrema (where local extremum means it's bigger than its two neighbors or smaller than its two neighbors), then run an O(n^2) algorithm on the list of extrema. Unfortunately, in the worst case every element can be an extremum, so this is also O(n^2) overall. (In fact, in a random list, we expect most elements to be local extrema, so this is not just being pessimistic about things.)

Let's see if we can invent an O(n) algorithm instead.

We'll start with a slightly rephrased O(n^2) algorithm. The idea of this algorithm is this: First, nondeterministically choose a place in the list to act as v_big. Then, nondeterministically choose a later place in the list to act as v_small. Take the maximal value among all these nondeterministic choices. In code:

f_spec xs = maximum $ do
    later@(v_big:_) <- tails xs
    v_small:_       <- tails later
    return (v_big - v_small)

Now, we need two separate insights to turn this into an O(n) algorithm. The first is that we only need to split once: once we have chosen v_small, we know that the right way to choose v_big is to pick the biggest element before it in the list. We can implement that algorithm this way:

f_slow xs = maximum $ do
    earlier@(v_small:_) <- tails (reverse xs)
    let v_big = maximum earlier
    return (v_big - v_small)

This is "almost" O(n): it only makes one nondeterministic choice; but doing the necessary computation once we've made that choice is still O(n), resulting in a total runtime of O(n^2). The second insight is that we can memoize the computation needed after our nondeterministic choice, so that this computation is O(1). We can build a list of all the maximums very efficiently like this:

maximums xs = scanl1 max xs

Like maximum, this function takes O(n) time; unlike maximum, this one returns the maximums of all prefixes of xs instead of just the maximum of the entire list. So now, when we do our nondeterministic choice, we can choose both v_small and v_big at the same time:

f_fast xs = maximum $ do
    (v_big, v_small) <- zip (maximums xs) xs
    return (v_big - v_small)

From there you only need a bit of prettification to get something that looks very pretty indeed and still runs in O(n) time:

f xs = maximum $ zipWith (-) (maximums xs) xs

find the maximal difference between all succeeding elements

Answers (2)

Related Questions