matt
matt

Reputation: 2039

How does this monadic function in Haskell work?

I am struggling to wrap my head around this function**

  -- split at whitespace
  -- f "hello world" -> ["hello","world"]

  f =  takeWhile (not . null) . evalState (repeatM $ modify (drop 1) 
    >> State (break (== ' '))) . (' ' :)
    where repeatM = sequence . repeat

I am already struggling with understanding the state monad and the lack of Type signatures and the point free style make it even more confusing.

What is going on here?

** taken from here

Upvotes: 4

Views: 187

Answers (1)

Li-yao Xia
Li-yao Xia

Reputation: 33389

break (== ' ') :: String -> (String, String) produces the longest prefix of the input string without whitespaces on the left, and the remaining suffix on the right. Clearly we want to somehow iterate this process to split a string into its words.

State (break (== ' ')) :: State String String views that function as a stateful computation. Remember State :: (s -> (a, s)) -> State s a, so the output state is the right component. So the initial state is the input string, and break (== ' ') viewed imperatively as a State action, returns the first word and updates the state to the remaining suffix.

Now compose that with modify (drop 1) to drop the next space, and repeat. Note that it appends to the input (' ' :) at the beginning, and starts with modify

(sequence . repeat) (modify (drop 1) >> State (break (== ' ')))
  :: State String [String]
  = do
                                -- current_state = " hello wold"
  modify (drop 1)
                                -- current_state = "hello world"
  s1 <- State (break (== ' '))  -- s1 = "hello"
                                -- current_state = " world"
  modify (drop 1)
                                -- current_state = "world"
  s2 <- State (break (== ' '))  -- s2 = "world"
                                -- current_state = ""
  modify (drop 1)
                                -- current_state = ""
  s3 <- State (break (== ' '))  -- s3 = ""
                                -- current_state = ""
  ...

repeat constructs an infinite list of the same two-step action modify (drop 1) >> State (break (== ' ')) sequence takes a list of actions and runs them, collecting the result in a list. Here I am handwaving, but because this is the lazy state monad (there is a strict one, and this is not it), things work out so that sequence can run an infinite list of action and produce the resulting infinite list of results on demand.

So you get an infinite computation as shown above, producing the infinite list of results of all State (break (== ' ')) steps.

["hello", "world", "", "", ...]

takeWhile (not . null) is hopefully self descriptive: it takes elements from the list while they're not empty.

["hello", "world"]

Upvotes: 8

Related Questions