Carlos López-Camey
Carlos López-Camey

Reputation: 2515

Simplifying use of lambda expressions and memoizing a Map in the State Monad

This function is much faster than its recursive version:

crossSubstrings :: String -> String -> [(String,String)]
crossSubstrings string1 string2 = [(substr1,substr2) | substr1 <- inits string1,
                                                       substr2 <- inits string2]

type Distances = Map.Map (String,String) Int

editDistanceMemoized :: String -> String -> Int
editDistanceMemoized s1 s2 = 
   let 
     substrings = s1 `crossSubstrings` s2
     distances = foldl (editDistance) emptyMap substrings
   in
     distances Map.! (s1,s2)
   where 
     emptyMap = Map.fromList []
     editDistance :: Distances -> (String,String) -> Distances
     editDistance map ([],s1) = map `Map.union` getMap [] s1 (length s1)
     editDistance map (s1,[]) = map `Map.union` getMap s1 [] (length s1)
     editDistance map (s1,s2) = map `Map.union` getMap s1 s2 (cost map s1 s2)
     getMap s1 s2 d = Map.fromList [((s1,s2),d)]
     insertionPCost = \m -> \s1 -> \s2 -> m Map.! (s1, init s2) + 1
     deletionPCost = \m -> \s1 -> \s2 -> m Map.! (init s1, s2)  + 1
     substitutionPCost = \m -> \s1 -> \s2 -> m Map.! (init s1, init s2)
                                             + substitutionCostIfNEQ s1 s2
     substitutionCostIfNEQ = \s1 -> \s2 -> if (last s1 == last s2) then 0 else 2
     cost = \m -> \s1 -> \s2 -> minimum [insertionPCost m s1 s2,
                                         deletionPCost m s1 s2,
                                         substitutionPCost m s1 s2] 

However (first question), I feel like some lambdas could be avoided (doesn't it look repetitive? look specially at cost). Is there a way to compose minimum ?

In addition, the State Monad could be used to propagate the map (instead of using foldl?). Despite despite reading how State.>>= and State.id behave, I'm not 100% sure how the signature should look like (second question).

I thought of this one, where the state is "the next pair of strings to be measured" and Distances contains the memoized distances.

 editDistance :: State Distances (String,String) -> State Distances ()?

Upvotes: 0

Views: 116

Answers (1)

dave4420
dave4420

Reputation: 47052

insertionPCost, deletionPCost, substitutionPCost and substitutionCostIfNEQ are only called from each other and cost, and always with the same arguments (save that substitutionCostIfNEQ doesn't take m); so we can rearrange them like this:

cost = \m -> \s1 -> \s2 -> minimum [insertionPCost, deletionPCost, substitutionPCost] 
  where insertionPCost = m Map.! (s1, init s2) + 1
        deletionPCost = m Map.! (init s1, s2)  + 1
        substitutionPCost = m Map.! (init s1, init s2) + substitutionCostIfNEQ
        substitutionCostIfNEQ = if (last s1 == last s2) then 0 else 2

And the explicit lambdas aren't getting you anything, so rewrite to be clearer:

cost m s1 s2 = minimum [insertionPCost, deletionPCost, substitutionPCost] 
  where insertionPCost = m Map.! (s1, init s2) + 1
        deletionPCost = m Map.! (init s1, s2)  + 1
        substitutionPCost = m Map.! (init s1, init s2) + substitutionCostIfNEQ
        substitutionCostIfNEQ = if (last s1 == last s2) then 0 else 2

To answer your second question, currently you have

editDistance :: Distances -> (String,String) -> Distances

If you were to use State instead, that would be

editDistance :: (String,String) -> State Distances ()

That is, editDistance would be a function that takes (String,String), and yields something that interacts with a Distances state, and no other meaningful result.

But.

Firstly, I don't see that there's anything wrong with your use of foldl.

Secondly, you never really use the accumulated value, what would be the state. You use it to make a new value, but you don't look anything up in it. So you don't need State, you only need Writer.

editDistance :: (String,String) -> Writer Distances ()

That is, editDistance would be a function that takes (String,String), and yields something that adds to a Distances accumulator, and no other meaningful result.

(There's a subtlety here: the first parameter to Writer has to be a Monoid, and it has to use the combining operation (mappend) that's useful to you; well, Maps are Monoids, and their mappend is the same union that you use in your original editDistance, so it all works out fine.)

Upvotes: 1

Related Questions