Haskell - Finding undeclared variables

Question

AST:

data AST = Nr Int | Sum AST AST | Mul AST AST | Min AST | If AST AST AST | 
Let String AST AST | Var String deriving (Eq, Show)

Hi! I need some help finding undeclared variables in the input. My problem is that I can't simply do it in my evaluator like so:

eval :: Env -> AST -> Int
eval env (Nr nr) = nr
eval env (Sum xs xss) = eval env xs + eval env xss
eval env (Mul xs xss) = eval env xs * eval env xss
eval env (Min xs ) = - eval env xs
eval env (If x xs xss) = if (eval env x == 0)
                then eval env xs
                else eval env xss
eval env (Let s xs xss) = eval ((s, (eval env xs)) : env) xss 
eval env (Var x) = case lookup x env of
    Just n -> n
    Nothing -> error ("Variable " ++ x ++ " is undeclared!")

If there are any undeclared variables I need to give an appropriate error containing a list of ALL undeclared variables either while parsing, or post process my AST before evaluating. And I'm not sure where to start. Here's an example of a parsed expression:

parse "let X = + 1 2 in * X + 2 - X"
    Let "X" (Sum (Nr 1) (Nr 2)) (Mul (Var "X") (Sum (Nr 2) (Min (Var "X"))))

rampion · Accepted Answer

Let's start with types:

If there are any undeclared variables I need to give an appropriate error containing a list of ALL undeclared variables either while parsing.

How about a function eval that will either give you list of the undeclared variables, or an Int (if there were no undeclared variables).

type Identifier = String

eval :: Env -> AST -> Either [Identifier] Int

We need to wrap raw numbers in a Right now:

eval env (Nr nr) = Right nr

The same for the a declared variable in the Var case, while an undeclared variable gets wrapped in a list and a Left:

eval env (Var x) = case lookup x env of
    Just n -> Right n
    Nothing -> Left [x]

For the Min case, we can't just negate the recursive call anymore, since negation isn't defined for Either [Identifier] Int.

We could pattern match to see what we got:

eval env (Min xs ) = case eval env xs of
  Left err -> Left err
  Right x  -> Right (-x)

But that's pretty wordy, and is exactly the same as using fmap from Either e's functor instance:

eval env (Min xs ) = fmap negate (eval env xs)

Similarly for Sum, we could pattern match on both arguments:

eval env (Sum xs xss) = case (eval env xs, eval env xss) of
  (Left err, Left err') -> Left (err ++ err')
  (Left err, Right _)   -> Left err
  (Right _, Left err')  -> Left err'
  (Right a, Right b)    -> Right (a + b)

Note how if both subterms contain undeclared variables that we concatenate them to get the list of undeclared variables under the Sum.

This is the same trick we'll need for the rest of the constructors. However, I don't want have to type out a huge case statement like that every time. That's a lot of work for a little addition! And If and Let are going to have eight cases!

So let's make a helper function to do this for us:

apply :: Either [Identifier] (a -> b) -> Either [Identifier] a -> Either [Identifier] b
apply (Left err) (Left err') = Left (err ++ err')
apply (Left err) (Right _)   = Left err
apply (Right _)  (Left err') = Left err'
apply (Right f)  (Right a)   = Right (f a)

Now defining cases for Sum, Mul, and If is much easier:

eval env (Sum xs xss) = fmap (+) (eval env xs) `apply` eval env xss
eval env (Mul xs xss) = fmap (*) (eval env xs) `apply` eval env xss
eval env (If x xs xss) = fmap jnz (eval env x) `apply` eval env xs `apply` eval env xss
  where jnz i a a' = if i == 0 then a else a'

Let is slightly different:

eval env (Let s xs xss) = fmap second v `apply` eval env' xss
  where val = eval env xs
        env' = (s,val) : env
        getRight (Right a) = a
        getRight (Left _) = 0
        second _ a = a

Note how we "cheat" by providing a bogus value to the environment for the second term when the first term contains an undeclared variable. Since we're not going to be use any Int value the second term might yield in this case anyway, this is ok.

Once you get a bit further in Haskell, you might notice that apply looks a lot like <*> from Applicative. The reason we didn't just use that is that Either e's Applicative instance doesn't work how we want it to. Instead of aggregating the errors, it quits when it hits the first one:

>>> Left ["foo"] `apply` Left ["bar", "baz"]
Left ["foo", "bar", "baz"]
>>> Left ["foo"] <*> Left ["bar", "baz"]
Left ["foo"]

However, there's the Validation type from the either package that has an applicative instance that works exactly this way, so if you wanted, you could use that:

>>> Failure ["foo"] <*> Failure ["bar", "baz"]
Failure ["foo", "bar", "baz"]

One approach that might make the Let case less hacky would be to change the return type of eval from Either [Identifier] Int to ([Identifier], [(Identifier, Int)] -> Int) - have it return a list of all the free variables in the expression, along with a way to evaluate the expression given a binding of those variables.

If we give that type a name:

data Result a = Result { freeVariables :: [Identifier], eval :: [(Identifier,Int)] -> a }

We can define Functor and Applicative instances for it:

instance Functor Result where
  fmap f (Result is g) = Result is (f . g)
instance Applicative Result where
  pure a = Result [] (const a)
  Result is ff <*> js fa = Result (is ++ js) (ff <*> js)

And use those to easily define a function to parse out the free variables and an eval expression:

parse :: AST -> Result Int
parse (Nr nr) = pure nr
parse (Sum xs xss) = (+) <$> parse xs <*> parse xss
parse (Mul xs xss) = (*) <$> parse xs <*> parse xss
parse (Min xs ) = negate <$> parse xs
parse (If x xs xss) = jnz <$> parse x <*> parse xs <*> parse xss
  where jnz a b c = if a == 0 then b else c
parse (Let s xs xss) = Result ks h
  where Result is f = parse xs
        Result js g = parse xss
        ks = is ++ delete s js
        h env = g ((s,f env):env)
parse (Var x) = Result [x] $ \env -> case lookup x env of
  Just n -> n
  Nothing -> error ("Variable " ++ x ++ " is undeclared!")

Haskell - Finding undeclared variables

Answers (1)

Related Questions