Reputation: 31
AST:
data AST = Nr Int | Sum AST AST | Mul AST AST | Min AST | If AST AST AST |
Let String AST AST | Var String deriving (Eq, Show)
Hi! I need some help finding undeclared variables in the input. My problem is that I can't simply do it in my evaluator like so:
eval :: Env -> AST -> Int
eval env (Nr nr) = nr
eval env (Sum xs xss) = eval env xs + eval env xss
eval env (Mul xs xss) = eval env xs * eval env xss
eval env (Min xs ) = - eval env xs
eval env (If x xs xss) = if (eval env x == 0)
then eval env xs
else eval env xss
eval env (Let s xs xss) = eval ((s, (eval env xs)) : env) xss
eval env (Var x) = case lookup x env of
Just n -> n
Nothing -> error ("Variable " ++ x ++ " is undeclared!")
If there are any undeclared variables I need to give an appropriate error containing a list of ALL undeclared variables either while parsing, or post process my AST before evaluating. And I'm not sure where to start. Here's an example of a parsed expression:
parse "let X = + 1 2 in * X + 2 - X"
Let "X" (Sum (Nr 1) (Nr 2)) (Mul (Var "X") (Sum (Nr 2) (Min (Var "X"))))
Upvotes: 2
Views: 263
Reputation: 89053
Let's start with types:
If there are any undeclared variables I need to give an appropriate error containing a list of ALL undeclared variables either while parsing.
How about a function eval
that will either give you list of the undeclared variables, or an Int
(if there were no undeclared variables).
type Identifier = String
eval :: Env -> AST -> Either [Identifier] Int
We need to wrap raw numbers in a Right
now:
eval env (Nr nr) = Right nr
The same for the a declared variable in the Var
case,
while an undeclared variable gets wrapped in a list and a Left
:
eval env (Var x) = case lookup x env of
Just n -> Right n
Nothing -> Left [x]
For the Min
case, we can't just negate the recursive call anymore, since
negation isn't defined for Either [Identifier] Int
.
We could pattern match to see what we got:
eval env (Min xs ) = case eval env xs of
Left err -> Left err
Right x -> Right (-x)
But that's pretty wordy, and is exactly the same as using fmap
from Either e
's functor instance:
eval env (Min xs ) = fmap negate (eval env xs)
Similarly for Sum
, we could pattern match on both arguments:
eval env (Sum xs xss) = case (eval env xs, eval env xss) of
(Left err, Left err') -> Left (err ++ err')
(Left err, Right _) -> Left err
(Right _, Left err') -> Left err'
(Right a, Right b) -> Right (a + b)
Note how if both subterms contain undeclared variables that we concatenate them to get the list of undeclared variables under the Sum
.
This is the same trick we'll need for the rest of the constructors. However, I don't want have to type out a huge case
statement like that every time. That's a lot of work for a little addition! And If
and Let
are going to have eight cases!
So let's make a helper function to do this for us:
apply :: Either [Identifier] (a -> b) -> Either [Identifier] a -> Either [Identifier] b
apply (Left err) (Left err') = Left (err ++ err')
apply (Left err) (Right _) = Left err
apply (Right _) (Left err') = Left err'
apply (Right f) (Right a) = Right (f a)
Now defining cases for Sum
, Mul
, and If
is much easier:
eval env (Sum xs xss) = fmap (+) (eval env xs) `apply` eval env xss
eval env (Mul xs xss) = fmap (*) (eval env xs) `apply` eval env xss
eval env (If x xs xss) = fmap jnz (eval env x) `apply` eval env xs `apply` eval env xss
where jnz i a a' = if i == 0 then a else a'
Let
is slightly different:
eval env (Let s xs xss) = fmap second v `apply` eval env' xss
where val = eval env xs
env' = (s,val) : env
getRight (Right a) = a
getRight (Left _) = 0
second _ a = a
Note how we "cheat" by providing a bogus value to the environment for the second term when the first term contains an undeclared variable. Since we're not going to be use any Int
value the second term might yield in this case anyway, this is ok.
Once you get a bit further in Haskell, you might notice that apply
looks a lot like <*>
from Applicative
. The reason we didn't just use that is that Either e
's Applicative
instance doesn't work how we want it to. Instead of aggregating the errors, it quits when it hits the first one:
>>> Left ["foo"] `apply` Left ["bar", "baz"]
Left ["foo", "bar", "baz"]
>>> Left ["foo"] <*> Left ["bar", "baz"]
Left ["foo"]
However, there's the Validation
type from the either
package that has an applicative instance that works exactly this way, so if you wanted, you could use that:
>>> Failure ["foo"] <*> Failure ["bar", "baz"]
Failure ["foo", "bar", "baz"]
One approach that might make the Let
case less hacky would be to change the return type of eval
from Either [Identifier] Int
to ([Identifier], [(Identifier, Int)] -> Int)
- have it return a list of all the free variables in the expression, along with a way to evaluate the expression given a binding of those variables.
If we give that type a name:
data Result a = Result { freeVariables :: [Identifier], eval :: [(Identifier,Int)] -> a }
We can define Functor
and Applicative
instances for it:
instance Functor Result where
fmap f (Result is g) = Result is (f . g)
instance Applicative Result where
pure a = Result [] (const a)
Result is ff <*> js fa = Result (is ++ js) (ff <*> js)
And use those to easily define a function to parse out the free variables and an eval expression:
parse :: AST -> Result Int
parse (Nr nr) = pure nr
parse (Sum xs xss) = (+) <$> parse xs <*> parse xss
parse (Mul xs xss) = (*) <$> parse xs <*> parse xss
parse (Min xs ) = negate <$> parse xs
parse (If x xs xss) = jnz <$> parse x <*> parse xs <*> parse xss
where jnz a b c = if a == 0 then b else c
parse (Let s xs xss) = Result ks h
where Result is f = parse xs
Result js g = parse xss
ks = is ++ delete s js
h env = g ((s,f env):env)
parse (Var x) = Result [x] $ \env -> case lookup x env of
Just n -> n
Nothing -> error ("Variable " ++ x ++ " is undeclared!")
Upvotes: 6