insitu
insitu

Reputation: 4698

How to capture higher order functions in an embedded DSL in Haskell?

I would like to write an embedded DSL in Haskell from which I could generate code in another language (namely Python but that's irrelevant to the question). There are a lot of different approaches possible to do that, like using a Free monad based interpreters or tagless interpreters which I am aware of. But no approach that I am aware of seems to be able to capture function definitions, which kind of make sense but is also quite limiting.

My question is: How can I embed an actual Haskell function into a DSL? The idea would be to capture an Haskell function definition into something like Lam var body constructor, which would look like:

data Var = Var ...  -- use DeBruijn numbering?

data Expr repr = Lam Var repr

Ideally I would like to be able to write something like:

foo :: Expr repr
foo = \ n -> n + n * 12

and then be able to interpret this Expr in various ways, including generating foreign code.

I experimented with https://hackage.haskell.org/package/data-reify which provides some techniques to capture functions but did not get very far. Is what I am looking for ever possible?

Upvotes: 5

Views: 274

Answers (1)

user23202524
user23202524

Reputation: 1

I am still new to Haskell myself, however, I was recently working on a DSL myself, so this was my approach:

Let's say you have a simple AST including a small lambda calculus:

data Exp :: Type -> Type where
  Var ::
    String ->       -- Name
    Exp a
  Apply ::
    Exp (a -> b) -> -- Function
    Exp a ->        -- Parameter
    Exp b
  Lambda ::
    String ->       -- Bound variable name
    Exp b ->        -- "Right hand side" of lambda
    Exp (a -> b)
  ...

deriving instance Show (Exp t)

Note that the structure above is not actually type-safe: It is not ensured that bound variables have the same type as their individual uses:

unsafe :: Exp (Int -> String)
unsafe = Lambda "x" (Var "x" :: Exp String)

will compile without issues. (Note that the inner variable has a different type than the parameter of the lambda expression.) In fact, above AST does not even guarantee that referenced variables are bound by an outside expression.

Since we prefer using Haskell function types anyway, it suffices to simply not export our lambda calculus to other packages.

Let us now "convert" a unary function into an expression of our AST. We want to replace all occurences of a given variable by Var placeholders, and then bind that variable by a lambda:

convert :: String -> (Exp a -> Exp b) -> Exp (a -> b)
convert varName f = Lambda varName . f $ Var varName

ghci> convert "x" $ \x -> x
Lambda "x" (Var "x")

Currently, we do not have a way of determining a distinct variable name whenever we want to bind a variable. We will however need this in order to avoid variable name collisions reliably.

A simple way of achieving this is by tracking the number of variables bound, globally, in your compiler, and generating a new variable name from that count. Depending on how the lambda expressions are used/compiled later on, this might not be necessary though, due to scopes.

Using type classes, we can generalize this to a convert-function for n-ary functions:

class Convertible t d where
  convert :: Int -> t -> Exp d

instance Convertible (Exp t) t where
  convert :: Int -> Exp t -> Exp t
  convert _ = id

instance (Convertible b c) => Convertible (Exp a -> b) (a -> c) where
  convert :: Int -> (Exp a -> b) -> Exp (a -> c)
  convert varCount f = Lambda var . convert (varCount + 1) . f $ Var var
    where
      var = "v" ++ show varCount

This is already nice, but leads to some ambiguous types, because we declare c as a free type variable, so there could, in theory, be multiple instances for converting function types (essentially, c is not uniquely determined by Exp a -> b, here).

As a solution to this, you can instead determine c using a closed type family:

type family Dropped a :: Type where
  Dropped (Exp a -> b) = a -> Dropped b
  Dropped (Exp t) = t

So, for example:

Dropped (Exp Int) ~ Int,
Dropped (Exp Int -> Exp String) ~ Int -> String
and
Dropped (Exp a1 -> ... -> Exp an) ~ a1 -> ... -> an

Now we can rewrite our class from above:

class Convertible t where
  convert :: Int -> t -> Exp (Dropped t)

instance Convertible (Exp t) where
  convert :: Int -> Exp t -> Exp t
  convert _ = id

instance (Convertible b) => Convertible (Exp a -> b) where
  convert :: Int -> (Exp a -> b) -> Exp (a -> Dropped b)
  convert varCount f = Lambda var . convert (varCount + 1) . f $ Var var
    where
      var = "v" ++ show varCount

Now we can conveniently convert functions of our AST:

ghci> :{
ghci| fstE :: Exp a -> Exp b -> Exp a
ghci| fstE x _ = x
ghci| :}
ghci> convert 0 fstE
Lambda "v0" (Lambda "v1" (Var "v0"))

Note 1

Above implementation does not cover all functions with expressions of our AST but rather functions of the form

Exp t1 -> ... -> Exp tn

More types of functions can be added, however.

Note 2

Depending on the implementation/design of your DSL, it might be preferrable to convert directly from functions to the AST of your generated language, skipping the lambda calculus inbetween. This can even help uncurry Haskell's functions, which are always curried.

Note 3

You will have to enable a number of extensions to run above code. These should suffice:

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeFamilies #-}

I hope that this actually helps you with the problem you described.

Upvotes: 0

Related Questions