Operators in an interpreter

Question

I'm making an interpreter just for fun. First I'm trying to evaluate expressions. The evaluation returns a Value object, and every type has its own Value structure. So for example:

struct Value  // This is the abstract base class for every value type
{
     int type;
};

struct IntegerValue : public Value
{
     int value;

     // type is a base-class member, so it is assigned in the body rather than the initializer list
     IntegerValue(int value) : value(value) { type = VALUE_INTEGER; }
};

I don't know if this is a nice design (probably not), but it works so far. But as I define new types and operators, the evaluation methods get huge. For example, for operator '==' the left side and right side can be string, integer, float and so on... So I guess I need to define operators for the Value structures rather than check types in the eval methods (and maybe even allow user-defined operators like in C++), but I just can't think of a fast, elegant and easily extendable design. Any ideas?

luser droog · Accepted Answer

In my PostScript interpreter (written in C), I defined my typed objects as a union so I could carefully arrange the members to overlay the same memory.

union {
    word tag;
    struct { word tag; word pad0; int val; } _int;
    struct { word tag; word pad0; float val; } _real;
    //...
} object;

You may not need to be so memory-conscious in your project, but I've gotten a lot of mileage from this structure.
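As a rough sketch of how such an overlay can be used: the version below assumes `word` is a 16-bit unsigned integer and introduces hypothetical names that are not in the original code (`INTTYPE`, `REALTYPE`, `make_int`, `make_real`, `type_of`, `as_double`). Because every variant begins with a `tag` field, the type can be read through the bare `tag` member regardless of which variant was stored.

```c
#include <stdint.h>

typedef uint16_t word;            /* assumption: a 16-bit machine word */

enum { INTTYPE = 1, REALTYPE };   /* hypothetical tag values */

typedef union {
    word tag;                                        /* overlays the tag of every variant */
    struct { word tag; word pad0; int   val; } _int;
    struct { word tag; word pad0; float val; } _real;
} object;

/* Hypothetical constructors: set the tag and the payload together. */
static object make_int(int v)
{
    object o;
    o._int.tag  = INTTYPE;
    o._int.pad0 = 0;
    o._int.val  = v;
    return o;
}

static object make_real(float v)
{
    object o;
    o._real.tag  = REALTYPE;
    o._real.pad0 = 0;
    o._real.val  = v;
    return o;
}

/* The type can be inspected without knowing which variant is stored. */
static int type_of(object o)
{
    return o.tag;
}

/* Convert either variant to double by dispatching on the shared tag. */
static double as_double(object o)
{
    return type_of(o) == INTTYPE ? (double)o._int.val : (double)o._real.val;
}
```

Every operator then starts by looking at `type_of` for its operands, which is what makes the switch-based dispatch described next possible.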

As for handling the explosion of type combinations that can occur: I've recently implemented several numeric types in my APL interpreter, and with the help of macros (which hide a lot more code), the 3 possible types become 9 separate cases:

/* apply binary math op to nums, yielding num
TODO: additional numeric types.
    configurable overflow promotion handling.
 */
#define BIN_MATH_FUNC(func,z,x,y,overflow,domainI,domainD) \
     switch(NUMERIC_TYPES(x,y)){ \
     case TYPEPAIR(IMM,IMM): DOM(domainI,z,numimm(x),numimm(y)) \
                             if (overflow(numimm(x),numimm(y))) \
                                 z=flo((D)numimm(x) func (D)numimm(y)); \
                             else z=num(numimm(x) func numimm(y)); break; \
     case TYPEPAIR(IMM,FIX): DOM(domainI,z,numimm(x),numint(y)) \
                             if (overflow(numimm(x),numint(y))) \
                                 z=flo((D)numimm(x) func (D)numint(y)); \
                             else z=num(numimm(x) func numint(y)); break; \
     case TYPEPAIR(IMM,FLO): DOM(domainD,z,numimm(x),numdbl(y)) \
                             z=flo(numimm(x) func numdbl(y)); break; \
     case TYPEPAIR(FIX,IMM): DOM(domainI,z,numint(x),numimm(y)) \
                             if (overflow(numint(x),numimm(y))) \
                                 z=flo((D)numint(x) func (D)numimm(y)); \
                             else z=num(numint(x) func numimm(y)); break; \
     case TYPEPAIR(FIX,FIX): DOM(domainI,z,numint(x),numint(y)) \
                             if (overflow(numint(x),numint(y))) \
                                 z=flo((D)numint(x) func (D)numint(y)); \
                             else z=num(numint(x) func numint(y)); break; \
     case TYPEPAIR(FIX,FLO): DOM(domainD,z,numint(x),numdbl(y)) \
                             z=flo(numint(x) func numdbl(y)); break; \
     case TYPEPAIR(FLO,IMM): DOM(domainD,z,numdbl(x),numimm(y)) \
                             z=flo(numdbl(x) func numimm(y)); break; \
     case TYPEPAIR(FLO,FIX): DOM(domainD,z,numdbl(x),numint(y)) \
                             z=flo(numdbl(x) func numint(y)); break; \
     case TYPEPAIR(FLO,FLO): DOM(domainD,z,numdbl(x),numdbl(y)) \
                             z=flo(numdbl(x) func numdbl(y)); break; \
     }

The func argument to the macro is a C math operator like + or * or %. So I just need to present floating-point numbers where I want floating-point math, or integers where I want integer math. The domainI and domainD arguments (the domain? functions) are only needed for detecting division by zero.
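To illustrate passing an operator token as a macro argument, here is a much-reduced sketch with only two representations. The `number` struct, the `BIN_OP` macro, and the helpers `num_int`, `num_flo`, `to_double`, `add`, `mul` and `divide` are all hypothetical stand-ins, not the interpreter's real definitions, and the overflow handling of the original macro is left out.

```c
#include <stdbool.h>

/* Hypothetical two-representation number: an integer or a double. */
typedef struct { bool is_flo; long i; double d; } number;

static number num_int(long v)     { number n = { false, v, 0.0 }; return n; }
static number num_flo(double v)   { number n = { true, 0, v };    return n; }
static double to_double(number n) { return n.is_flo ? n.d : (double)n.i; }

/* func is pasted in as an infix operator token such as + or *.
   Floating operands select floating-point math, integers select integer math. */
#define BIN_OP(func, z, x, y)                                 \
    do {                                                      \
        if ((x).is_flo || (y).is_flo)                         \
            (z) = num_flo(to_double(x) func to_double(y));    \
        else                                                  \
            (z) = num_int((x).i func (y).i);                  \
    } while (0)

static number add(number x, number y) { number z; BIN_OP(+, z, x, y); return z; }
static number mul(number x, number y) { number z; BIN_OP(*, z, x, y); return z; }

/* Division gets a domain check first: integer division by zero is undefined.
   Returns false on a domain error and leaves *z untouched. */
static bool divide(number x, number y, number *z)
{
    if (to_double(y) == 0.0)
        return false;
    BIN_OP(/, *z, x, y);
    return true;
}
```

One macro body serves every operator, so adding a new operator costs one line, while adding a new numeric type means touching only the macro.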

The TYPEPAIR helper macro is very useful, though it is perhaps not obvious how it works. The arguments are enum values, C's version of atomic symbols, which are represented as small integers. Here I just needed to distinguish 3 numeric types, so I made an enum for them.

enum { IMM = 1, FIX, FLO, NTYPES };

Normally enum values are allocated starting with 0, but for the math trick, I want these to start with 1. Then I can treat these values like a number system with NTYPES as the radix or base. With the symbols defined like this, I can calculate a 2-digit type as a single numeric value. This is a constant value, so if it's calculated by a macro, it can be used as a switch case.

#define TYPEPAIR(a,b)  ((a)*NTYPES+(b))
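A small sketch of the macro inside a switch follows. The enum and macro are the ones above. The `result_type` function and its promotion rule are hypothetical, chosen only to show that each pair maps to a distinct constant usable as a case label.

```c
enum { IMM = 1, FIX, FLO, NTYPES };

#define TYPEPAIR(a,b)  ((a)*NTYPES+(b))

/* Hypothetical helper: pick a result type for a binary operation on
   operand types tx and ty. The promotion rule is an assumption. */
static int result_type(int tx, int ty)
{
    switch (TYPEPAIR(tx, ty)) {      /* computed at run time */
    case TYPEPAIR(IMM,IMM):          /* constant expressions at compile time */
        return IMM;
    case TYPEPAIR(IMM,FIX):
    case TYPEPAIR(FIX,IMM):
    case TYPEPAIR(FIX,FIX):
        return FIX;
    case TYPEPAIR(IMM,FLO):
    case TYPEPAIR(FIX,FLO):
    case TYPEPAIR(FLO,IMM):
    case TYPEPAIR(FLO,FIX):
    case TYPEPAIR(FLO,FLO):
        return FLO;
    default:
        return 0;                    /* not a pair of numeric types */
    }
}
```

With NTYPES equal to 4, TYPEPAIR(IMM,FIX) is 1*4+2 = 6 and TYPEPAIR(FIX,IMM) is 2*4+1 = 9, so the order of the operands is preserved in the encoding.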

It can also be composed to produce a number representing a larger type pattern.

TYPEPAIR(TYPEPAIR(IMM,FIX),FLO)

which expands to

((((IMM)*NTYPES+(FIX)))*NTYPES+(FLO))

This kind of thing lets me match larger patterns with less code, instead of writing something like

if (TYPE(x)==IMM && TYPE(y)==FIX) //...
//...

which isn't expressible as a switch.
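For comparison, a three-operand pattern written as a switch might look like the sketch below. `TYPETRIPLE`, `classify3` and the `PAT_*` codes are hypothetical names introduced for illustration. Only the enum and `TYPEPAIR` come from the answer.

```c
enum { IMM = 1, FIX, FLO, NTYPES };

#define TYPEPAIR(a,b)  ((a)*NTYPES+(b))

/* Hypothetical convenience wrapper around the nested form shown above. */
#define TYPETRIPLE(a,b,c)  TYPEPAIR(TYPEPAIR(a,b),c)

/* Hypothetical result codes for the patterns of interest. */
enum { PAT_OTHER, PAT_ALL_IMM, PAT_IMM_FIX_FLO, PAT_ALL_FLO };

/* One switch replaces a chain of three-way && comparisons. */
static int classify3(int tx, int ty, int tz)
{
    switch (TYPETRIPLE(tx, ty, tz)) {
    case TYPETRIPLE(IMM,IMM,IMM): return PAT_ALL_IMM;
    case TYPETRIPLE(IMM,FIX,FLO): return PAT_IMM_FIX_FLO;
    case TYPETRIPLE(FLO,FLO,FLO): return PAT_ALL_FLO;
    default:                      return PAT_OTHER;
    }
}
```

With NTYPES equal to 4, TYPETRIPLE(IMM,FIX,FLO) works out to (1*4+2)*4+3 = 27, a three-digit number in base 4.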
