thiagoh

Reputation: 7388

LLVM: How to keep track of data types of the Value* at runtime for untyped language?

I'm implementing an untyped programming language using LLVM to generate the backend code. In order to keep track of the current type of a particular variable I'm using a struct StructTy_struct_datatype_t which is defined as:

PointerTy_8 = PointerType::get(IntegerType::get(TheContext, 8), 0);

StructTy_struct_datatype_t = StructType::create(TheContext, "struct.datatype_t");
std::vector<Type *> StructTy_struct_datatype_t_fields;
StructTy_struct_datatype_t_fields.push_back(IntegerType::get(TheContext, 32));
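StructTy_struct_datatype_t_fields.push_back(PointerTy_8);
// attach the fields to the named struct; without this it stays opaque
StructTy_struct_datatype_t->setBody(StructTy_struct_datatype_t_fields, /*isPacked=*/false);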

// which represents the struct
typedef struct datatype_t {
  int type; // holds an integer that tells me the type (1 = int, 2 = float, ...)
  void* v; // holds a pointer to the actual value
} datatype_t;

Then, suppose I have a function like this

def function_add(a, b) {
   return a + b;
}

I want this function to be able to accept arguments of any type (ints, floats, ...).

The code that processes the binary operation (i.e. a + b) follows:

Value* L = lhs_codegen_elements.back();
Value* R = rhs_codegen_elements.back();

if (!L || !R) {
  logError("L or R are undefined");
  return codegen;
}

AllocaInst* lptr_datatype = (AllocaInst*)((LoadInst*)L)->getPointerOperand();
AllocaInst* rptr_datatype = (AllocaInst*)((LoadInst*)R)->getPointerOperand();

ConstantInt* const_int32_0 = ConstantInt::get(TheContext, APInt(32, StringRef("0"), 10));
ConstantInt* const_int32_1 = ConstantInt::get(TheContext, APInt(32, StringRef("1"), 10));

GetElementPtrInst* lptr_type =
    GetElementPtrInst::Create(StructTy_struct_datatype_t, lptr_datatype, {const_int32_0, const_int32_0}, "type");
GetElementPtrInst* rptr_type =
    GetElementPtrInst::Create(StructTy_struct_datatype_t, rptr_datatype, {const_int32_0, const_int32_0}, "type");

GetElementPtrInst* lptr_v =
    GetElementPtrInst::Create(StructTy_struct_datatype_t, lptr_datatype, {const_int32_0, const_int32_1}, "v");
GetElementPtrInst* rptr_v =
    GetElementPtrInst::Create(StructTy_struct_datatype_t, rptr_datatype, {const_int32_0, const_int32_1}, "v");

LoadInst* lload_inst_type = load_inst_codegen(TYPE_INT, lptr_type);
LoadInst* rload_inst_type = load_inst_codegen(TYPE_INT, rptr_type);

LoadInst* lload_inst_v = load_inst_codegen(TYPE_VOID_POINTER, lptr_v);
LoadInst* rload_inst_v = load_inst_codegen(TYPE_VOID_POINTER, rptr_v);

CmpInst* cond1 =
    new ICmpInst(ICmpInst::ICMP_EQ, lload_inst_type, ConstantInt::get(TheContext, APInt(32, TYPE_DOUBLE)));

// bb is a BasicBlock, so dyn_cast<Function>(bb) would yield nullptr; ask for its parent instead
Function* function_bb = bb->getParent();

BasicBlock* label_if_then_double = BasicBlock::Create(TheContext, "if.then.double", function_bb);
BasicBlock* label_if_then_long = BasicBlock::Create(TheContext, "if.then.long", function_bb);

// label_if_else is not declared in this snippet; assuming the else branch is the long path
BranchInst* branch_inst = BranchInst::Create(label_if_then_double, label_if_then_long, cond1, bb);

L->dump(); // %load_inst = load %struct.datatype_t, %struct.datatype_t* %alloca_datatype_v, align 8
R->dump(); // %load_inst = load %struct.datatype_t, %struct.datatype_t* %alloca_datatype_v1, align 8

L->getType()->dump(); // %struct.datatype_t = type { i32, i8* }
R->getType()->dump(); // %struct.datatype_t = type { i32, i8* }

lload_inst_type->dump(); //   %load_inst = load i32, i32* %type, align 4
rload_inst_type->dump(); //   %load_inst = load i32, i32* %type, align 4

lload_inst_v->dump(); //   %load_inst = load i8*, i8** %v, align 8
rload_inst_v->dump(); //   %load_inst = load i8*, i8** %v, align 8

if (op == '+') {
  // issue: how to take the decision without knowing the type lload_inst_v holds
  BinaryOperator::Create(Instruction::FAdd, lload_inst_v, rload_inst_v, "add", label_if_then_double);
  // or
  BinaryOperator::Create(Instruction::Add, lload_inst_v, rload_inst_v, "add", label_if_then_long);
}

So the problem is that I need to know which type lload_inst_type and rload_inst_type hold, so that I can choose between the LLVM API calls BinaryOperator::Create(Instruction::FAdd, ...) for floats and BinaryOperator::Create(Instruction::Add, ...) for ints, for instance.

However, I just realized that I can't find out the value held by an AllocaInst or LoadInst while generating the code, because that value only exists when the generated program runs (at least I'm not aware of a way to do that).

Upvotes: 0

Views: 478

Answers (1)

Colin LeMahieu

Reputation: 618

If your source language is untyped, this has to be hidden from LLVM, since its IR is typed. You'll have to design a way to track the type at runtime, for example some sort of enumerated, tagged object system. Your generated code will have to check the tags of the values passed in at runtime and pick the appropriate operation or function to call.

LLVM doesn't provide any of this functionality; it has to be the responsibility of your language's runtime and its type system.
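
As a rough illustration of such a tagged object system, here is a minimal sketch in plain C++ (no LLVM involved) of a runtime helper that the generated code could call for a + b instead of trying to pick Add or FAdd at compile time. The names TypeTag, payload_t and runtime_add are hypothetical, and the choice of int64_t and double as payload types is an assumption; only the datatype_t layout and the tag values 1 and 2 come from the question.

```cpp
#include <cassert>
#include <cstdint>

// Tag values follow the comment in the question (1 = int, 2 = float).
enum TypeTag : int32_t { TAG_INT = 1, TAG_DOUBLE = 2 };

// Same layout as the question's datatype_t: a tag plus a pointer to the payload.
struct datatype_t {
  int32_t type;
  void* v;
};

// Hypothetical storage for a result payload (assumes 64-bit ints and doubles).
union payload_t {
  int64_t i;
  double d;
};

// Reads a boxed value as a double, converting from int when needed.
static double as_double(const datatype_t& x) {
  if (x.type == TAG_DOUBLE) return *static_cast<double*>(x.v);
  return static_cast<double>(*static_cast<int64_t*>(x.v));
}

// Hypothetical runtime helper for `a + b`. The tag check happens here,
// when the program runs, not while the compiler is emitting IR.
// The result payload is written into *out and a box pointing at it is returned.
datatype_t runtime_add(const datatype_t& a, const datatype_t& b, payload_t* out) {
  assert(out != nullptr);
  if (a.type == TAG_INT && b.type == TAG_INT) {
    out->i = *static_cast<int64_t*>(a.v) + *static_cast<int64_t*>(b.v);
    return {TAG_INT, &out->i};
  }
  // Any other combination is promoted to double in this sketch.
  out->d = as_double(a) + as_double(b);
  return {TAG_DOUBLE, &out->d};
}
```

With something like this in the runtime library, the code generator would only need to emit a call to the helper for every + it sees. Alternatively, it could emit the same tag comparison and both branches directly as IR, which avoids the call overhead but is the same idea: the decision is made by the generated code, not by the compiler.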

Upvotes: 1
