Nicholas Obert

Reputation: 1638

How to store a dynamically typed variable in a Rust struct?

I should say up front that I'm a beginner in Rust and this is my first project. I'm trying to write a SQL compiler in Rust, and I need a Token struct that represents a syntactic unit of a SQL query. In C/C++ I would write a structure like this:

typedef unsigned long Value;

// Represents the type of the token (e.g. number, string, identifier...)
enum class TokenType { ... };

struct Token {
  TokenType type;
  unsigned char priority;
  Value value;
};

With this approach, the Value type can store any kind of value. If the token is a number, value is interpreted as an integer or a float; if it represents a string or an identifier, value is cast to a string pointer and used to access the heap-allocated data it points to.
The type field decides how the value field of Token is handled. For example, if type says the token is an integer, I cast value to an int. If type says the token is an identifier, I cast value to a string pointer.

My question is, is it possible to do something similar in safe Rust? If not, then what would be a nice way of achieving a dynamically typed variable that can be interpreted in multiple ways based on the TokenType? Or is there a neater approach to creating a syntactical token that works well in Rust?

Thanks in advance.

Upvotes: 2

Views: 3821

Answers (3)

FZs

Reputation: 18639

In Rust, there's a better solution for this problem.

Rust's enums work a bit differently, but they are even more suited to this case: Enum variants can themselves have fields that only apply to one variant of the enum.

That way, you can store the value inside your enum:

struct Token {
    priority: u8,
    value: TokenValue,
}

enum TokenValue {
    Int(i32),
    Str(String),
    Id {
        name: String,
        id: u32,
    },
    // ... other variants
}

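This also gives type safety to the code that uses it: you can only read a variant's fields after checking which variant you have. The enum is sized to fit its largest variant plus a tag, and Rust can sometimes optimise the tag away.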

This feature is explained very well in The Rust Programming Language book.
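To illustrate how such an enum is consumed, here is a minimal sketch. The types mirror the ones above (with the other variants left out), and `describe` is a hypothetical helper added only for this example. A `match` on the value is the safe counterpart of checking the type field and then casting in C/C++.

```rust
// Same shape as the types above; derives added so values can be printed and compared.
#[derive(Debug, Clone, PartialEq)]
enum TokenValue {
    Int(i32),
    Str(String),
    Id { name: String, id: u32 },
}

#[derive(Debug, Clone, PartialEq)]
struct Token {
    priority: u8,
    value: TokenValue,
}

/// Hypothetical helper: returns a human-readable description of the payload.
/// Each arm only sees the fields that belong to its own variant.
fn describe(token: &Token) -> String {
    match &token.value {
        TokenValue::Int(n) => format!("integer {n}"),
        TokenValue::Str(s) => format!("string {s:?}"),
        TokenValue::Id { name, id } => format!("identifier {name} (#{id})"),
    }
}
```

For example, calling `describe` on `Token { priority: 1, value: TokenValue::Int(42) }` yields `integer 42`.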

Upvotes: 1

Jakub Dóka

Reputation: 2625

use std::ops::Range;

// some variants can hold parsed data
pub enum TokenKind {
    Ident,
    Op,
    Int(i64),
    Float(f64),
    String,
    Eof,
}

pub struct Source {
    pub content: String,
}

impl Source {
    // no need to allocate a new string for each token, just store the span
    // and access the slice when needed
    pub fn get_underlying_string(&self, token: &Token) -> &str {
        &self.content[token.span.clone()]
    }   
}

// we use struct because token has some common parts
pub struct Token {
    pub kind: TokenKind,
    pub span: Range<usize>,
}

fn main() {
    let token = Token {
        kind: TokenKind::Int(10),
        span: 0..1,
    };

    println!("int: {}", match token.kind {
        TokenKind::Int(i) => i,
        _ => panic!("unexpected token kind"),
    }); // prints "int: 10"
}
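The span idea above can be sketched end to end as follows. This is only an illustration under simplifying assumptions: `tokenize` and `make_token` are hypothetical helpers, tokens are separated by ASCII whitespace, and the set of kinds is reduced to `Ident` and `Int`. A real SQL lexer would need considerably more than this.

```rust
use std::ops::Range;

#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Ident,
    Int(i64),
}

pub struct Token {
    pub kind: TokenKind,
    pub span: Range<usize>,
}

pub struct Source {
    pub content: String,
}

impl Source {
    // Borrow the token's text from the source instead of storing a copy.
    pub fn get_underlying_string(&self, token: &Token) -> &str {
        &self.content[token.span.clone()]
    }
}

/// Hypothetical helper: classifies the text covered by `span`.
fn make_token(source: &Source, span: Range<usize>) -> Token {
    let kind = match source.content[span.clone()].parse::<i64>() {
        Ok(n) => TokenKind::Int(n),
        Err(_) => TokenKind::Ident,
    };
    Token { kind, span }
}

/// Hypothetical helper: splits on ASCII whitespace and records byte spans.
pub fn tokenize(source: &Source) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut start: Option<usize> = None;
    for (i, c) in source.content.char_indices() {
        if c.is_ascii_whitespace() {
            // A token ends just before this whitespace character.
            if let Some(s) = start.take() {
                tokens.push(make_token(source, s..i));
            }
        } else if start.is_none() {
            start = Some(i);
        }
    }
    // Flush a token that runs to the end of the input.
    if let Some(s) = start {
        tokens.push(make_token(source, s..source.content.len()));
    }
    tokens
}
```

With the input `select 42 name`, this produces three tokens; the second has kind `TokenKind::Int(42)` and span `7..9`, and `get_underlying_string` on the third returns `name` without allocating a new string.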

Upvotes: 2

Chayim Friedman

Reputation: 71605

In Rust, when you have a closed set of possible types, you should use an enum. Enums in Rust are not like in C: they can carry a payload (i.e. they're sum types).

enum TokenType {
    Number(i32),
    Identifier(String),
    String(String),
    Plus,
    Minus,
    // ...
}

And then match against it:

let e: TokenType = ...;
match e {
    TokenType::Number(n) => println!("number: {n}"),
    TokenType::String(s) => println!("string: {s}"),
    TokenType::Plus => println!("plus"),
    // ...
}
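As a rough sketch of how this replaces the type-then-cast pattern from the question, the two functions below are hypothetical helpers, not part of any library. `as_number` plays the role of "cast value to an int if type says integer", and `priority` shows that a match listing every variant is checked for exhaustiveness: adding a variant later makes the compiler flag the match until it is handled.

```rust
#[derive(Debug, Clone, PartialEq)]
enum TokenType {
    Number(i32),
    Identifier(String),
    String(String),
    Plus,
    Minus,
}

/// Hypothetical helper: returns the payload only if the token is a number.
fn as_number(token: &TokenType) -> Option<i32> {
    match token {
        TokenType::Number(n) => Some(*n),
        _ => None,
    }
}

/// Hypothetical helper: an assumed priority scheme where only operators bind.
/// No wildcard arm, so every variant must be listed explicitly.
fn priority(token: &TokenType) -> u8 {
    match token {
        TokenType::Plus | TokenType::Minus => 1,
        TokenType::Number(_) | TokenType::Identifier(_) | TokenType::String(_) => 0,
    }
}
```

Here `as_number(&TokenType::Number(3))` is `Some(3)`, while `as_number(&TokenType::Plus)` is `None`, so a mismatched "cast" becomes a value you have to handle rather than undefined behaviour.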

Upvotes: 4
