Reputation: 1638
Disclaimer: I'm a beginner in Rust and this is my first project. I'm trying to write a SQL compiler in Rust. I need to create a Token struct that represents a syntactic unit of a SQL query. In C/C++ I would write a structure like this:
typedef unsigned long Value;

// Represents the type of the token (e.g. number, string, identifier...)
enum class TokenType { ... };

struct Token {
    TokenType type;
    unsigned char priority;
    Value value;
};
Using this approach, I use the Value type to store any kind of value. If it's a number, value is interpreted as an integer, a float, and so on, whereas when value represents a string or an identifier, it is cast to a string pointer and then used to access the heap-allocated data it points to.
To decide how to handle the value field of Token, I use the type field. For example, if type tells me the token is an integer, I cast value to an int. If type tells me the token is an identifier, I cast value to a string pointer.
My question is: is it possible to do something similar in safe Rust? If not, what would be a nice way of achieving a dynamically typed variable that can be interpreted in multiple ways based on the TokenType? Or is there a neater approach to creating a syntactic token that works well in Rust?
Thanks in advance.
Upvotes: 2
Views: 3821
Reputation: 18639
In Rust, there's a better solution for this problem.
Rust's enums work a bit differently, but they are even better suited to this case: enum variants can themselves have fields that apply only to that variant.
That way, you can store the value inside your enum:
struct Token {
    priority: u8,
    value: TokenValue,
}

enum TokenValue {
    Int(i32),
    Str(String),
    Id {
        name: String,
        id: u32,
    },
    // ... other variants
}
This also gives type safety to the code that uses it, and the compiler picks the memory layout for you: the enum is made large enough to hold its largest variant, plus whatever it needs to tell the variants apart.
This feature is explained very well in The Rust Programming Language book.
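To make the "value inside the enum" idea concrete, here is a minimal sketch of how code might read such a token back out. It reuses the Token and TokenValue shapes from above; the describe helper and the sample values are my own illustration, not part of the original design. The match on the variant replaces the cast-based-on-type step from the C version.

```rust
// Same shapes as the Token / TokenValue definitions above.
#[derive(Debug, Clone, PartialEq)]
enum TokenValue {
    Int(i32),
    Str(String),
    Id { name: String, id: u32 },
}

#[derive(Debug, Clone, PartialEq)]
struct Token {
    priority: u8,
    value: TokenValue,
}

// Hypothetical helper: each arm only sees the fields of its own variant,
// so there is no way to read an Int as a string by mistake.
fn describe(token: &Token) -> String {
    match &token.value {
        TokenValue::Int(n) => format!("int {n}"),
        TokenValue::Str(s) => format!("string {s:?}"),
        TokenValue::Id { name, id } => format!("identifier {name} (#{id})"),
    }
}

fn main() {
    let tokens = vec![
        Token { priority: 0, value: TokenValue::Int(42) },
        Token { priority: 0, value: TokenValue::Str("abc".to_string()) },
        Token { priority: 1, value: TokenValue::Id { name: "users".to_string(), id: 7 } },
    ];
    for token in &tokens {
        println!("{} (priority {})", describe(token), token.priority);
    }
}
```

If a new variant is added to TokenValue later, the match in describe stops compiling until the new case is handled.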
Upvotes: 1
Reputation: 2625
use std::ops::Range;

// some variants can hold parsed data
pub enum TokenKind {
    Ident,
    Op,
    Int(i64),
    Float(f64),
    String,
    Eof,
}

pub struct Source {
    pub content: String,
}

impl Source {
    // no need to allocate a new string for each token, just store the span
    // and access the slice when needed
    pub fn get_underlying_string(&self, token: &Token) -> &str {
        &self.content[token.span.clone()]
    }
}

// we use struct because token has some common parts
pub struct Token {
    pub kind: TokenKind,
    pub span: Range<usize>,
}

fn main() {
    let token = Token {
        kind: TokenKind::Int(10),
        span: 0..1,
    };
    println!("int: {}", match token.kind {
        TokenKind::Int(i) => i,
        _ => panic!("unexpected token kind"),
    }); // prints "int: 10"
}
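As a rough sketch of where the spans could come from, the code below fills them in while scanning the input and slices the source text only when it is needed. The lex function is a hypothetical helper that just splits on whitespace and treats anything that parses as an i64 as an Int; a real SQL lexer would need proper rules for operators, string literals and so on. Only two of the kinds are kept to keep it short.

```rust
use std::ops::Range;

#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Ident,
    Int(i64),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub kind: TokenKind,
    pub span: Range<usize>,
}

// Hypothetical helper: splits on whitespace and records the byte range of
// each word instead of copying it into a new String.
pub fn lex(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut start: Option<usize> = None;
    // A trailing sentinel space makes sure the last word is flushed too.
    let chars = src.char_indices().chain(std::iter::once((src.len(), ' ')));
    for (i, c) in chars {
        match (c.is_whitespace(), start) {
            // First character of a new word.
            (false, None) => start = Some(i),
            // Whitespace right after a word: emit a token for it.
            (true, Some(s)) => {
                let kind = match src[s..i].parse::<i64>() {
                    Ok(n) => TokenKind::Int(n),
                    Err(_) => TokenKind::Ident,
                };
                tokens.push(Token { kind, span: s..i });
                start = None;
            }
            // Inside a word, or in a run of whitespace: nothing to do.
            _ => {}
        }
    }
    tokens
}

fn main() {
    let src = "limit 10";
    for token in lex(src) {
        // Slice the original text on demand using the stored span.
        let text = &src[token.span.clone()];
        println!("{:?} {:?}", token.kind, text);
    }
}
```

Parsed numbers live in the variant, while identifier text stays in the source string and is only borrowed through the span.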
Upvotes: 2
Reputation: 71605
In Rust, when you have a closed set of possible types, you should use an enum. Enums in Rust are not like in C: they can carry a payload (i.e. they're sum types).
enum TokenType {
    Number(i32),
    Identifier(String),
    String(String),
    Plus,
    Minus,
    // ...
}
And then match against it:
let e: TokenType = ...;
match e {
    TokenType::Number(n) => println!("number: {n}"),
    TokenType::String(s) => println!("string: {s}"),
    TokenType::Plus => println!("plus"),
    // ...
}
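For completeness, here is a self-contained version of the same idea that compiles as-is. It assumes the enum has only the five variants listed (the elided ones are left out), and the render function is a hypothetical name I picked so the result can be returned instead of printed directly.

```rust
#[derive(Debug, Clone, PartialEq)]
enum TokenType {
    Number(i32),
    Identifier(String),
    String(String),
    Plus,
    Minus,
}

// Hypothetical helper: the match must cover every variant, otherwise the
// compiler rejects it, so no token kind can be silently forgotten.
fn render(token: &TokenType) -> String {
    match token {
        TokenType::Number(n) => format!("number: {n}"),
        TokenType::Identifier(name) => format!("identifier: {name}"),
        TokenType::String(s) => format!("string: {s}"),
        TokenType::Plus => "plus".to_string(),
        TokenType::Minus => "minus".to_string(),
    }
}

fn main() {
    let tokens = vec![
        TokenType::Identifier("price".to_string()),
        TokenType::Plus,
        TokenType::Number(1),
    ];
    for token in &tokens {
        println!("{}", render(token));
    }
}
```

Matching on a reference (&TokenType) borrows the payload, so the token can still be used afterwards; matching on the value itself, as in the snippet above, moves the String out of it.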
Upvotes: 4