I'm implementing a scanner in Rust. I have a scan method on a Scanner struct which takes a string slice as the source code, breaks that string into a Vec<&str> of UTF-8 characters (using the crate unicode_segmentation), and then delegates each char to a scan_token method which determines its lexical token and returns it.
extern crate unicode_segmentation;
use unicode_segmentation::UnicodeSegmentation;

struct Scanner {
    start: usize,
    current: usize,
}

#[derive(Debug)]
struct Token<'src> {
    lexeme: &'src [&'src str],
}

impl Scanner {
    pub fn scan<'src>(&mut self, source: &'src str) -> Vec<Token<'src>> {
        let mut offset = 0;
        let mut tokens = Vec::new();
        // break up the code into UTF8 graphemes
        let chars: Vec<&str> = source.graphemes(true).collect();
        while let Some(_) = chars.get(offset) {
            // determine which token this grapheme represents
            let token = self.scan_token(&chars);
            // push it to the tokens array
            tokens.push(token);
            offset += 1;
        }
        tokens
    }

    pub fn scan_token<'src>(&mut self, chars: &'src [&'src str]) -> Token<'src> {
        // get this lexeme as some slice of the slice of chars
        let lexeme = &chars[self.start..self.current];
        let token = Token { lexeme };
        token
    }
}

fn main() {
    let mut scanner = Scanner {
        start: 0,
        current: 0,
    };
    let tokens = scanner.scan("abcd");
    println!("{:?}", tokens);
}
The error I receive is:
error[E0597]: `chars` does not live long enough
  --> src/main.rs:22:42
   |
22 |             let token = self.scan_token(&chars);
   |                                          ^^^^^ borrowed value does not live long enough
...
28 |     }
   |     - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'src as defined on the method body at 15:17...
  --> src/main.rs:15:17
   |
15 |     pub fn scan<'src>(&mut self, source: &'src str) -> Vec<Token<'src>> {
   |                 ^^^^
I suppose I understand the logic behind why this doesn't work: the error makes it clear that chars needs to live as long as lifetime 'src, because tokens contains slice references into the data inside chars.
What I don't understand is, since chars is just a slice of references into an object which does have a lifetime of 'src (namely source), why can't tokens reference this data after chars has been dropped? I'm fairly new to low-level programming and I suppose my intuition regarding references + lifetimes might be somewhat broken.
Reputation: 430634
Your problem can be reduced to this:
pub fn scan<'a>(source: &'a str) -> Option<&'a str> {
    let chars: Vec<&str> = source.split("").collect();
    scan_token(&chars)
}

pub fn scan_token<'a>(chars: &'a [&'a str]) -> Option<&'a str> {
    chars.last().cloned()
}
error[E0597]: `chars` does not live long enough
 --> src/lib.rs:3:17
  |
3 |     scan_token(&chars)
  |                 ^^^^^ borrowed value does not live long enough
4 | }
  | - borrowed value only lives until here
  |
note: borrowed value must be valid for the lifetime 'a as defined on the function body at 1:13...
 --> src/lib.rs:1:13
  |
1 | pub fn scan<'a>(source: &'a str) -> Option<&'a str> {
  |             ^^
The scan_token function requires that the reference to the slice and the references inside the slice have the same lifetime: &'a [&'a str]. Since the Vec lives for a shorter period of time, that's what the unified lifetime must be. However, the lifetime of the vector isn't long enough to return the value.
Remove the unneeded lifetime:
pub fn scan_token<'a>(chars: &[&'a str]) -> Option<&'a str>
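As a minimal sketch of that fix in context, here is the reduced example with the slice borrow left to its own (elided) lifetime. It swaps split("") for split_whitespace, purely so the result is easier to read; that substitution is an assumption of this sketch, not part of the original code.

```rust
pub fn scan<'a>(source: &'a str) -> Option<&'a str> {
    // The Vec only lives until the end of this function...
    let words: Vec<&str> = source.split_whitespace().collect();
    // ...but the returned &str borrows from `source`, not from the Vec.
    scan_token(&words)
}

// The outer slice borrow has its own anonymous lifetime; only the
// string data inside it is tied to 'a.
pub fn scan_token<'a>(chars: &[&'a str]) -> Option<&'a str> {
    chars.last().cloned()
}

fn main() {
    let source = String::from("let x = 42");
    println!("{:?}", scan(&source));
}
```

The returned value is a copy of one of the &'a str elements, so it stays valid after the temporary Vec has been dropped.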
Applying these changes to your complete code, you see the core problem is repeated in the definition of Token:
struct Token<'src> {
    lexeme: &'src [&'src str],
}
This construction makes it impossible for your code to compile as-is: there is no vector of slices that lives as long as the slices themselves. The Vec is created inside scan and dropped when scan returns, while the string data it points into lives for the whole of 'src.
You could pass in a mutable reference to a Vec to use as storage, but this would be pretty unusual and has plenty of downsides you'd hit when you try to do anything larger:
impl Scanner {
    pub fn scan<'src>(&mut self, source: &'src str, chars: &'src mut Vec<&'src str>) -> Vec<Token<'src>> {
        // ...
        chars.extend(source.graphemes(true));
        // ...
        while let Some(_) = chars.get(offset) {
            // ...
            let token = self.scan_token(chars);
            // ...
        }
        // ...
    }
    // ...
}

fn main() {
    // ...
    let mut chars = Vec::new();
    let tokens = scanner.scan("abcd", &mut chars);
    // ...
}
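For reference, a self-contained sketch of this storage-passing variant might look like the following. Two things here are assumptions of the sketch rather than your code: char_indices stands in for graphemes(true) so that no external crate is needed, and the rule that every character becomes its own token is hypothetical.

```rust
#[derive(Debug)]
struct Token<'src> {
    lexeme: &'src [&'src str],
}

struct Scanner {
    start: usize,
    current: usize,
}

impl Scanner {
    fn scan<'src>(
        &mut self,
        source: &'src str,
        chars: &'src mut Vec<&'src str>,
    ) -> Vec<Token<'src>> {
        // Stand-in for `source.graphemes(true)`: one &str per char.
        chars.extend(
            source
                .char_indices()
                .map(|(i, c)| &source[i..i + c.len_utf8()]),
        );
        // Give up mutable access; the storage is now shared for all of 'src.
        let chars: &'src [&'src str] = chars;

        let mut tokens = Vec::new();
        while self.current < chars.len() {
            // Hypothetical rule: each char is a one-element lexeme.
            self.start = self.current;
            self.current += 1;
            tokens.push(self.scan_token(chars));
        }
        tokens
    }

    fn scan_token<'src>(&self, chars: &'src [&'src str]) -> Token<'src> {
        Token { lexeme: &chars[self.start..self.current] }
    }
}

fn main() {
    let mut scanner = Scanner { start: 0, current: 0 };
    let mut chars = Vec::new();
    let tokens = scanner.scan("abcd", &mut chars);
    println!("{:?}", tokens);
}
```

One of the downsides shows up immediately: the Vec stays borrowed for the whole of 'src, so the caller cannot touch chars again while the tokens are alive.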
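You probably just want Token to hold a Vec<&'src str> instead, so that each token owns its list of pieces while the pieces themselves still borrow from source.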
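A rough sketch of that shape, under the same assumptions as before (char_indices as a crate-free stand-in for graphemes(true), and a hypothetical one-character-per-token rule that advances start and current):

```rust
#[derive(Debug, PartialEq)]
struct Token<'src> {
    // The Vec is owned by the token; each element borrows from the source.
    lexeme: Vec<&'src str>,
}

struct Scanner {
    start: usize,
    current: usize,
}

impl Scanner {
    fn scan<'src>(&mut self, source: &'src str) -> Vec<Token<'src>> {
        // Stand-in for `source.graphemes(true)`: one &str per char.
        let chars: Vec<&'src str> = source
            .char_indices()
            .map(|(i, c)| &source[i..i + c.len_utf8()])
            .collect();

        let mut tokens = Vec::new();
        while self.current < chars.len() {
            // Hypothetical rule: each char is a one-element lexeme.
            self.start = self.current;
            self.current += 1;
            tokens.push(self.scan_token(&chars));
        }
        // `chars` is dropped here; the tokens copied the &'src str values out.
        tokens
    }

    // Only the string data is tied to 'src, not the borrow of the slice.
    fn scan_token<'src>(&self, chars: &[&'src str]) -> Token<'src> {
        Token { lexeme: chars[self.start..self.current].to_vec() }
    }
}

fn main() {
    let mut scanner = Scanner { start: 0, current: 0 };
    let tokens = scanner.scan("abcd");
    println!("{:?}", tokens);
}
```

Copying a &str only copies a pointer and a length, so to_vec here does not duplicate any of the source text.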