Reputation: 2702
I have the following code in Rust. I know that I am not supposed to return references to local variables, and in this case I am not. The string to split is passed as a &str reference and, after determining the split boundary, I am returning &s[0..idx], where idx is the end of the boundary. I was confident that this would not result in a "dangling" reference error. However, it turns out I was wrong!
fn demo4() {
    let mut s = String::from("Elijah Wood");
    let firstname = str_split(&s, &String::from(" "));
    println!("First name of actor: {}", firstname);
}
// can handle both &str and &String
fn str_split(s: &str, pat: &str) -> &str {
    let bytes = s.as_bytes();
    let b_pat = pat.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b_pat {
            return &s[0..i];
        }
    }
    &s[..]
}
fn main() {
    demo4();
}
I am getting the following error:
error[E0106]: missing lifetime specifier
 --> src/main.rs:7:37
  |
7 | fn str_split(s: &str, pat: &str) -> &str {
  |                                     ^ expected lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `s` or `pat`
Any explanation is greatly appreciated.
Upvotes: 6
Views: 4749
Reputation: 27895
The error message tells you what's wrong, although not how to fix it:
= help: this function's return type contains a borrowed value, but the
signature does not say whether it is borrowed from `s` or `pat`
The compiler uses lifetimes to determine whether code is safe or not. Part of that is knowing what each reference could be borrowing from. The signature:
fn str_split(s: &str, pat: &str) -> &str
does not indicate whether str_split returns a reference into s or a reference into pat, so Rust can't tell how to check the validity of the reference. (See also this question for a version of this where the function has no reference arguments at all.)
To fix this, you need to introduce a lifetime parameter:
fn str_split<'a>(s: &'a str, pat: &str) -> &'a str
This says, roughly, "If you borrow a string for some lifetime 'a, you can call str_split on it (and another string) and get back a reference also valid for lifetime 'a." pat is not annotated with 'a, because the result does not borrow from pat, only from s.
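As a rough sketch of what this annotation means for a caller: because only s carries 'a, the pattern can be a short-lived value that is dropped before the result is used. The find-based body and the helper name first_name are assumptions made for illustration, not the code from the question.

```rust
// Only `s` is tied to the returned reference; `pat` has its own, unrelated lifetime.
fn str_split<'a>(s: &'a str, pat: &str) -> &'a str {
    // `find` returns the byte index of the first match, if any.
    match s.find(pat) {
        Some(idx) => &s[..idx],
        None => s,
    }
}

// Hypothetical helper, for illustration only.
fn first_name(full: &str) -> &str {
    let first;
    {
        // The pattern lives only inside this block.
        let pat = String::from(" ");
        first = str_split(full, &pat);
    } // `pat` is dropped here, but `first` borrows only from `full`.
    first
}
```

If pat were also annotated with 'a, the borrow checker would reject first_name, since the result would then be assumed to possibly borrow from the dropped pattern.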
The Rust Programming Language has a chapter on lifetimes that addresses this very issue and I would strongly recommend you read it; Rust's lifetimes go beyond merely preventing dangling pointers.
Although not part of the question, the body of this function is a one-liner. Unless this is purely a learning exercise, don't do more work than you have to:
fn str_split<'a>(s: &'a str, pat: &str) -> &'a str {
    s.split(pat).next().unwrap_or(s)
}
Upvotes: 12
Reputation: 2702
Thanks to everyone for explaining the error and the reasons behind it. I have fixed the code and made some changes, which I would like to explain. First, thanks to @trentcl for noting that the pattern matching was semantically wrong: the search compared individual bytes of the string with the pattern instead of matching the pattern as a whole. This prompted me to change the function to return only the first word, by splitting on the first occurrence of the space character ' '.
The function signature also needed a lifetime parameter for the code to compile. The working code is presented below:
// 4 Demo with string splitting
fn demo4() {
    let s = String::from("Elijah Wood");
    let firstname = str_split(&s);
    println!("First name of actor: {}", firstname);
}
// splits a string at first space
fn str_split<'a>(s: &'a str) -> &'a str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}
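A side note, sketched under the assumption that the function keeps a single reference parameter: in that case the lifetime elision rules supply the annotation, so writing 'a is optional. The name str_split_elided is hypothetical and only serves to show the elided signature next to the explicit one.

```rust
// Same logic as the single-argument str_split, with the lifetime left
// for the compiler to fill in (one input reference, so no ambiguity).
fn str_split_elided(s: &str) -> &str {
    for (i, &item) in s.as_bytes().iter().enumerate() {
        // An ASCII space never occurs inside a multi-byte UTF-8 sequence,
        // so slicing at this index is always on a character boundary.
        if item == b' ' {
            return &s[..i];
        }
    }
    s
}

// Hypothetical caller, mirroring demo4.
fn demo_elided() -> String {
    let s = String::from("Elijah Wood");
    str_split_elided(&s).to_string()
}
```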
Upvotes: 0
Reputation: 65702
&str is a shorthand for &'a str, where 'a is some lifetime parameter that needs to be declared beforehand. In some simple cases, it's possible to omit these lifetime parameters and the compiler will expand them for you. However, there are some cases where you need to declare the lifetimes explicitly.
From The Rust Programming Language, Second Edition (emphasis mine), here are the rules regarding omitted lifetime parameters:
Each parameter that is a reference gets its own lifetime parameter. In other words, a function with one parameter gets one lifetime parameter: fn foo<'a>(x: &'a i32), a function with two arguments gets two separate lifetime parameters: fn foo<'a, 'b>(x: &'a i32, y: &'b i32), and so on.
If there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters: fn foo<'a>(x: &'a i32) -> &'a i32.
If there are multiple input lifetime parameters, but one of them is &self or &mut self because this is a method, then the lifetime of self is assigned to all output lifetime parameters. This makes writing methods much nicer.
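The three rules can be sketched in code as follows. All names here (first_word, Actor, first_name, longer) are hypothetical and chosen only to illustrate which signatures the compiler can complete on its own and which it cannot.

```rust
// Rules 1 and 2: a single reference parameter, so its lifetime is reused
// for the output. Equivalent to fn first_word<'a>(s: &'a str) -> &'a str.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or(s)
}

struct Actor {
    name: String,
}

impl Actor {
    // Rule 3: two input references, but one is `&self`, so the output
    // gets the lifetime of `self` and `sep` stays unrelated.
    fn first_name(&self, sep: &str) -> &str {
        self.name.split(sep).next().unwrap_or(self.name.as_str())
    }
}

// No rule applies: two reference parameters and no `self`, so the
// signature must say what the output may borrow from (here: either input).
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}
```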
The problem with your function is that it has two input lifetime parameters and neither of them is self, so none of these rules assigns the output lifetime and the compiler will not choose one for you. You have to write your function like this:
fn str_split<'a>(s: &'a str, pat: &str) -> &'a str {
    s
}
If this syntax is new to you, make sure you read the chapter on lifetimes.
Why can't the compiler just figure it out by itself? Because Rust has a principle that the signature of a function should not change because of a change in its implementation. It simplifies the compiler (it doesn't have to deal with interdependent functions whose signatures have not been fully determined) and it also simplifies the maintenance of your own code. For example, if you were to change the implementation of your function like so:
fn str_split(s: &str, pat: &str) -> &str {
    pat
}
then the output's lifetime parameter would have to be linked to pat's lifetime parameter. In a library, this is a breaking change; you don't want breaking changes to slip by without you noticing!
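To make the breaking change concrete, here is a minimal sketch of the pat-returning variant with its annotation written out. The names pick_pat and demo_pick_pat are hypothetical. With this signature the caller's obligations flip: the pattern must now outlive the result, while s may be dropped early.

```rust
// The output is now tied to `pat` instead of `s`.
fn pick_pat<'a>(_s: &str, pat: &'a str) -> &'a str {
    pat
}

// Hypothetical caller, for illustration only.
fn demo_pick_pat() -> String {
    let pat = String::from(" ");
    let result;
    {
        let s = String::from("Elijah Wood");
        result = pick_pat(&s, &pat);
    } // `s` is dropped here; that is fine, since `result` borrows from `pat`.
    result.to_string()
}
```

A caller written against the s-returning signature, one that passes a temporary pattern and keeps the result, would stop compiling after such a change, which is why the relationship has to be spelled out in the signature rather than inferred from the body.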
Upvotes: 7