Reputation: 1054
I am reading The Rust Programming Language. In this code, I don't understand why it has to be args[1].clone() and why it can't be &args[1]:
use std::env;
use std::fs;
fn main() {
    let args: Vec<String> = env::args().collect();
    let config = parse_config(&args);
    println!("Searching for {}", config.query);
    println!("In file {}", config.filename);
    let contents = fs::read_to_string(config.filename)
        .expect("Something went wrong reading the file");
    println!("With text:\n{}", contents);
}
struct Config {
    query: String,
    filename: String,
}
fn parse_config(args: &[String]) -> Config {
    let query = args[1].clone();
    let filename = args[2].clone();
    Config { query, filename }
}
The book explains it, but I still don't understand. It says something about the struct taking ownership.
Is this the same as the code above? This is what the compiler told me to do when I changed args[1].clone() to &args[1]:
fn parse_config(args: &[String]) -> Config {
    let query = &args[1];
    let filename = &args[2];
    Config { query: query.to_string(), filename: filename.to_string() }
}
Upvotes: 1
Views: 2094
Reputation: 822
Since you mentioned "struct taking ownership or something", I'm going to deviate a little from your question to clarify things a bit.
The way Rust deals with memory safety without resorting to garbage collection is by using the concepts of ownership and borrowing. These concepts take time to internalize, but some aspects of them can be seen in simple examples:
struct Bike {
    brand: String,
    tire: f32,
}
impl Bike {
    fn ride_bike(&self) {
        println!("Goin' where the weather suits my clothes!")
    }
}
fn main() {
    // You are the owner of the bike
    let rafaels_bike = Bike {
        brand: String::from("Bike Friday"),
        tire: 28.0,
    };
    // You are giving your bike to me
    let eduardos_bike = rafaels_bike;
    // Since you gave the bike to me, you cannot ride it anymore
    // rafaels_bike.ride_bike(); // If uncommented: error[E0382]: borrow of moved value
    // The bike is mine now, so I can ride it.
    eduardos_bike.ride_bike();
}
Rust does things this way because it needs to know when to clean up memory. To do so, it looks at the owner's scope: when the owner goes out of scope, it's time to clean up its resources. In this case the Bike struct owns a String, which allocates memory on the heap, so that memory needs to be freed.
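To see the cleanup happen, here is a minimal sketch that implements Drop for a cut-down Bike. The DROPPED counter and the drops_around_scope function are hypothetical names introduced only to observe the moment of the drop; they are not part of the code above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter, used only to observe when a Bike is dropped.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

struct Bike {
    brand: String,
}

impl Drop for Bike {
    fn drop(&mut self) {
        // Called automatically when the owner goes out of scope.
        println!("Cleaning up {}", self.brand);
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns the number of drops seen inside the inner scope and after it.
fn drops_around_scope() -> (usize, usize) {
    let start = DROPPED.load(Ordering::SeqCst);
    let during;
    {
        let _bike = Bike { brand: String::from("Bike Friday") };
        // The owner is still in scope, so nothing has been dropped yet.
        during = DROPPED.load(Ordering::SeqCst) - start;
    } // `_bike` goes out of scope here and `drop` runs
    let after = DROPPED.load(Ordering::SeqCst) - start;
    (during, after)
}

fn main() {
    let (during, after) = drops_around_scope();
    println!("drops inside the scope: {}, after the scope: {}", during, after);
}
```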
This concept of ownership permeates every operation in Rust, including struct and function declarations, expressions, etc. As a result, the way you declare functions can give information to the caller about what you want to do with the data.
// Immutable borrow: You're giving me permission to use your data (Bike),
// but I can't change it
fn have_a_look(bike: &Bike)
// Mutable borrow: You're giving me permission to modify your data (Bike)
fn install_modifications(bike: &mut Bike)
// Moving operation: You're giving the bike away, you cannot use it anymore
// after you call the function
fn give_to_charity(bike: Bike) // Bike is still available inside the function
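As a rough sketch, the three signatures can be given bodies and called in order. The bodies and return values here are assumptions made up for illustration; only the parameter types come from the signatures above.

```rust
struct Bike {
    brand: String,
    tire: f32,
}

// Immutable borrow: the function can read the bike but not change it.
fn have_a_look(bike: &Bike) -> String {
    format!("{} with {} inch tires", bike.brand, bike.tire)
}

// Mutable borrow: the function can modify the caller's bike in place.
fn install_modifications(bike: &mut Bike) {
    bike.tire = 26.0;
}

// Move: the function takes ownership, and the bike is dropped when it returns.
fn give_to_charity(bike: Bike) -> String {
    format!("Donated the {}", bike.brand)
}

fn main() {
    let mut bike = Bike {
        brand: String::from("Bike Friday"),
        tire: 28.0,
    };
    println!("{}", have_a_look(&bike));
    install_modifications(&mut bike);
    println!("{}", have_a_look(&bike));
    println!("{}", give_to_charity(bike));
    // have_a_look(&bike); // error[E0382]: borrow of moved value: `bike`
}
```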
This also applies to struct declarations: if you declare a field without an & in front of its type, the struct is saying that it wants to own the data**. So the Bike struct is saying that it wants to own the brand String data. Well, the bike analogy kind of falls apart here, so let's get back to your question.
** There's a small caveat here for types that implement the Copy trait. A value of a Copy type is copied instead of moved when it is assigned or passed by value, so the original stays usable and you usually don't need to bother with putting an & before the type. This is the case for f32 and lots of other basic types.
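A small sketch of the difference, using the same two field types as Bike. The function name copy_and_move is made up for this example.

```rust
// Returns both tire values and the moved brand to show what is still usable.
fn copy_and_move() -> (f32, f32, String) {
    let tire: f32 = 28.0;
    let same_tire = tire; // f32 is Copy: `tire` is copied and stays valid

    let brand = String::from("Bike Friday");
    let moved_brand = brand; // String is not Copy: `brand` is moved
    // println!("{}", brand); // error[E0382]: borrow of moved value: `brand`

    (tire, same_tire, moved_brand)
}

fn main() {
    let (tire, same_tire, brand) = copy_and_move();
    println!("{} {} {}", tire, same_tire, brand);
}
```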
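To understand why you can't do what you want to do (use &args[1] instead of args[1].clone()), we need to keep in mind the rules of ownership in Rust, as stated in The Rust Programming Language:
1. Each value in Rust has a variable that's called its owner.
2. There can only be one owner at a time.
3. When the owner goes out of scope, the value will be dropped.
With this in mind, let's look at the parse_config function as it would be if it were implemented the way you want: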
// The Config struct wants to own both the query and filename strings
struct Config {
    query: String,
    filename: String,
}
// You're saying that you want to look at the args data and nothing else
fn parse_config(args: &[String]) -> Config {
    let query = &args[1];
    let filename = &args[2];
    // Then you're trying to give it away to Config behind the caller's back
    // You broke the contract
    Config { query, filename }
}
Essentially, if this was allowed, the Strings in args would have two owners, the args and the config variables, breaking rule n° 2 of ownership.
Let's suppose Rust wasn't so strict and rule n° 2 didn't exist. Under this hypothesis, the following would be possible:
// This is a world without rule n° 2, this code will not compile.
use std::env;
use std::fs;
fn main() {
    // args own the String inside the Vec
    let args: Vec<String> = env::args().collect();
    let contents = {
        // config will also own the Strings at index 1 and 2
        let config = parse_config(&args);
        println!("Searching for {}", config.query);
        println!("In file {}", config.filename);
        fs::read_to_string(config.filename)
            .expect("Something went wrong reading the file")
        // config goes out of scope, cleaning the memory for strings it owns
    };
    // The strings at index 1 and 2 don't exist anymore
    // This is known as a dangling pointer
    println!("{:?}", args);
    println!("With text:\n{}", contents);
    // End of scope: Will try to clean args memory, but some of it was already cleaned
    // This is known as a double-free
}
struct Config {
    query: String,
    filename: String,
}
fn parse_config(args: &[String]) -> Config {
    let query = &args[1];
    let filename = &args[2];
    // Give the strings to Config anyway
    Config { query, filename }
}
As you see, everything would break and Rust would not be able to guarantee memory safety anymore.
The solution the book offers is to clone the strings, meaning that you allocate an entirely new block of memory that's different from the one args owns:
let query = args[1].clone();
let filename = args[2].clone();
Now both config and args are OK, since they own entirely different blocks of memory. When they go out of scope, each one cleans up only what it owns.
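Here is a small check of that claim, as a sketch: the cloned strings live in different heap buffers, so dropping config leaves args intact. The argument values are hypothetical stand-ins for env::args().

```rust
struct Config {
    query: String,
    filename: String,
}

fn parse_config(args: &[String]) -> Config {
    let query = args[1].clone();
    let filename = args[2].clone();
    Config { query, filename }
}

fn main() {
    // Hypothetical arguments standing in for env::args().
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("poem.txt"),
    ];
    let config = parse_config(&args);
    println!("Searching for {} in {}", config.query, config.filename);

    // The clone has its own heap buffer, separate from the one args owns.
    assert_ne!(args[1].as_ptr(), config.query.as_ptr());

    // Dropping config frees only the clones; args is still fully usable.
    drop(config);
    println!("{:?}", args);
}
```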
This one was already mentioned: you can move the args variable from main into the function parameter. Since the function now owns the data, it can do whatever it wants with it, including giving ownership to another struct.
fn parse_config(mut args: Vec<String>) -> Config
This works because the main function doesn't use the args variable after parse_config is called. You still can't use indexing to take the strings out, though: indexing goes through the Index trait, whose index method takes &self and returns a reference, and you cannot move a value out from behind a reference.
From the Index trait documentation:
fn index(&self, index: Idx) -> &Self::Output
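One way to get the strings out of an owned Vec without indexing is to consume it with into_iter(), which yields each String by value. This is only a sketch of one option, and the expect messages are made up. The pointer comparison in main shows that the String is moved, not copied.

```rust
struct Config {
    query: String,
    filename: String,
}

// Takes ownership of the Vec, so the Strings can be moved into Config.
fn parse_config(args: Vec<String>) -> Config {
    // into_iter() consumes the Vec and hands out owned Strings.
    let mut args = args.into_iter();
    args.next(); // skip the program name at index 0
    let query = args.next().expect("missing query argument");
    let filename = args.next().expect("missing filename argument");
    Config { query, filename }
}

fn main() {
    // Hypothetical arguments standing in for env::args().collect().
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("poem.txt"),
    ];
    let query_buffer = args[1].as_ptr();
    let config = parse_config(args);
    // args was moved into parse_config and cannot be used here anymore.

    // Same heap buffer as before: the String changed owner without a clone.
    assert_eq!(config.query.as_ptr(), query_buffer);
    println!("Searching for {} in {}", config.query, config.filename);
}
```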
You arrived at this one yourself and yes, it's functionally equivalent to the one provided by the book.
The implementation of to_string for String is as follows:
// to_string is implemented by means of to_owned
#[stable(feature = "string_to_string_specialization", since = "1.17.0")]
impl ToString for String {
    #[inline]
    fn to_string(&self) -> String {
        self.to_owned()
    }
}
// to_owned uses clone when the type implements the Clone trait
#[stable(feature = "rust1", since = "1.0.0")]
impl<T> ToOwned for T
where
    T: Clone,
{
    type Owned = T;
    fn to_owned(&self) -> T {
        self.clone()
    }
    fn clone_into(&self, target: &mut T) {
        target.clone_from(self);
    }
}
// And String implements Clone
#[stable(feature = "rust1", since = "1.0.0")]
impl Clone for String {
    fn clone(&self) -> Self {
        String { vec: self.vec.clone() }
    }
    fn clone_from(&mut self, source: &Self) {
        self.vec.clone_from(&source.vec);
    }
}
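A quick way to convince yourself of the equivalence is to call all three methods on the same &String and compare the results. The helper name three_copies is made up for this sketch.

```rust
// Produces a new String three different ways from the same reference.
fn three_copies(original: &String) -> (String, String, String) {
    (original.clone(), original.to_string(), original.to_owned())
}

fn main() {
    let original = String::from("needle");
    let (cloned, stringed, owned) = three_copies(&original);

    // All three have the same contents as the original...
    assert_eq!(cloned, original);
    assert_eq!(stringed, original);
    assert_eq!(owned, original);

    // ...and each one is a separate allocation with its own owner.
    assert_ne!(cloned.as_ptr(), original.as_ptr());
    assert_ne!(stringed.as_ptr(), original.as_ptr());
    assert_ne!(owned.as_ptr(), original.as_ptr());

    println!("{} {} {}", cloned, stringed, owned);
}
```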
You can also solve this by using lifetimes in the Config struct, like this:
struct Config<'a> {
    query: &'a String,
    filename: &'a String,
}
fn parse_config(args: &[String]) -> Config {
    let query = &args[1];
    let filename = &args[2];
    // Now config is not trying to own the Strings
    // So you can pass your references to it
    Config { query, filename }
}
This allows you to implement the code the way you want to, but it ties the lifetime of the Config result to the lifetime of args, which in this case is the main function scope.
In other words, you can only access config.filename and config.query while the initial owner (args in main) is still in scope.
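The sketch below writes the lifetime out explicitly (the version above relies on lifetime elision) so the tie between args and the returned Config is visible in the signature. The argument values are hypothetical, and the commented-out part shows the kind of scope that the compiler rejects.

```rust
struct Config<'a> {
    query: &'a String,
    filename: &'a String,
}

// The returned Config borrows from args, so it cannot outlive args.
fn parse_config<'a>(args: &'a [String]) -> Config<'a> {
    Config {
        query: &args[1],
        filename: &args[2],
    }
}

fn main() {
    // Hypothetical arguments standing in for env::args().collect().
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("poem.txt"),
    ];
    let config = parse_config(&args);

    // config.query points at the very String that args owns; nothing is copied.
    assert!(std::ptr::eq(config.query, &args[1]));
    println!("Searching for {} in {}", config.query, config.filename);

    // This variant does not compile, because args is dropped before config is used:
    //
    // let config;
    // {
    //     let args = vec![String::from("minigrep"), String::from("needle"), String::from("poem.txt")];
    //     config = parse_config(&args);
    // } // error[E0597]: `args` does not live long enough
    // println!("{}", config.query);
}
```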
I've made a playground with a working example and another with a different scope for args, so you can see how the lifetime affects the return value.
I'm not a Rust pro and I made some simplifications in order to make the explanation straightforward. If I said anything wrong, hopefully a more experienced rustacean will correct me.
Upvotes: 9
Reputation: 342
std::ops::Index returns a reference to the element in the container (in this case, a String in a slice), so you cannot move the String out by indexing.
You have several options for getting working code. The best of them is to rewrite parse_config to take a Vec.
fn parse_config(mut args: Vec<String>) -> Config {
    let filename = args.remove(2);
    let query = args.remove(1);
    Config { query, filename }
}
Upvotes: 1