oberblastmeister

Reputation: 1054

I don't understand why the variable has to be cloned and not referenced

I am reading The Rust Programming Language. In this code, I don't understand why it has to be args[1].clone() and why it can't be &args[1]:

use std::env;
use std::fs;

fn main() {
    let args: Vec<String> = env::args().collect();

    let config = parse_config(&args);

    println!("Searching for {}", config.query);
    println!("In file {}", config.filename);

    let contents = fs::read_to_string(config.filename)
        .expect("Something went wrong reading the file");

    println!("With text:\n{}", contents);
}

struct Config {
    query: String,
    filename: String,
}

fn parse_config(args: &[String]) -> Config {
    let query = args[1].clone();
    let filename = args[2].clone();

    Config { query, filename }
}

The book explains it but I still don't understand. It says something about the struct taking ownership.

Is this the same as the code above? This is what the compiler said to do when I changed args[1].clone to &args[1]

fn parse_config(args: &[String]) -> Config {
    let query = &args[1];
    let filename = &args[2];

    Config { query: query.to_string(), filename: filename.to_string() }
}

Upvotes: 1

Views: 2094

Answers (2)

Eduardo macedo

Reputation: 822

Since you mentioned "struct taking ownership or something", I'm going to deviate a little from your question to clarify things a bit.

A simplified overview of ownership

The way Rust deals with memory safety without resorting to garbage collection is through the concepts of ownership and borrowing. These concepts take time to internalize, but some aspects of them can be seen in simple examples:

struct Bike {
   brand: String,
   tire: f32,
}

impl Bike {
   fn ride_bike(&self) {
       println!("Goin' where the weather suits my clothes!")
   }
}

fn main() {
   // You are the owner of the bike
   let rafaels_bike = Bike {
      brand: String::from("Bike Friday"),
      tire: 28.0,
   };

   // You are giving your bike to me
   let eduardos_bike = rafaels_bike;

   // Since you gave the bike to me, you cannot ride it anymore
   // rafaels_bike.ride_bike();   // It will give a "borrow of moved value" error, if uncommented

   // The bike is mine now, so I can ride it.
   eduardos_bike.ride_bike();
}

Rust does things this way because it needs to know when to clean up memory. To do so, it looks at the owner's scope: when the owner goes out of scope, it's time to clean up its resources. In this case the Bike struct owns a String, which allocates memory on the heap, so it needs to be cleaned up.
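To see the "cleaned up when the owner goes out of scope" part in action, here is a minimal sketch. The Noisy type and the drop_order function are made up for illustration (they are not from the book); Noisy just records its name in a log when Rust drops it.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: writes its name to a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which the two values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
            // `_inner` goes out of scope at the next brace and is dropped first.
        }
        // `_outer` goes out of scope at the next brace.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

Each value is dropped exactly once, at the end of the scope of whoever owns it at that moment.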

This concept of ownership permeates every operation in Rust, including struct and function declarations, expressions, etc. As a result, the way you declare functions can give information to the caller about what you want to do with the data.

// Immutable borrow: You're giving me permission to use your data (Bike), 
// but I can't change it
fn have_a_look(bike: &Bike) 

// Mutable borrow: You're giving me permission to modify your data (Bike)
fn install_modifications(bike: &mut Bike) 

// Moving operation: You're giving the bike away, you cannot use it anymore
// after you call the function
fn give_to_charity(bike: Bike)     // Bike is still available inside the function
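As a rough sketch, the three signatures above could be filled in like this. The function bodies and return types are my own assumptions, added only so the example runs:

```rust
struct Bike {
    brand: String,
    tire: f32,
}

// Immutable borrow: reads the bike, the caller keeps ownership.
fn have_a_look(bike: &Bike) -> String {
    format!("{} with {} inch tires", bike.brand, bike.tire)
}

// Mutable borrow: may change the bike, the caller still keeps ownership.
fn install_modifications(bike: &mut Bike) {
    bike.tire = 29.0;
}

// Move: the function becomes the owner, and the bike is dropped when it returns.
fn give_to_charity(bike: Bike) -> String {
    format!("Donated a {}", bike.brand)
}

fn main() {
    let mut bike = Bike { brand: String::from("Bike Friday"), tire: 28.0 };

    println!("{}", have_a_look(&bike));
    install_modifications(&mut bike);
    println!("{}", have_a_look(&bike));
    println!("{}", give_to_charity(bike));

    // have_a_look(&bike); // would not compile: borrow of moved value: `bike`
}
```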

This also applies to struct declarations: if you declare a field without an & in front of its type, the struct is saying that it wants to own the data**, so the Bike struct is saying that it wants to own the brand String data. Well, the bike analogy kind of falls apart here, so let's get back to your question.

** There's a small exception here for types that implement the Copy trait. If a type implements Copy, Rust copies the value instead of moving it, so you don't need to bother with putting an & before the type. This is the case for f32 and lots of other basic types.
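A small sketch of the difference between a Copy type and a non-Copy type (copy_demo and move_demo are names I made up for this example):

```rust
// f32 implements Copy, so assignment duplicates the value and both names stay usable.
fn copy_demo() -> (f32, f32) {
    let tire: f32 = 28.0;
    let same_tire = tire;
    (tire, same_tire)
}

// String does not implement Copy, so assignment moves ownership to the new name.
fn move_demo() -> String {
    let brand = String::from("Bike Friday");
    let moved_brand = brand;
    // println!("{}", brand); // would not compile: borrow of moved value: `brand`
    moved_brand
}

fn main() {
    println!("{:?}", copy_demo());
    println!("{}", move_demo());
}
```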

The problem at hand

To understand why you can't do what you want to do (use &args[1] instead of args[1].clone()), we need to keep in mind the rules of ownership in Rust:

  1. Each value in Rust has a variable that’s called its owner.
  2. There can only be one owner at a time.
  3. When the owner goes out of scope, the value will be dropped.

With this in mind, let's look at the parse_config function as it would be if implemented the way you want:

// The Config struct wants to own both the query and filename strings
struct Config {
    query: String,
    filename: String,
}

// You're saying that you want to look at the args data and nothing else
fn parse_config(args: &[String]) -> Config {
    let query = &args[1];
    let filename = &args[2];

    // Then you're trying to give it away to Config behind the caller's back
    // You broke the contract
    Config { query, filename }
}

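Essentially, if this were allowed, the Strings in args would have two owners, the args and the config variables, breaking rule n° 2 of ownership. (In practice the compiler rejects it with a mismatched types error, because Config expects a String and is given a &String.)

Let's suppose Rust wasn't so strict and rule n° 2 didn't exist. In this hypothesis, this would be possible: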

// This is a world without rule n° 2, this code will not compile.
use std::env;
use std::fs;

fn main() {
    // args own the String inside the Vec
    let args: Vec<String> = env::args().collect();

    let contents = {
        // config will also own the Strings at index 1 and 2
        let config = parse_config(&args);

        println!("Searching for {}", config.query);
        println!("In file {}", config.filename);

        fs::read_to_string(config.filename)
            .expect("Something went wrong reading the file")

        // config goes out of scope, cleaning the memory for strings it owns
    };

    // The strings at index 1 and 2 don't exist anymore
    // This is known as a dangling pointer
    println!("{:?}", args);

    println!("With text:\n{}", contents);

    // End of scope: Will try to clean args memory, but some of it was already cleaned
    // This is known as a double-free
}

struct Config {
    query: String,
    filename: String,
}

fn parse_config(args: &[String]) -> Config {
    let query = &args[1];
    let filename = &args[2];
    
    // Give the strings to Config anyway
    Config { query, filename }
}

As you see, everything would break and Rust would not be able to guarantee memory safety anymore.

The Solutions

Cloning

The solution the book offers is to clone the strings, meaning that you allocate an entirely new block of memory that's different from the one args owns:

    let query = args[1].clone();
    let filename = args[2].clone();

Now both config and args are ok, since they own entirely different blocks of memory. When they go out of scope, each one will clean up what it owns.
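You can check the "entirely different blocks of memory" part yourself. This is a minimal sketch with a hand-written argument list (the values are made up) in place of env::args(), so it runs anywhere:

```rust
struct Config {
    query: String,
    filename: String,
}

fn parse_config(args: &[String]) -> Config {
    Config {
        query: args[1].clone(),
        filename: args[2].clone(),
    }
}

fn main() {
    // Stand-in for env::args().collect(); the values are made up.
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    let config = parse_config(&args);

    // Same contents...
    assert_eq!(config.query, args[1]);
    // ...but stored in a different heap buffer.
    assert_ne!(config.query.as_ptr(), args[1].as_ptr());

    // args is still fully usable, because nothing was moved out of it.
    println!("{:?}", args);
    println!("{} {}", config.query, config.filename);
}
```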

Moving into the function

This one was already mentioned. You can move the args variable in main to the function parameter. Since it now owns the data, you can do whatever you want with it, including giving ownership to another struct.

fn parse_config(mut args: Vec<String>) -> Config

This works because the main function doesn't use the args variable after parse_config is called. You still can't use indexing to take the Strings out, though: indexing goes through an immutable borrow, so you cannot move a value out of it.

From the Index trait documentation:

fn index(&self, index: Idx) -> &Self::Output
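One way around the indexing limitation, once the function owns the Vec, is to consume it with into_iter(), which hands out the Strings by value. This is only a sketch: returning Option<Config> is my own choice here to cover missing arguments, it is not what the book's version does.

```rust
struct Config {
    query: String,
    filename: String,
}

// Takes ownership of the Vec, so the Strings can be moved out without cloning.
fn parse_config(args: Vec<String>) -> Option<Config> {
    // Skip the program name at index 0; the iterator yields owned Strings.
    let mut iter = args.into_iter().skip(1);
    let query = iter.next()?;
    let filename = iter.next()?;
    Some(Config { query, filename })
}

fn main() {
    // Stand-in for env::args().collect(); the values are made up.
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    match parse_config(args) {
        Some(config) => println!("{} {}", config.query, config.filename),
        None => println!("not enough arguments"),
    }
    // args cannot be used here anymore: it was moved into parse_config.
}
```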

Using to_string()

You arrived at this one yourself and yes, it's functionally equivalent to the one provided by the book.

The implementation of to_string for String is as follows:

// to_string is implemented by means of to_owned
#[stable(feature = "string_to_string_specialization", since = "1.17.0")]
impl ToString for String {
    #[inline]
    fn to_string(&self) -> String {
        self.to_owned()
    }
}

// to_owned uses clone when the type implements the Clone trait
#[stable(feature = "rust1", since = "1.0.0")]
impl<T> ToOwned for T
where
    T: Clone,
{
    type Owned = T;
    fn to_owned(&self) -> T {
        self.clone()
    }

    fn clone_into(&self, target: &mut T) {
        target.clone_from(self);
    }
}

// And String implements Clone
#[stable(feature = "rust1", since = "1.0.0")]
impl Clone for String {
    fn clone(&self) -> Self {
        String { vec: self.vec.clone() }
    }

    fn clone_from(&mut self, source: &Self) {
        self.vec.clone_from(&source.vec);
    }
}
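If you'd rather verify the equivalence than read the standard library source, a quick sketch like this will do (three_copies is a name I made up for the example):

```rust
// Produces an owned String from a &String in the three ways discussed above.
fn three_copies(s: &String) -> [String; 3] {
    [s.clone(), s.to_string(), s.to_owned()]
}

fn main() {
    let original = String::from("needle");
    let copies = three_copies(&original);

    for copy in &copies {
        // Each copy has the same contents as the original...
        assert_eq!(copy, &original);
        // ...but lives in its own heap buffer.
        assert_ne!(copy.as_ptr(), original.as_ptr());
    }
    println!("{:?}", copies);
}
```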

Using lifetimes

You can also solve this by using lifetimes in the Config struct, like this:

struct Config<'a> {
    query: &'a String,
    filename: &'a String,
}

fn parse_config(args: &[String]) -> Config<'_> {
    let query = &args[1];
    let filename = &args[2];

    // Now config is not trying to own the Strings
    // So you can pass your references to it
    Config { query, filename }
}

This allows you to implement the code the way you want to but it ties the lifetime of the Config result to the lifetime of args, which in this case is the main function scope.

In other words, you can only access config.filename and config.query while the initial owner (args in main) is still in scope.
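Here is a self-contained sketch of that restriction, with a hand-written argument list (the values are made up) instead of env::args(). The commented-out part is the case the compiler rejects:

```rust
struct Config<'a> {
    query: &'a String,
    filename: &'a String,
}

// The returned Config borrows from args, so it cannot outlive args.
fn parse_config(args: &[String]) -> Config<'_> {
    Config {
        query: &args[1],
        filename: &args[2],
    }
}

fn main() {
    // Stand-in for env::args().collect(); the values are made up.
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    // Fine: config is used while args is still alive.
    let config = parse_config(&args);
    println!("{} {}", config.query, config.filename);

    // No copy was made: config.query points at the String stored in args.
    assert!(std::ptr::eq(config.query, &args[1]));

    // Not fine: here args would be dropped while config still refers to it.
    // let config = {
    //     let args = vec![String::from("a"), String::from("b"), String::from("c")];
    //     parse_config(&args)
    // }; // error[E0597]: `args` does not live long enough
}
```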

I've made a playground with a working example and another with a different scope for args so you can see how the lifetime affects the return.


I'm not a Rust pro and I made some simplifications in order to make the explanation straightforward. If I said anything wrong hopefully a more experienced rustacean will correct me.

Upvotes: 9

Boyd Johnson

Reputation: 342

std::ops::Index returns a reference to the element type of the container, in this case a slice, so indexing cannot move a String out of it.

You have several options for getting working code. The best of them is to rewrite parse_config to take a Vec.

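fn parse_config(mut args: Vec<String>) -> Config {
    // Remove the higher index first: remove() shifts all later elements
    // to the left, so removing index 1 first would move the filename to index 1.
    let filename = args.remove(2);
    let query = args.remove(1);

    Config { query, filename }
}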

Upvotes: 1
