man zet
man zet

Reputation: 901

Insert in arbitrary nested HashMap

I want to have a data-structure that allows me to have arbitrary nested HashMap. For that I've constructed the following struct:

struct Database {
    children: HashMap<String, Database>,
    data: String,
}

For inserting in this structure i get a list of keys and a value to insert. So for example for the input

let subkeys = vec!["key1", "key1.1", "key1.1.3"];
let value = "myvalue";

I want the database to have this (pseudo) structure:

{
    "data" : "",
    "children": {
        "key1": {
            "data" : "",
            "children": {
                "key1.1": {
                    "data" : "",
                    "children" : {
                        "key1.1.3": {
                            "data": "myvalue",
                            "children" : {}
                        }
                    }  
                }
            }
        }
    }   
}

and then for example for a second insert request

let subkeys = vec!["key1", "key1.1", "key1.1.2"];
let value = "myvalue2";

the structure should look (pseudo) like this:

{
    "data" : "",
    "children": {
        "key1": {
            "data" : "",
            "children": {
                "key1.1": {
                    "data" : "",
                    "children" : {
                        "key1.1.3": {
                            "data": "myvalue",
                            "children" : {}
                        },
                        "key1.1.2": {
                            "data": "myvalue2",
                            "children" : {}
                        }
                    }  
                }
            }
        }
    }   
}

So here is a minimal reproducible example of what I've tried (not working) playground

use std::collections::HashMap;

struct Database {
    children: HashMap<String, Database>,
    data: String,
}

fn main()
{
    // make a databse object
    let mut db = Database { 
        children: HashMap::new(),
        data: "root".to_string(), 
    };

    // some example subkeys
    let subkeys = vec!["key1", "key1.1", "key1.1.3"];
    // and the value i want to insert
    let value = "myvalue";

    // a reference to the current HashMap
    // initialize with the root 
    let mut root = &db.children;

    // iterate throught subkeys
    for subkey in subkeys.iter() {

        // match the result of a get request to the hashmap
        match root.get::<String>(&subkey.to_string()) {
            Some(child) => {
                // if the key already exists set the root to the children of the child
                root = &child.children;
            }
            None => {
                // if key doesnt exist add it with a ne empty hashmap
                let d = Database{children: HashMap::new(), data: "".to_string()};

                // set root to this new databse obejct
                root = &d.children; 

                root.insert(subkey.to_string(), d);


            }
        }
    }
}

So as I understand it there are to problems with this code:

  1. &d.children get s dropped after the match and so root "kind of" has no value

  2. also the root.insert seems to be a problem because root is a & reference, so the data it refers to cannot be borrowed as mutable`

What do I need to do to make my code work and produce results like shown above. Do I maybe need to change something in my struct Database?

Upvotes: 3

Views: 3758

Answers (1)

SCappella
SCappella

Reputation: 10474

First, some comments on what you have so far and why it doesn't work. root needs to be a mutable reference. Note the distinction between a mutable variable (let mut root = &db.children;) and a mutable reference (let root = &mut db.children;). The former allows the variable itself to be changed. The latter allows the data behind the reference to be changed. In this instance, we need both (let mut root = &mut db.children) because we not only change root as we iterate through the nodes, but we also modify the data behind the reference whenever we need to insert a new node.

The same thing applies to d in the inner loop (it needs to be a mutable variable), though as we'll see, mutating d isn't really what we want.

// if key doesnt exist add it with a ne empty hashmap
let d = Database{children: HashMap::new(), data: "".to_string()};

// set root to this new databse obejct
root = &mut d.children; 

root.insert(subkey.to_string(), d);

Ignoring the errors for a moment, what should this code do? d is a new Database with no real data in it. Then, we set root to be the (empty) set of children of this new Database. Finally, we insert the new Database into root. But since we changed root in the second step, it's no longer the parent: we're inserting d as a child of itself!

We instead want to switch the order of the second two steps. But if we simply switch those two lines, we get the error

error[E0382]: borrow of moved value: `d`
  --> src/main.rs:41:24
   |
36 |                 let mut d = Database{children: HashMap::new(), data: "".to_string()};
   |                     ----- move occurs because `d` has type `Database`, which does not implement the `Copy` trait
37 |                 
38 |                 root.insert(subkey.to_string(), d);
   |                                                 - value moved here
...
41 |                 root = &mut d.children; 
   |                        ^^^^^^^^^^^^^^^ value borrowed here after move

So the problem is that d is no longer a local variable when we try to set root to its children. We need root to be the children of the just-inserted value. The usual idiom for this kind of thing is the entry API. It allows us to attempt to get a value from a HashMap and if it's not found, insert something. Most relevantly, this insertion returns a mutable reference to whatever value now resides at that key.

Now that section looks like

// if key doesnt exist add it with a new empty hashmap
let d = Database{children: HashMap::new(), data: "".to_string()};

// insert the new database object and
// set root to the hashmap of children
root = &mut root.entry(subkey.to_string()).or_insert(d).children;

At this point, we have an apparently working program. By adding a #[derive(Debug)] to Database, we can see what the database looks like with println!("{:#?}, db);. However, if we try to add in the second value, everything blows up. Rather than placing the two values side-by-side, they end up in completely separate branches of the database. This traces back to the commented out lines in the Some(child) branch of the match statement.

We'd like to set root to a mutable reference to child.children, but even just uncommenting that line without any changes causes the error that root is mutably borrow while borrowed elsewhere. The problem is that we're using the borrow in root.get(&subkey.to_string()) now. Before, since we ignored child and the other branch didn't use any data from that borrow, the borrow could end right away. Now it has to last for the whole duration of the match. This prevents us from borrowing mutably even in the None case.

Fortunately, since we're using the entry API, we don't need this match statement at all! The whole thing can just be replaced with

let d = Database {
    children: HashMap::new(),
    data: "".to_string(),
};

// insert the new database object and
// set root to the hashmap of children
root = &mut root.entry(subkey.to_string()).or_insert(d).children;

If the subkey already exists in the set of children, root.entry(...).or_insert(...) will point to that already existing child.

Now we just need to clean up the code. Since you're using it more that once, I'd recommend factoring the act of inserting a path of keys into a function. Rather than following the HashMap<String, Database> through the path, I'd recommend following the Database itself, since that will allow you to modify its data field at the end. To that end, I'd suggest a function with this signature:

impl Database {
    fn insert_path(&mut self, path: &[&str]) -> &mut Database {
        todo!()
    }
}

Next, since we only need to create a new Database (d) when one doesn't already exist, we can use Entry's or_insert_with method to create the new database only when necessary. This is easiest when there's a function to create the new database, so let's add #[derive(Default)] to the list of derives on Database. That makes our function

impl Database {
    fn insert_path(&mut self, path: &[&str]) -> &mut Self {
        let mut root = self;
        // iterate throught path
        for subkey in path.iter() {
            // insert the new database object if necessary and
            // set root to the hashmap of children
            root = root
                .children
                .entry(subkey.to_string())
                // insert (if necessary) using the Database::default method
                .or_insert_with(Database::default);
        }
        root
    }
}

At this point we should run cargo clippy to see if there are any suggestions. There's one about using to_string on &&str. To fix that, you have two choices. One, use one of the other methods for converting &strs to Strings instead of to_string. Two, dereference the &&str before using to_string. This second option is simpler. Since we're iterating over &[&str] (Vec<&str>::iter in your original), the items in the iteration are &&str. The idiomatic way to strip off the extra layer of references is to use a pattern to destructure the items.

for &subkey in path {
   ^^^ this is new
    ... // subkey has type &str instead of &&str here
}

My last piece of advice would be to change the name of root to something more generic, like node. It's only the root right at the start, so the name is misleading after that. Here's the final code together with your tests (playground):

use std::collections::HashMap;

#[derive(Default, Debug)]
struct Database {
    children: HashMap<String, Database>,
    data: String,
}

impl Database {
    fn insert_path(&mut self, path: &[&str]) -> &mut Self {
        // node is a mutable reference to the current database
        let mut node = self;
        // iterate through the path
        for &subkey in path.iter() {
            // insert the new database object if necessary and
            // set node to (a mutable reference to) the child node
            node = node
                .children
                .entry(subkey.to_string())
                .or_insert_with(Database::default);
        }
        node
    }
}

fn main() {
    // make a databse object
    let mut db = Database {
        children: HashMap::new(),
        data: "root".to_string(),
    };

    // some example subkeys
    let subkeys = vec!["key1", "key1.1", "key1.1.3"];
    // and the value i want to insert
    let value = "myvalue";

    let node = db.insert_path(&subkeys);
    node.data = value.to_string();

    println!("{:#?}", db);

    let subkeys = vec!["key1", "key1.1", "key1.1.2"];
    let value = "myvalue2";

    let node = db.insert_path(&subkeys);
    node.data = value.to_string();

    println!("{:#?}", db);
}

Upvotes: 6

Related Questions