Caballero

Reputation: 12111

Split a HashMap into equal chunks

What's the best way to split a HashMap into equal chunks? For instance, this is how I split a Vec<String>:

extern crate num_cpus;

fn main() {
    let cpu_count = num_cpus::get();

    let list: Vec<String> = vec![
        "one".into(), "two".into(), "three".into(), "four".into(), "five".into(),
        "six".into(), "seven".into(), "eight".into(), "nine".into(), "ten".into(),
    ];

    // Both operands are already usize, so no cast is needed.
    let chunk_len = list.len() / cpu_count + 1;
    let mut chunks = Vec::new();
    for chunk in list.chunks(chunk_len) {
        chunks.push(chunk.to_owned());
    }

    for chunk in chunks {
        println!("{:?}", chunk);
    }
}

This produces the following output:

["one", "two"]
["three", "four"]
["five", "six"]
["seven", "eight"]
["nine", "ten"]

How would I do the same with HashMap<String, String>?

Upvotes: 0

Views: 1839

Answers (2)

oli_obk

Reputation: 31263

As long as you don't care about the order of the elements you take out of the HashMap, you can convert your HashMap<String, String> into a Vec<(String, String)> by calling your_map.into_iter().collect::<Vec<_>>().

Then you can use the same algorithm you used to chunk your Vec<String>.
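A minimal sketch of those two steps, in case it helps. The helper name split_map and the fixed worker_count are my own placeholders (worker_count stands in for num_cpus::get()), and the order of the pairs within and across chunks is arbitrary:

```rust
use std::collections::HashMap;

/// Hypothetical helper: splits a map into Vecs of at most `chunk_len` pairs.
fn split_map(map: HashMap<String, String>, chunk_len: usize) -> Vec<Vec<(String, String)>> {
    // Move every (key, value) pair out of the map into a Vec.
    let pairs: Vec<(String, String)> = map.into_iter().collect();

    // `chunks` panics on a chunk size of 0, so clamp it to at least 1.
    pairs
        .chunks(chunk_len.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}

fn main() {
    let map: HashMap<String, String> = [
        ("one", "1"), ("two", "2"), ("three", "3"), ("four", "4"), ("five", "5"),
    ]
    .iter()
    .map(|&(k, v)| (k.to_string(), v.to_string()))
    .collect();

    // Assumed stand-in for num_cpus::get().
    let worker_count = 4;
    let chunk_len = map.len() / worker_count + 1;

    for chunk in split_map(map, chunk_len) {
        println!("{:?}", chunk);
    }
}
```

If you need each chunk to be a HashMap again, collecting each chunk into a HashMap<String, String> instead of calling to_vec should work the same way.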


To be able to compete with @DK.'s elaborate answer I decided to create a generic version of your chunking algorithm:

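use std::iter::FromIterator;

// Works for any collection that can be consumed into an iterator of known
// length and rebuilt from an iterator (Vec, HashMap, ...).
fn chunk<T, U>(data: U) -> Vec<U>
    where U: IntoIterator<Item=T>,
    U: FromIterator<T>,
    <U as IntoIterator>::IntoIter: ExactSizeIterator
{
    let cpu_count = 6 /*num_cpus::get()*/;

    let mut iter = data.into_iter();
    let iter = iter.by_ref();

    // ExactSizeIterator::len already returns usize, so no cast is needed.
    let chunk_len = iter.len() / cpu_count + 1;

    let mut chunks = Vec::new();
    for _ in 0..cpu_count {
        // `take` yields fewer (possibly zero) items once the iterator runs dry.
        chunks.push(iter.take(chunk_len).collect())
    }
    chunks
}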

Try it out in the PlayPen
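One thing to be aware of: the loop above always pushes cpu_count collections, so with 10 elements and 6 workers (chunk_len 2) the last one comes out empty. If that matters, here is a rough variant that stops as soon as the iterator is exhausted. The name chunk_non_empty and the parts parameter are assumptions of mine, not part of the original snippet, and it may return fewer than parts chunks:

```rust
use std::collections::HashMap;
use std::iter::FromIterator;

/// Hypothetical variant of `chunk`: `parts` replaces the hard-coded cpu_count.
fn chunk_non_empty<T, U>(data: U, parts: usize) -> Vec<U>
where
    U: IntoIterator<Item = T> + FromIterator<T>,
    <U as IntoIterator>::IntoIter: ExactSizeIterator,
{
    let mut iter = data.into_iter();
    let parts = parts.max(1); // avoid dividing by zero

    // Ceiling division: the smallest chunk length that fits everything
    // into at most `parts` chunks.
    let chunk_len = (iter.len() + parts - 1) / parts;

    let mut chunks: Vec<U> = Vec::new();
    // Stop once nothing is left, so no empty chunk is ever pushed.
    while iter.len() > 0 {
        chunks.push(iter.by_ref().take(chunk_len).collect());
    }
    chunks
}

fn main() {
    let map: HashMap<String, i32> = (1..=10).map(|n| (format!("k{}", n), n)).collect();

    for part in chunk_non_empty(map, 6) {
        println!("{:?}", part);
    }
}
```

With ceiling division the chunk sizes differ by at most the size of the final, shorter chunk, which is about as "equal" as you can get when the length is not divisible by the number of parts.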

Upvotes: 3

DK.

Reputation: 59125

I'm not sure it makes any sense to "chunk" a HashMap directly. In any case, the solution is obvious: don't. You can chunk a Vec (actually any array slice), so just use that! After all, a HashMap is logically just an unordered sequence of (Key, Value) pairs.

fn chunk_vec() {
    let cpu_count = 6 /*num_cpus::get()*/;

    let list: Vec<String> = [
        "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine", "ten"
    ].iter().map(|&s| s.into()).collect();

    let chunk_len = (list.len() / cpu_count) as usize + 1;
    let chunks: Vec<Vec<_>> = list.chunks(chunk_len)
        .map(|c| c.iter().collect())
        .collect();
    for chunk in chunks {
        println!("{:?}", chunk);
    }
}

fn chunk_hash() {
    use std::collections::HashMap;

    let cpu_count = 6 /*num_cpus::get()*/;

    let hash: HashMap<String, i32> = [
        ("one", 1), ("two", 2), ("three", 3), ("four", 4), ("five", 5),
        ("six", 6), ("seven", 7), ("eight", 8), ("nine", 9), ("ten", 10)
    ].iter().map(|&(k, v)| (k.into(), v)).collect();

    let list: Vec<_> = hash.into_iter().collect();

    let chunk_len = (list.len() / cpu_count) as usize + 1;
    let chunks: Vec<HashMap<_, _>> = list.chunks(chunk_len)
        .map(|c| c.iter().cloned().collect())
        .collect();
    for chunk in chunks {
        println!("{:?}", chunk);
    }
}

I took the liberty of fiddling with your example code a little to highlight the similarities (and differences) between the two functions.

Upvotes: 4
