se7entyse7en
se7entyse7en

Reputation: 4294

How bad is it to move the drop of a large struct to another thread?

I have a method that looks as follows:

fn func(
    &self,
) -> Something {
    let v: Vec<u8> = ...1mln entries...;

    ...do something with that vec...

    Something::default()
}

After some profiling, I realized that this function is spending around 300ms for dropping that Vec<u8>. I was able to measure it by doing:

use std::time::Instant;

...

fn func(
    &self,
) -> Something {
    let v: Vec<u8> = ...1mln entries...;

    ...do something with that vec...

    let drop_t0 = Instant::now();
    drop(v);
    println!("[drop_elapsed] {}ms", drop_t0.elapsed().as_millis());

    Something::default()
}

I wanted this func to be faster for this cleanup so I came up with the following:

use tokio::task;

...

fn func(
    &self,
) -> Something {
    let v: Vec<u8> = ...1mln entries...;

    ...do something with that vec...

    task::spawn(async move {
        drop(v);
    });

    Something::default()
}

And indeed I achieved what I wanted. I'm still a Rust newbie, so I wanted to know how bad is this. Are there other alternatives that are more idiomatic? One idea would be to make v as a field of the struct and recycle it every time, but it doesn't feel natural.

EDIT:

The Vec is not actually a Vec<u8>, but it's a Vec<CustomStruct> where:

pub struct CustomStruct {
    pub field: Vec<Option<Vec<u8>>>,
}

It's built from the response of a server. Then in the function, I do some manipulation and after that, I just don't need it anymore.

Upvotes: 1

Views: 359

Answers (1)

Finomnis
Finomnis

Reputation: 22591

There are two things that I think are important.

For one, I would consider a drop() function that takes 300ms as a blocking, computation-heavy task and therefore unsuited for asynchronous programming. Blocking tasks have the potential to block the executor and therefore globally block everything.

Therefore, instead of executing it in a tokio::spawn(), I would execute it in a tokio::spawn_blocking(), which is meant for blocking calls.

task::spawn_blocking(move || drop(v));

The second one is that you should try to avoid having such an allocation-heavy data structure in the first place. One million Vec allocations is a lot.

The question here is: can you combine multiple of those Vec allocations?

There are many alternatives:

  • if they are all similar in size, use a (usize, [u8; size]) to store a fixed-size array for each element. The usize is for storing how big the contained data actually is. This does not require an allocation and can directly be stored in the array. Further, you could use the size 0 for storing that it is empty, no need for an Option.
  • if the data trickles in as continuous packets, use a stream instead of collecting it all in one data structure.
  • if the data is just a bunch of brocken up parts of a larger vector, see if there is the possibility to use slices or offsets instead, and keep the actual data in the large vector.

It's hard to say for sure without knowing the exact context of your code, but maybe that created some inspiration for further optimizations.

Upvotes: 4

Related Questions