Reputation: 11923
I am having trouble getting a performance improvement by parallelizing a DES encryption algorithm.
Here is my attempt:
fn des(message: &[u8], subkeys: Vec<u64>) -> Vec<u8> {
    let mut pool = Pool::new(THREAD_COUNT);
    let message = message_to_u64s(message);
    crossbeam::scope(|scope| {
        pool.map(scope, message.iter().enumerate(), |(i, &block)| {
            let permuted = ip(block);
            let mut li = permuted & 0xFFFFFFFF00000000;
            let mut ri = permuted << 32;
            for subkey in &subkeys {
                let last_li = li;
                li = ri;
                ri = last_li ^ feistel(ri, *subkey);
            }
            let r16l16 = ri | (li >> 32);
            to_u8_vec(fp(r16l16))
        }).collect::<Vec<_>>()
    }).concat()
}
(This uses the crates crossbeam and simple_parallel, but I will accept solutions that do not use them.)
Unfortunately, this implementation is slower than the version without threads:
fn des(message: &[u8], subkeys: Vec<u64>) -> Vec<u8> {
    let message = message_to_u64s(message);
    let mut cipher = vec![];
    for block in message {
        let permuted = ip(block);
        let mut li = permuted & 0xFFFFFFFF00000000;
        let mut ri = permuted << 32;
        for subkey in &subkeys {
            let last_li = li;
            li = ri;
            ri = last_li ^ feistel(ri, *subkey);
        }
        let r16l16 = ri | (li >> 32);
        let mut bytes = to_u8_vec(fp(r16l16));
        cipher.append(&mut bytes);
    }
    cipher
}
I believe the collect and concat are the issues, but I don't know how to avoid them without using unsafe code.
So how can I improve the performance of this algorithm (by using threads) using safe code? (solutions with unsafe code would also be interesting, but I believe there must be a solution without unsafe code)
Upvotes: 0
Views: 80
Reputation: 34185
Use a profiler. You could try guessing where the slowdown is, but you may not find the right place anyway.
But for an educated guess... I'd try splitting the message into THREAD_COUNT
parts and feeding those parts to the thread pool instead. If you're sending 8-byte fragments separately, you'll spend more time on managing that than on the DES itself.
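To illustrate the idea, here is a minimal sketch of per-chunk parallelism using `std::thread::scope` from the standard library (stable since Rust 1.63) instead of crossbeam and simple_parallel. The `encrypt_block` function is a hypothetical stand-in for the real per-block work (`ip`, the `feistel` rounds, `fp`); the point is only the chunking structure, where each thread processes a contiguous slice of blocks rather than a single 8-byte block:

```rust
use std::thread;

// Hypothetical stand-in for the real DES block transform
// (ip + 16 feistel rounds + fp). Any pure u64 -> u64 function works
// for demonstrating the chunking structure.
fn encrypt_block(block: u64) -> u64 {
    block.rotate_left(17) ^ 0xA5A5_A5A5_A5A5_A5A5
}

// Split the message into roughly THREAD_COUNT contiguous chunks and
// hand one chunk to each spawned thread, so per-task overhead is paid
// once per chunk instead of once per 8-byte block.
fn des_parallel(message: &[u64], thread_count: usize) -> Vec<u64> {
    // Ceiling division so all blocks are covered; keep chunk_len >= 1.
    let chunk_len = (message.len() + thread_count.max(1) - 1) / thread_count.max(1);
    thread::scope(|scope| {
        // Spawn one thread per chunk; each returns its encrypted blocks.
        let handles: Vec<_> = message
            .chunks(chunk_len.max(1))
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|&b| encrypt_block(b)).collect::<Vec<u64>>()
                })
            })
            .collect();
        // Joining in spawn order preserves block order in the output.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

Because the chunks are contiguous and the results are joined in spawn order, the output matches the sequential version block for block; only the scheduling granularity changes.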
Upvotes: 4