Reputation: 621
Recent Intel chips (Ivy Bridge and later) have instructions for generating (pseudo)random bits. RDSEED outputs "true" random bits generated from entropy gathered by an on-chip sensor. RDRAND outputs bits generated by a pseudorandom number generator that is seeded by the true random number generator. According to Intel's documentation, RDSEED is slower, since gathering entropy is costly. RDRAND is therefore offered as a cheaper alternative, and its output is sufficiently secure for most cryptographic applications. (This is analogous to /dev/random versus /dev/urandom on Unix systems.)
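Before benchmarking, it is worth checking that the CPU actually exposes both instructions (RDSEED arrived on later chips than RDRAND). A minimal sketch using GCC's <cpuid.h>; the bit positions are the documented CPUID feature flags, everything else is illustrative:

/* Check the CPUID feature bits before using the intrinsics.
 *   RDRAND: CPUID.01H:ECX[bit 30]
 *   RDSEED: CPUID.(EAX=07H,ECX=0):EBX[bit 18]
 */
#include <stdio.h>
#include <cpuid.h>

int main() {
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        printf("RDRAND supported: %s\n", (ecx & (1u << 30)) ? "yes" : "no");

    // Leaf 7, sub-leaf 0 holds the RDSEED flag; check it exists first.
    if (__get_cpuid_max(0, NULL) >= 7) {
        __cpuid_count(7, 0, eax, ebx, ecx, edx);
        printf("RDSEED supported: %s\n", (ebx & (1u << 18)) ? "yes" : "no");
    }
    return 0;
}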
I was curious about the performance difference between the two instructions, so I wrote some code to compare them. To my surprise, I find there is virtually no difference in performance. Could anyone provide an explanation? Code and system details follow.
/* Compare the performance of RDSEED and RDRAND.
*
* Compute the CPU time used to fill a buffer with (pseudo) random bits
* using each instruction.
*
 * Compile with: gcc -mrdrnd -mrdseed
*/
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <x86intrin.h>
#define BUFSIZE (1<<24)
int main() {
    unsigned int ok, i;
    unsigned long long *rand = malloc(BUFSIZE * sizeof(unsigned long long)),
                       *seed = malloc(BUFSIZE * sizeof(unsigned long long));
    clock_t start, end, bm;

    // RDRAND (the benchmark)
    start = clock();
    for (i = 0; i < BUFSIZE; i++) {
        ok = _rdrand64_step(&rand[i]);
    }
    bm = clock() - start;
    printf("RDRAND: %li\n", bm);

    // RDSEED
    start = clock();
    for (i = 0; i < BUFSIZE; i++) {
        ok = _rdseed64_step(&seed[i]);
    }
    end = clock();
    printf("RDSEED: %li, %.2lf\n", end - start, (double)(end - start) / bm);

    free(rand);
    free(seed);
    return 0;
}
Upvotes: 4
Views: 1824
Reputation: 592
Interesting: in my case, on a 3.6 GHz 10-core Intel Core i9 (in an iMac), running the above program (corrected to retry the RDRAND/RDSEED call in case of failure), I observe:
$ ./rdseed-test
RDRAND: 1751837
RDSEED: 1752472, 1.00
I must admit I'm puzzled: running the same executable a few days later gives a 3x difference, like the one reported by others above:
$ ./rdseed-test
RDRAND: 1761312
RDSEED: 5309609, 3.01
I have no idea why RDSEED sometimes runs as fast as RDRAND and sometimes three times slower.
Upvotes: 1
Reputation: 64955
You aren't checking the return value, so you don't know how many random numbers you have actually generated. With retry, as Florian suggested, the RDSEED version is more than 3 times slower:
RDRAND: 1989817
RDSEED: 6636792, 3.34
Under the covers, the hardware entropy source probably generates entropy only at a limited rate, which causes RDSEED to fail when it is called faster than the entropy can regenerate. RDRAND, on the other hand, only generates a pseudo-random sequence based on periodic re-seeding, so it is unlikely to fail.
Here is the modified code excerpt:
// RDRAND (the benchmark)
start = clock();
for (i = 0; i < BUFSIZE; i++) {
    while (!_rdrand64_step(&rand[i]))
        ;
}
bm = clock() - start;
printf("RDRAND: %li\n", bm);

// RDSEED
start = clock();
for (i = 0; i < BUFSIZE; i++) {
    while (!_rdseed64_step(&seed[i]))
        ;
}
end = clock();
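If you want the benchmark to be robust against a source that never recovers, a bounded retry is safer than spinning forever. The following helpers are only a sketch: the function names and retry limits are mine, loosely following Intel's DRNG guidance of a small retry count for RDRAND and backing off between RDSEED attempts.

#include <immintrin.h>   /* _rdrand64_step, _rdseed64_step, _mm_pause */

/* Illustrative helpers: retry a bounded number of times instead of
 * spinning forever.  Return 1 on success, 0 on persistent failure. */
static int rdrand64_retry(unsigned long long *out) {
    for (int tries = 0; tries < 10; tries++) {
        if (_rdrand64_step(out))
            return 1;
    }
    return 0;  /* treat the DRNG as unavailable */
}

static int rdseed64_retry(unsigned long long *out) {
    for (int tries = 0; tries < 1000; tries++) {
        if (_rdseed64_step(out))
            return 1;
        _mm_pause();  /* back off briefly while entropy regenerates */
    }
    return 0;
}

In the loops above, the bare while (!...) retries would then become calls to these helpers, with an explicit error path when they return 0.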
Upvotes: 6
Reputation: 33719
For me, on a Core m7-6Y75, RDSEED in your test program occasionally fails (I added two assert(ok); statements, and the second one fails occasionally). Correct code would retry, resulting in a performance difference in favor of RDRAND. (Retrying is required for RDRAND as well, but failures do not seem to happen in practice, so RDRAND comes out faster.)
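For reference, this is roughly what that check looks like, as an excerpt of your program in the style of the other answer (the asserts are the only change; it needs #include <assert.h>):

#include <assert.h>

// RDRAND loop from the question
for (i = 0; i < BUFSIZE; i++) {
    ok = _rdrand64_step(&rand[i]);
    assert(ok);   // never fired for me
}

// RDSEED loop from the question
for (i = 0; i < BUFSIZE; i++) {
    ok = _rdseed64_step(&seed[i]);
    assert(ok);   // fails occasionally on this machine
}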
Upvotes: 4