Reputation: 621
Recent Intel chips (Ivy Bridge and later) have instructions for generating (pseudo)random bits. RDSEED outputs "true" random bits generated from entropy gathered by an on-chip sensor. RDRAND outputs bits generated by a pseudorandom number generator that is seeded by the true random number generator. According to Intel's documentation, RDSEED is slower, since gathering entropy is costly. RDRAND is therefore offered as a cheaper alternative, and its output is sufficiently secure for most cryptographic applications. (This is analogous to /dev/random versus /dev/urandom on Unix systems.)
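Before benchmarking, it is worth checking that the CPU actually exposes both instructions (RDSEED arrived on later chips than RDRAND). A minimal sketch using GCC's <cpuid.h>; the bit positions are the documented CPUID feature flags, everything else is illustrative:

/* Check the CPUID feature bits before using the intrinsics.
 *   RDRAND: CPUID.01H:ECX[bit 30]
 *   RDSEED: CPUID.(EAX=07H,ECX=0):EBX[bit 18]
 */
#include <stdio.h>
#include <cpuid.h>

int main() {
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        printf("RDRAND supported: %s\n", (ecx & (1u << 30)) ? "yes" : "no");

    // Leaf 7, sub-leaf 0 holds the RDSEED flag; check it exists first.
    if (__get_cpuid_max(0, NULL) >= 7) {
        __cpuid_count(7, 0, eax, ebx, ecx, edx);
        printf("RDSEED supported: %s\n", (ebx & (1u << 18)) ? "yes" : "no");
    }
    return 0;
}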
I was curious about the performance difference between the two instructions, so I wrote some code to compare them. To my surprise, I find there is virtually no difference in performance. Could anyone provide an explanation? Code and system details follow.
/* Compare the performance of RDSEED and RDRAND.
*
* Compute the CPU time used to fill a buffer with (pseudo) random bits
* using each instruction.
*
 * Compile with: gcc -mrdrnd -mrdseed
*/
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <x86intrin.h>
#define BUFSIZE (1<<24)
int main() {
    unsigned int ok, i;
    unsigned long long *rand = malloc(BUFSIZE * sizeof(unsigned long long)),
                       *seed = malloc(BUFSIZE * sizeof(unsigned long long));
    clock_t start, end, bm;

    // RDRAND (the benchmark)
    start = clock();
    for (i = 0; i < BUFSIZE; i++) {
        ok = _rdrand64_step(&rand[i]);
    }
    bm = clock() - start;
    printf("RDRAND: %li\n", bm);

    // RDSEED
    start = clock();
    for (i = 0; i < BUFSIZE; i++) {
        ok = _rdseed64_step(&seed[i]);
    }
    end = clock();
    printf("RDSEED: %li, %.2lf\n", end - start, (double)(end - start) / bm);

    free(rand);
    free(seed);
    return 0;
}
Upvotes: 4
Views: 1824
Reputation: 592
Interesting: in my case, on a 3.6 GHz 10-core Intel Core i9 (in an iMac), running the above program (corrected to retry the RDRAND/RDSEED call in case of failure), I observe:
$ ./rdseed-test
RDRAND: 1751837
RDSEED: 1752472, 1.00
I must admit I'm puzzled: running the same executable a few days later gives a 3x difference, like the one reported by others above:
$ ./rdseed-test
RDRAND: 1761312
RDSEED: 5309609, 3.01
I have no idea why RDSEED sometimes runs as fast as RDRAND and sometimes three times slower.
Upvotes: 1
Reputation: 64955
You aren't checking the return value, so you don't know how many random numbers you have actually generated. With retry, as Florian suggested, the RDSEED version is more than 3 times slower:
RDRAND: 1989817
RDSEED: 6636792, 3.34
Under the covers, the hardware entropy source probably generates entropy only at a limited rate, which causes RDSEED to fail when it is called faster than the entropy can regenerate. RDRAND, on the other hand, only generates a pseudo-random sequence based on periodic re-seeding, so it is unlikely to fail.
Here is the modified code excerpt:
// RDRAND (the benchmark)
start = clock();
for (i = 0; i < BUFSIZE; i++) {
    while (!_rdrand64_step(&rand[i]))
        ;
}
bm = clock() - start;
printf("RDRAND: %li\n", bm);

// RDSEED
start = clock();
for (i = 0; i < BUFSIZE; i++) {
    while (!_rdseed64_step(&seed[i]))
        ;
}
end = clock();
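If you want the benchmark to be robust against a source that never recovers, a bounded retry is safer than spinning forever. The following helpers are only a sketch: the function names and retry limits are mine, loosely following Intel's DRNG guidance of a small retry count for RDRAND and backing off between RDSEED attempts.

#include <immintrin.h>   /* _rdrand64_step, _rdseed64_step, _mm_pause */

/* Illustrative helpers: retry a bounded number of times instead of
 * spinning forever.  Return 1 on success, 0 on persistent failure. */
static int rdrand64_retry(unsigned long long *out) {
    for (int tries = 0; tries < 10; tries++) {
        if (_rdrand64_step(out))
            return 1;
    }
    return 0;  /* treat the DRNG as unavailable */
}

static int rdseed64_retry(unsigned long long *out) {
    for (int tries = 0; tries < 1000; tries++) {
        if (_rdseed64_step(out))
            return 1;
        _mm_pause();  /* back off briefly while entropy regenerates */
    }
    return 0;
}

In the loops above, the bare while (!...) retries would then become calls to these helpers, with an explicit error path when they return 0.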
Upvotes: 6
Reputation: 33719
For me, on a Core m7-6Y75, RDSEED in your test program occasionally fails (I added two assert(ok); statements, and the second one fails occasionally). Correct code would retry, resulting in a performance difference in favor of RDRAND. (Retrying is required for RDRAND as well, but failures do not seem to happen in practice, so RDRAND comes out faster.)
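For reference, this is roughly what that check looks like, as an excerpt of your program in the style of the other answer (the asserts are the only change; it needs #include <assert.h>):

#include <assert.h>

// RDRAND loop from the question
for (i = 0; i < BUFSIZE; i++) {
    ok = _rdrand64_step(&rand[i]);
    assert(ok);   // never fired for me
}

// RDSEED loop from the question
for (i = 0; i < BUFSIZE; i++) {
    ok = _rdseed64_step(&seed[i]);
    assert(ok);   // fails occasionally on this machine
}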
Upvotes: 4