Reputation: 2575
Imagine I was able to shuffle all numbers between 0 and 2^32 using something like the Knuth shuffle and a seeded random number generator seeded with a key.
Conceptually, I would need two arrays (using Z5 instead of Z232 for brevity):
[2, 0, 1, 4, 3] // perm
[1, 2, 0, 4, 3] // inv === p^-1
If I had these arrays, I could efficiently look up the nth element in the permutation as well as find out with element in the purmutation value v;
v = perm[n];
n == inv[v]; // true
I don't want to store two 16 GB arrays of uint representing this shuffled set because I am never interested in the entire shuffled sequence at any time. I am only ever interested in the value of the nth element.
I ideally want to write two pure functions that work like this:
uint nthShuffled = permutate<uint>(key, n); // O(log n)
uint n == invert<uint>(key, nthShuffled); // O(log n)
Requirements:
I understand that in theory there must be at least 232! unique keys in order to represent any possible permutation, but I believe I can hide that problem in practice behind a good hashing function.
Is there anything out there that comes close to this?
Upvotes: 4
Views: 1022
Reputation: 74382
From a cryptographic point of view, you want a block cipher with 32-bit blocks.
The problem of encryption (aka "keyed permutation") over arbitrary (and often small) domains is what Format-Preserving Encryption is about.
There is a generic "perfect" solution for that specific problem -- but the computation involves sampling through an hypergeometric distribution, which implies a lot of mucking with floating point and arbitrary precision numbers, which is expensive.
There are also "approximate" solutions in which the permutation is not, strictly speaking, uniformly chosen among all possible permutations, but the difference can be made arbitrarily small, to the point that it is not possible to distinguish between the implemented permutation and a really randomly chosen permutation. See in particular the Thorp shuffle.
There is no standard and secure 32-bit block cipher because 32 bits are not enough to ensure security in situations where block ciphers are commonly used (encryption of long streams of data, e.g. as part of SSL); 64-bit blocks are already frowned upon. So you are a bit on your own here.
Upvotes: 2
Reputation: 53047
Hashing isn't going so solve random number sequences.
Store 2^32 bits. That's .5 GB.
Run the Fischer-Yates shuffle and "cross off" bits as you go along. If you want to know the content of the 5th element then you'll cross out 4 and the 5th random value will be your number.
To get the nth permutation then you need to backtrack. Run the algorithm n times and get numbers like:
Find 5th index after 4 permutations:
First iteration:
1st : skip (run through the RNG)
2nd : skip
3rd : skip
4th : 7th index to 5th index
Second iteration: (run using same seed as 1st iteration)
1st : skip
2nd : skip
3rd : 3rd index to 7th index
4th : 7th index to 5th index
Third iteration:
1st : skip
2nd : 4th index to 7th index
3rd : 3rd index to 7th index
4th : 7th index to 5th index
Fourth iteration:
1st : 8th index to 4th index
2nd : 4th index to 7th index
3rd : 3rd index to 7th index
4th : 7th index to 5th index
By the last iteration, you know that the 8th index leads becomes the 5th index.
EDIT: I wrote a quick program to test the speed. It's taking a few minutes per permutation. It's slow, but still usable.
Upvotes: 1
Reputation: 14477
Any block cipher is actually a pseudo-random permutation. A 32-bit block cipher permutates the integers between 0
and 2 ^ 32 - 1
.
Given a secret key, encrypting N
with this key gives the N-th
pseudo-random number.
The only problem would be finding a good 32-bit block cipher. The only one I know is SKIP32, but I do not know anything about its strength.
SKIP32's key size is 80 bits. If it is a good cipher, that would be enough.
But again, I do not know the cipher.
If increasing the range to 2 ^ 64 - 1
integers is an option for you, you could simpply use a well-known 64-bit block cipher like Triple-DES or Blowfish.
Upvotes: 4
Reputation: 3322
" Knowldedge of the first 100 elements in the permutation provides no information on what might be the 101st element in the permutation. "
You need to store the whole array in memory. I suggest using stxxl, which is designed for large data types by storing the bulk of the container on disk. By the very nature of random permutation, you can't extrapolate the value of [n-1] or [n+1] given [n]. So it doesn't look like space can be optimized.
Upvotes: 3