user3464229

Reputation: 11

Possible to index unique permutations from a table?

Algorithm for an index number to display a specific and unique permutation?

Imagine a table with 4 columns.

Each column has 1000 elements.

Selecting one element at random from each column results in 1 trillion permutations. (1000^4)

Without storing each one of those trillion permutations, would it be possible to assign an index number, from 1 to 1 trillion, that represents a specific permutation? Ideally, supplying an index number would yield that unique permutation.

Here's the tricky part: When looking at two index numbers that are close together (example: 12345 and 12346), the two permutations should appear almost random. They should not look like they are closely related.

Example: If each element were a word,

The following would be acceptable, as each index number represents a distinctly different set of words:

123456 = apple, banana, cow, dog
123457 = elephant, fox, goat, hippo
123458 = iguana, jackal, kangaroo, lion
123459 = mouse, newt, octopus, pig
123460 = apple, fox, newt, lion

(note: it's certainly okay for some repeats-- just not too many too often)

The following would NOT be acceptable because nearby index numbers result in vastly similar results:

123456 = apple, banana, cow, dog
123457 = apple, banana, cow, elephant
123458 = apple, banana, cow, fox
123459 = apple, banana, cow, goat

The solution should be scalable: I should be able to change it to have 10,000 elements in a column instead of 1000, or 10 columns instead of 4.

Any ideas?

ADDITIONAL DETAIL: Due to space requirements, I would not want to store the actual index, but I want the index number to be able to be broken down to point to the exact permutation it references.

Upvotes: 1

Views: 214

Answers (3)

Nuclearman

Reputation: 5314

Seems like the simplest solution would be to convert the base 10 index number into base 1000, so that each base-1000 digit selects an element from one column. However, it only works as long as every column is restricted to 1000 elements.

Here is the code in Python.

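def convertToBase(number, base, min_digits):
    # Returns the digits of number in the given base, least significant first.
    n = number
    value = []
    while n > 0:
        value.append(n % base)
        n = n // base  # integer division; int(n/base) loses precision for large n
    # Pad with zeros so that every column gets a digit.
    while len(value) < min_digits:
        value.append(0)
    return value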

Here base is 1000 and min_digits is 4 (one digit per column). Each digit in the returned list is the element number to take from one column; together they are your permutation.
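As a rough sketch of how the same idea could cope with columns of different sizes, the conversion can use one base per column (a mixed-radix number). The function names index_to_rows and rows_to_index and the 0-based index are assumptions for illustration, not part of the answer above. Note that neighbouring indexes still differ only in the last column, so this alone does not meet the "looks random" requirement.

```python
def index_to_rows(index, column_sizes):
    """Split a 0-based index into one row number per column (hypothetical helper)."""
    total = 1
    for size in column_sizes:
        total *= size
    if not 0 <= index < total:
        raise ValueError("index out of range")
    rows = []
    # Peel off the last column first; it varies fastest.
    for size in reversed(column_sizes):
        index, row = divmod(index, size)
        rows.append(row)
    rows.reverse()
    return rows


def rows_to_index(rows, column_sizes):
    """Inverse of index_to_rows: combine row numbers back into one index."""
    index = 0
    for row, size in zip(rows, column_sizes):
        index = index * size + row
    return index


print(index_to_rows(123456789012, [1000] * 4))  # [123, 456, 789, 12]
print(index_to_rows(23, [2, 3, 4]))             # [1, 2, 3]
```

Because Python integers are arbitrary precision, the same code should work unchanged for 10 columns of 10,000 elements.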

Upvotes: 0

John C

Reputation: 1981

The most "convenient" solution that comes to mind is to use a straightforward "meta-index" that combines the indexes of each column, and then use some sort of encryption on the resulting index to produce your "official" index.

Given an arbitrary number, then, decrypt it and separate out the component indexes.

My initial instinct was to suggest a hash function instead of encryption, but hash functions aren't easily reversed (meaning you can't produce the tuple for a given index) and it's difficult to create such functions with a minimum of empty slots or overlaps.

You can control how much the index components are predictable by the degree to which bits are transposed in your encryption algorithm. If you just XOR each byte with a key, the column index won't be consecutive, but the relationships will be there. If you exchange bits with neighboring (or non-neighboring) bytes, though, you delocalize each index's representation. (I'm not necessarily recommending DES, but it's worth looking at to get a sense of what's easily done to obscure contents.)

One caveat: If you want every index to match to a valid tuple, you need to make sure that each value for every component index is taken, somehow. That's another discussion altogether.
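One way to sketch the encryption idea without storing anything is a small Feistel network combined with "cycle walking" (re-encrypting until the result falls inside the valid range), which is one possible way to handle the caveat about every index matching a valid tuple. This is only an illustration, not a vetted cipher: the names scramble, unscramble and _round_value, the SHA-256 round function, the key string and the round count are all assumptions made for the example, and indexes are taken to run from 0 to total - 1.

```python
import hashlib


def _round_value(value, key, round_no, half_bits):
    # Hypothetical round function: hash the inputs, keep half_bits bits.
    data = f"{key}:{round_no}:{value}".encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") & ((1 << half_bits) - 1)


def _feistel(value, key, half_bits, rounds, decrypt=False):
    mask = (1 << half_bits) - 1
    left, right = value >> half_bits, value & mask
    order = range(rounds - 1, -1, -1) if decrypt else range(rounds)
    for r in order:
        if decrypt:
            left, right = right ^ _round_value(left, key, r, half_bits), left
        else:
            left, right = right, left ^ _round_value(right, key, r, half_bits)
    return (left << half_bits) | right


def _half_bits(total):
    # Smallest even-width bit block that can hold every index below total.
    return max(1, ((total - 1).bit_length() + 1) // 2)


def scramble(index, total, key, rounds=4):
    """Map index to another value in range(total), one-to-one."""
    bits = _half_bits(total)
    value = _feistel(index, key, bits, rounds)
    while value >= total:  # cycle walking: retry until back in range
        value = _feistel(value, key, bits, rounds)
    return value


def unscramble(value, total, key, rounds=4):
    """Inverse of scramble."""
    bits = _half_bits(total)
    index = _feistel(value, key, bits, rounds, decrypt=True)
    while index >= total:
        index = _feistel(index, key, bits, rounds, decrypt=True)
    return index


total = 1000 ** 4
for i in (12345, 12346):
    s = scramble(i, total, "my-key")
    print(i, s, unscramble(s, total, "my-key"))
```

The scrambled value can then be split into per-column indexes with an ordinary base conversion. Changing the key gives a different ordering of the same tuples.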

Upvotes: 0

Vincent van der Weele

Reputation: 13187

I would propose to solve this in two steps.

  1. Create a simple index that does not meet your locality constraints. For instance, order the sets lexicographically. As an example, assume you have 4 columns and 1000 elements in each column; you would number them from 0 to 999 per column. The set [2, 100, 4, 927] has index 002 100 004 927. Note that two consecutive elements in this ordering differ only in the last column, which is not desired.

  2. Apply some hash function to your indices. For instance, let's assume you have a hash function f which has f(5) = 394 033 748 123 and f(6) = 921 038 839 104. You use the result of the hash as an index of step 1. Two consecutive indices in your input now have very different outputs (provided that your hash function works properly).
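The two steps above can be sketched as follows, assuming the "hash" in step 2 needs to be one-to-one so that no two input indices land on the same set. As a simple stand-in (not the function f from the example), this uses multiplication by a constant that shares no factor with the total count, which is reversible. The names make_mixer and decode, the multiplier value, and the 0-based indexing are all assumptions for illustration.

```python
from math import gcd


def make_mixer(total, multiplier):
    """Return (mix, unmix): a one-to-one scramble of range(total) and its inverse."""
    if gcd(multiplier, total) != 1:
        raise ValueError("multiplier must share no factor with total")
    inverse = pow(multiplier, -1, total)  # modular inverse, Python 3.8+

    def mix(index):
        return (index * multiplier) % total

    def unmix(value):
        return (value * inverse) % total

    return mix, unmix


def decode(value, columns, size):
    """Step 1 in reverse: split a lexicographic index into per-column numbers."""
    digits = []
    for _ in range(columns):
        value, digit = divmod(value, size)
        digits.append(digit)
    return digits[::-1]


columns, size = 4, 1000
total = size ** columns
# Assumed constant: odd and not a multiple of 5, so coprime to 10**12.
mix, unmix = make_mixer(total, 618033988751)

print(decode(mix(1), columns, size))  # [618, 33, 988, 751]
print(decode(mix(2), columns, size))  # [236, 67, 977, 502]
```

Consecutive inputs now differ in every column, although the step between them is constant, so the pattern is regular rather than truly random. A stronger mixing function could be swapped in as long as it stays one-to-one.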

Upvotes: 2
