user972014
user972014

Reputation: 3856

Python - Get random color, given a seed number as fast as possible

I need to find a random color, given a specific seed number - fast. given the same ID twice, should return the same color.

I did this:

def id_to_random_color(number):
    random_bytes = hashlib.sha1(bytes(number)).digest()
    return [int(random_bytes[-1]) / 255, int(random_bytes[-2]) / 255, int(random_bytes[-3]) / 255, 1.0]

The problem is that calculating the sha1 of numbers many times is very slow in total. (I'm using this function about 100k times)

Edit: The reason I use a hash function is that I want to have colors that are different for numbers that are close

e.g id_to_random_color(7) should be very different from id_to_random_color(9)

Upvotes: 6

Views: 2157

Answers (3)

PM 2Ring
PM 2Ring

Reputation: 55469

You didn't mention the range of number. It must be a non-negative integer, otherwise bytes(number) would fail. (BTW, that function returns a bytes string consisting of number zero bytes, which will chew up a lot of RAM if number is large). I presume number is at least 24 bits to cover the 24 bit RGB color space.

Using a cryptographic hash function for this purpose is overkill. OTOH, the hashlib functions are quite fast, since they're coded in C. We could utilize the built_in hash function, however hash(n) simply returns n for machine-size integers, so we'd need to do something like hash((n, n)) to get random-looking output. However, the results of doing that sort of thing aren't particularly random: hash is designed for hashtable work, not for the kind of scrambling we want here.

To generate the random RGB values I've adapted the mixing algorithm from xxHash by Yann Collet. You can view the C source code of that algorithm in the xxhash.c source code. This algorithm is reasonably fast, and it has good avalanching. Bret Mulvey has written a good introductory article about hash mixing functions and the avalanche effect.

def id_to_random_color(n):
    n = ((n ^ n >> 15) * 2246822519) & 0xffffffff
    n = ((n ^ n >> 13) * 3266489917) & 0xffffffff
    n = (n ^ n >> 16) >> 8
    return [u / 255. for u in n.to_bytes(3, 'big')] + [1.0]

This function works well for n in range(2**24), and in fact its results are quite good over the whole of range(2**32); it will still give usable results outside that range. To test it here, I'll use a simplified version that returns the RGB values as integers. The first test simply shows the RGB values for n in range(20). The second test generates 25600 random numbers and finds the corresponding RGB values. We should get approximately 100 hits for each R, G, & B value.

from collections import Counter
from random import seed, randrange

seed(42)

def id_to_RGB(n):
    n = ((n ^ n >> 15) * 2246822519) & 0xffffffff
    n = ((n ^ n >> 13) * 3266489917) & 0xffffffff
    n = (n ^ n >> 16) >> 8
    return tuple(n.to_bytes(3, 'big'))

# Tests

# Show some colors
for i in range(20):
    rgb = id_to_RGB(i)
    print('{:2d}: {:02x} {:02x} {:02x}'.format(i, *rgb))
print()

# Count the frequency of each color for random `n`
counts = {k: Counter() for k in 'rgb'}
for i in range(25600):
    n = randrange(2 ** 32)
    for k, v in zip('rgb', id_to_RGB(n)):
        counts[k][v] += 1

for k in 'rgb':
    print(k, sorted(counts[k].values()))

output

 0: 00 00 00
 1: 60 6d 18
 2: 4e f2 bf
 3: 75 4f 48
 4: 60 98 f1
 5: 17 1d 98
 6: 3b 69 13
 7: aa 10 98
 8: c1 31 e3
 9: 1e fa 4a
10: 7f 05 b2
11: 86 0e b3
12: 39 84 c6
13: c1 75 4f
14: e2 38 87
15: db 54 79
16: 45 14 b6
17: cb 56 68
18: 8e bf d8
19: cd 50 3f

Counter output

r [74, 75, 75, 77, 78, 80, 80, 80, 80, 81, 82, 83, 84, 85, 85, 85, 86, 86, 86, 88, 88, 88, 88, 89, 89, 89, 89, 89, 89, 89, 89, 90, 90, 90, 90, 90, 90, 90, 91, 91, 91, 91, 91, 91, 91, 91, 91, 92, 92, 92, 92, 92, 92, 92, 92, 93, 93, 93, 93, 93, 93, 93, 93, 93, 94, 94, 94, 94, 94, 94, 94, 94, 94, 95, 95, 95, 95, 95, 95, 95, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 97, 97, 97, 97, 97, 97, 97, 97, 98, 98, 98, 98, 98, 98, 98, 98, 98, 99, 99, 99, 99, 99, 99, 99, 100, 100, 100, 100, 100, 100, 100, 100, 100, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 101, 102, 102, 102, 102, 102, 102, 102, 102, 102, 102, 102, 102, 102, 102, 103, 103, 103, 103, 103, 103, 103, 103, 103, 103, 103, 104, 104, 104, 104, 104, 104, 104, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 106, 106, 106, 106, 106, 106, 106, 106, 106, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 108, 108, 108, 108, 108, 108, 108, 108, 109, 109, 109, 109, 109, 109, 109, 109, 109, 109, 110, 110, 110, 110, 110, 110, 110, 110, 111, 112, 112, 112, 112, 112, 113, 113, 113, 114, 114, 115, 115, 115, 115, 116, 116, 116, 116, 118, 119, 120, 123, 124, 126, 128, 138]
g [73, 74, 74, 77, 78, 79, 79, 81, 81, 82, 82, 83, 83, 83, 84, 84, 84, 84, 85, 85, 85, 86, 87, 87, 87, 87, 87, 87, 87, 88, 88, 88, 88, 88, 89, 89, 89, 89, 89, 89, 90, 90, 90, 90, 90, 90, 90, 90, 90, 91, 91, 91, 91, 92, 92, 92, 92, 92, 92, 92, 92, 92, 93, 93, 93, 93, 93, 93, 93, 93, 94, 94, 94, 94, 94, 94, 95, 95, 95, 95, 95, 95, 95, 95, 95, 95, 95, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 98, 98, 98, 98, 98, 98, 98, 98, 99, 99, 99, 99, 99, 99, 99, 99, 99, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 101, 101, 101, 101, 101, 101, 102, 102, 102, 102, 102, 102, 102, 102, 102, 103, 103, 103, 103, 103, 103, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 106, 106, 106, 106, 106, 106, 106, 107, 107, 107, 107, 107, 107, 107, 107, 107, 108, 108, 108, 109, 109, 109, 110, 110, 110, 110, 110, 111, 111, 111, 111, 111, 111, 112, 112, 112, 112, 112, 112, 113, 113, 113, 113, 113, 113, 113, 113, 114, 114, 114, 114, 115, 115, 116, 117, 117, 117, 117, 118, 118, 118, 119, 119, 119, 120, 120, 121, 121, 121, 123, 125, 126, 128]
b [73, 74, 77, 78, 78, 79, 80, 80, 80, 81, 82, 84, 84, 84, 84, 84, 84, 84, 84, 85, 85, 85, 85, 85, 86, 86, 86, 86, 86, 87, 87, 87, 87, 88, 88, 89, 89, 89, 89, 89, 89, 89, 89, 90, 90, 90, 90, 90, 90, 91, 91, 91, 91, 91, 91, 92, 93, 93, 93, 93, 93, 93, 93, 94, 94, 94, 94, 94, 94, 94, 94, 94, 94, 95, 95, 95, 95, 95, 95, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 96, 97, 97, 97, 97, 97, 97, 98, 98, 98, 98, 98, 98, 98, 98, 98, 98, 98, 98, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 101, 101, 101, 101, 101, 101, 101, 101, 101, 102, 102, 102, 102, 102, 102, 102, 103, 103, 103, 103, 103, 103, 103, 103, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 105, 105, 105, 105, 105, 105, 105, 106, 106, 106, 106, 106, 106, 106, 107, 107, 107, 107, 107, 107, 107, 107, 107, 108, 108, 108, 108, 108, 108, 108, 108, 108, 108, 108, 108, 109, 109, 109, 109, 109, 109, 109, 110, 110, 110, 111, 111, 111, 111, 112, 112, 112, 113, 113, 113, 114, 114, 114, 114, 114, 115, 115, 115, 115, 115, 115, 115, 115, 115, 116, 116, 116, 117, 118, 119, 120, 120, 122, 124, 126, 127, 128, 131]

You may notice that id_to_RGB returns all zeroes for a zero input. If that's undesirable, you can add an extra mixing step at the start (also borrowed from xxHash).

def id_to_RGB(n):
    n = (374761397 + n * 3266489917) & 0xffffffff    
    n = ((n ^ n >> 15) * 2246822519) & 0xffffffff
    n = ((n ^ n >> 13) * 3266489917) & 0xffffffff
    n = (n ^ n >> 16) >> 8
    return tuple(n.to_bytes(3, 'big'))

Upvotes: 3

rudicangiotti
rudicangiotti

Reputation: 566

Using simple random number generators with kind of static variables could improve performance:

import random
prev, r, g, b = None, 0, 0, 0
def id_to_random_color(number):
    global prev, r, g, b
    if number != prev:
        r = random.random()
        g = random.random()
        b = random.random()
        prev = number
    return r, g, b, 1.0

Update:
As AndrewMcDowell stated in his comment, the function could return different values if the input is repeated in not sequential occasions.
Here is a possible workaround:

import random
memory = {}
def id_to_random_color(number, memory):
    if not number in memory:
        r = random.random()
        g = random.random()
        b = random.random()
        memory[number] = (r, g, b, 1.0)
    return memory[number]

Further update:
The same function skeleton could be used even to calculate hash:

memory = {}
def id_to_random_color(number):
    if not number in memory:
        numByte = str.encode(number)
        hashObj = hashlib.sha1(numByte).digest()
        r, g, b = hashObj[-1] / 255.0, hashObj[-2] / 255.0, hashObj[-3] / 255.0
        memory[number]= (r, g, b, 1.0)
        return r, g, b, 1.0
    else:
        return memory[number]

Despite it is a bit more verbose syntax, the else statement improves a bit performances, avoiding subsequent memory write and read (as Jake stated in his answer).

Upvotes: 3

Jake
Jake

Reputation: 637

I would use a dictionary to quickly index already generated seeds.

import random

random_seeds = {}

def id_to_random_color(number):
    if number in random_seeds.keys():
        return random_seeds[number]
    else:
        color = [random.random(), random.random(), random.random(), 1.0]
        random_seeds[number] = color
        return color

Upvotes: 0

Related Questions