Random number: same number on every run, different number for each row

Question

I want the following function to return a different number for each row in a data frame, but the same numbers every time the function runs.

thanks.

import random
import numpy as np

def inc14(p):
    if p == 1:
        return random.randint(1, 2000)
    elif p == 2:
        return random.randint(2001, 3000)
    elif p == 3:
        return random.randint(3001, 4000)
    elif p == 4:
        return random.randint(4001, 5000)
    elif p == 5:
        return random.randint(5001, 7000)
    elif p == 6:
        return random.randint(7001, 9000)
    elif p == 7:
        return random.randint(9001, 12000)
    elif p == 8:
        return random.randint(12001, 15000)
    elif p == 9:
        return random.randint(15001, 20000)
    elif p == 10:
        return random.randint(20001, 40000)
    elif p == 11:
        return 0.01
    else:
        return np.nan  # np.NaN was removed in NumPy 2.0

data['inc_cont14']=data['inc14'].apply(inc14)

Boendal · Accepted Answer

Defined ranges don't matter:

Here is a running example for the case where the defined ranges don't matter. If they do matter, see below:

import random
import pandas as pd

random.seed(42) # Seed is here to always produce the same numbers

data = {'Name':['Tom', 'nick', 'krish', 'jack'], 'Age':[20, 21, 19, 18]}
df = pd.DataFrame(data)  #create a dummy dataframe

# The dataframe has 4 rows, so we need 4 random numbers.
# To generate 4 random numbers without duplicates we can use random.sample.
# In this example we sample 4 random numbers from the range 0-399.
range_multiplier = 100
df['Random'] = random.sample(range(len(df.index)*range_multiplier), len(df.index))

print(df)

Output:

    Name  Age  Random
0    Tom   20     327
1   nick   21      57
2  krish   19      12
3   jack   18     379

If you run the same code with the same seed that I used, you will get the same random numbers as I did.
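To make the reproducibility claim concrete, here is a minimal sketch that draws the numbers twice and compares the results. The helper name `sample_per_row` and the use of a local `random.Random` instance (instead of the global `random.seed`) are my own assumptions for illustration; they are not part of the code above.

```python
import random

def sample_per_row(n_rows, seed=42, range_multiplier=100):
    """Hypothetical helper: n_rows distinct integers from range(n_rows * range_multiplier)."""
    # A local generator keeps the global random state untouched.
    rng = random.Random(seed)
    return rng.sample(range(n_rows * range_multiplier), n_rows)

first_run = sample_per_row(4)
second_run = sample_per_row(4)

# Same seed, same numbers on every run
print(first_run == second_run)
# random.sample draws without replacement, so no value repeats
print(len(set(first_run)) == len(first_run))
```

With a data frame, the returned list could be assigned directly, for example `df['Random'] = sample_per_row(len(df.index))`.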

Defined ranges matter:

If you need these ranges, here is the new function. It is a lot shorter, but you have to prepare all the numbers up front:

import random
import numpy as np

random.seed(42) # Seed is here to always produce the same numbers

# For every p (1-10) and its range (1-2000, 2001-3000, 3001-4000, ...)
# we generate a dictionary with p as the key
# and, as the value, a list of all numbers in the defined range,
# shuffled without duplicates by random.sample
p_numbers = {
    1: random.sample(range(1, 2001), 2000),
    2: random.sample(range(2001, 3001), 1000),
    # ... (entries for p = 3 to 9 are elided here and follow the same pattern)
    10: random.sample(range(20001, 40001), 20000)
}

def inc14(p, p_numbers):
    if 1 <= p <= 10:
        # take the first number from the list and remove it
        # (to avoid taking it again)
        return p_numbers[p].pop(0)
    elif p == 11:
        return 0.01
    else:
        return np.nan

# Extra arguments for the applied function are passed through args=(...)
data['inc_cont14'] = data['inc14'].apply(inc14, args=(p_numbers,))

We need the seed again so that the same numbers come out on every run. Duplicates are avoided by random.sample, which draws without replacement.

We create a dictionary that maps each p to its available numbers. If p is between 1 and 10, we take a number from the dictionary and remove it so that it cannot be drawn twice.
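As a rough sketch of the same idea with every range filled in, the pools can be built in a loop from the bounds listed in the question. The names `P_RANGES`, `build_pools` and `assign` are hypothetical, a plain list stands in for the `data['inc14']` column, and `math.nan` replaces `np.nan` only to keep the example free of third-party imports.

```python
import math
import random

# Inclusive (low, high) bounds per p, copied from the question's inc14 function
P_RANGES = {
    1: (1, 2000), 2: (2001, 3000), 3: (3001, 4000), 4: (4001, 5000),
    5: (5001, 7000), 6: (7001, 9000), 7: (9001, 12000), 8: (12001, 15000),
    9: (15001, 20000), 10: (20001, 40000),
}

def build_pools(seed=42):
    """Hypothetical helper: shuffle every range once with a seeded generator."""
    rng = random.Random(seed)
    pools = {}
    for p, (low, high) in P_RANGES.items():
        numbers = list(range(low, high + 1))
        rng.shuffle(numbers)
        pools[p] = numbers
    return pools

def inc14(p, pools):
    if p in pools:
        # Popping from the end is O(1); a value is never handed out twice
        return pools[p].pop()
    if p == 11:
        return 0.01
    return math.nan

def assign(codes, seed=42):
    """Hypothetical helper: one fresh set of pools per run, consumed row by row."""
    pools = build_pools(seed)
    return [inc14(p, pools) for p in codes]

codes = [1, 1, 2, 10, 11, 99]  # stand-in for the data['inc14'] column
run_a = assign(codes)
run_b = assign(codes)
print(run_a[:5] == run_b[:5])  # identical values on every run
print(run_a[0] != run_a[1])    # two rows with p == 1 get different numbers
```

With pandas, the equivalent call would be `data['inc14'].apply(inc14, args=(build_pools(),))`. Two caveats apply to this sketch: the numbers depend on row order, so reordering the data frame changes which row gets which value, and a pool raises `IndexError` once a category has more rows than numbers in its range.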
