Reputation: 37
I have a dataframe that looks something like this:
0 Fish Trout
1 Fish Pickerel
2 Fish Pike
3 Bird Goose
4 Bird Duck
I'd like to assign a random number between 5 and 45 to the entries corresponding to fish, and a random number between 55 and 95 to entries corresponding to birds (the logic here is to generate a numeric value so that I can plot this against some other numeric criteria in bokeh or seaborn).
I've gotten this far:
Num_Fish = np.random.randint(5, 45)
Num_Bird = np.random.randint(55, 95)
d = {'Bird': Num_Bird, 'Fish': Num_Fish}
data['Random'] = data['Category'].map(d)
The problem with the above is that it assigns the same random number to all fish, and a different random number to all birds. What I want are unique random numbers (within the range specified) for each type of fish or bird.
So at the moment it produces something like this:
0 Fish Trout 22
1 Fish Pickerel 22
2 Fish Pike 22
3 Bird Goose 73
4 Bird Duck 73
How can I get unique random numbers (within the range specified) for separate entries in each category?
Beyond that, is there a way to avoid repeating random numbers in the case of large datasets?
Would be very grateful for any suggestions... thanks
Upvotes: 2
Views: 1078
Reputation: 51165
Use map with a function that looks up each category's range in a dict, so a new number is drawn for every row:
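import numpy as np
# (low, high) per category; the upper bound is exclusive in np.random.randint
dct = {'Bird': [55, 95], 'Fish': [5, 45]}
def map_animal(animal):
    # called once per row, so every row gets its own draw
    return np.random.randint(*dct[animal])
df['rand_num'] = df.Type.map(map_animal)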
Type Name rand_num
0 Fish Trout 25
1 Fish Pickerel 18
2 Fish Pike 44
3 Bird Goose 56
4 Bird Duck 74
Upvotes: -1
Reputation: 21
from io import StringIO
import numpy as np
import pandas as pd
df = pd.read_csv(StringIO('''ID,ClassLevel0,ClassLevel1
0,Fish,Trout
1,Fish,Pickerel
2,Fish,Pike
3,Bird,Goose
4,Bird,Duck
'''))
df.index = df.ID
random_param = {'Fish': (5, 45), 'Bird': (55, 95)}
# draw len(ldf) integers at once for each group and write them back by index
for level0, ldf in df.groupby('ClassLevel0'):
    df.loc[ldf.index, 'Value'] = np.random.randint(*random_param[level0], len(ldf))
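On the follow-up about avoiding repeated numbers: one possible approach is to sample without replacement inside each group. The sketch below is not from the original posts; it assumes NumPy's Generator API (np.random.default_rng) and rebuilds the example frame so it runs on its own. Note that 5 to 44 only contains 40 distinct integers, so a group with more rows than that cannot get unique values, and choice raises a ValueError in that case.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ClassLevel0': ['Fish', 'Fish', 'Fish', 'Bird', 'Bird'],
    'ClassLevel1': ['Trout', 'Pickerel', 'Pike', 'Goose', 'Duck'],
})

# (low, high) per category; high is exclusive, matching np.random.randint
random_param = {'Fish': (5, 45), 'Bird': (55, 95)}

rng = np.random.default_rng()
df['Value'] = 0  # integer placeholder column, filled in group by group

for level0, ldf in df.groupby('ClassLevel0'):
    low, high = random_param[level0]
    # replace=False samples without replacement, so no value repeats
    # within a group; raises ValueError if len(ldf) > high - low
    df.loc[ldf.index, 'Value'] = rng.choice(
        np.arange(low, high), size=len(ldf), replace=False
    )

print(df)
```

If the dataset is larger than the integer range allows, drawing floats with rng.uniform(low, high, size=len(ldf)) might be a reasonable alternative, since exact repeats then become very unlikely.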
Upvotes: 2