Reputation: 143
I have the following dataframe:
import pandas as pd

data = {'Names':['Antonio','Bianca','Chad','Damien','Edward','Frances','George'],'Sport':['Basketball','Placeholder','Football','Placeholder','Tennis','Placeholder','Placeholder']}
df = pd.DataFrame(data, columns = ['Names','Sport'])
I want to replace the value 'Placeholder' randomly with any value from the following list:
extra_sports = ['Football','Basketball','Tennis','Rowing']
The final outcome should look something like this whereby the value 'Placeholder' is now gone and replaced randomly with values from the list:
data = {'Names':['Antonio','Bianca','Chad','Damien','Edward','Frances','George'],'Sport':['Basketball','Tennis','Football','Rowing','Tennis','Football','Tennis']}
df = pd.DataFrame(data, columns = ['Names','Sport'])
And if possible, how would I implement random.seed so that I can reproduce the results?
Upvotes: 2
Views: 1377
Reputation: 862581
I believe you need to replace only the Placeholder values with random draws from the list. To generate an array of the correct length, use the sum of the boolean mask, which counts its True values:
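import numpy as np

extra_sports = ['Football','Basketball','Tennis','Rowing']
np.random.seed(1)  # fix the seed so the draw is reproducible
# boolean mask marking the rows to replace
m = df['Sport'].eq('Placeholder')
# draw one random sport for each True in the mask
df.loc[m, 'Sport'] = np.random.choice(extra_sports, size=m.sum())
print(df)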
     Names       Sport
0  Antonio  Basketball
1   Bianca  Basketball
2     Chad    Football
3   Damien      Rowing
4   Edward      Tennis
5  Frances    Football
6   George    Football
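If you would rather not set NumPy's global seed, the same masking idea can be sketched with a local generator from np.random.default_rng. This is only an illustration, not part of the answer above: the helper name fill_placeholders and its parameters are made up for this example, and the values drawn for a given seed will generally differ from those np.random.seed produces.

```python
import numpy as np
import pandas as pd


def fill_placeholders(df, choices, seed=None, column="Sport", placeholder="Placeholder"):
    """Return a copy of df with placeholder values replaced by random choices.

    Hypothetical helper for illustration; the name and parameters are assumptions.
    """
    rng = np.random.default_rng(seed)  # local generator, global state is untouched
    out = df.copy()
    mask = out[column].eq(placeholder)  # True where a replacement is needed
    # draw exactly as many values as there are placeholders
    out.loc[mask, column] = rng.choice(choices, size=mask.sum())
    return out


data = {
    "Names": ["Antonio", "Bianca", "Chad", "Damien", "Edward", "Frances", "George"],
    "Sport": ["Basketball", "Placeholder", "Football", "Placeholder",
              "Tennis", "Placeholder", "Placeholder"],
}
df = pd.DataFrame(data, columns=["Names", "Sport"])
extra_sports = ["Football", "Basketball", "Tennis", "Rowing"]

first = fill_placeholders(df, extra_sports, seed=1)
second = fill_placeholders(df, extra_sports, seed=1)
print(first)
print(first.equals(second))  # same seed, same replacements
```

Because the helper works on a copy, the original df keeps its Placeholder rows, and calling it again with the same seed gives the same result.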
Upvotes: 3