Quincy

Reputation: 143

How to randomly replace values in a pandas column using values from a list?

I have the following dataframe:

import pandas as pd

data = {'Names':['Antonio','Bianca','Chad','Damien','Edward','Frances','George'],'Sport':['Basketball','Placeholder','Football','Placeholder','Tennis','Placeholder','Placeholder']}

df = pd.DataFrame(data, columns = ['Names','Sport'])

I want to replace the value 'Placeholder' randomly with any value from the following list:

extra_sports = ['Football','Basketball','Tennis','Rowing']

The final result should look something like this, with every 'Placeholder' replaced by a randomly chosen value from the list:

data = {'Names':['Antonio','Bianca','Chad','Damien','Edward','Frances','George'],'Sport':['Basketball','Tennis','Football','Rowing','Tennis','Football','Tennis']}

df = pd.DataFrame(data, columns = ['Names','Sport'])

Also, if possible, how would I use random.seed so that I can reproduce the results?

Upvotes: 2

Views: 1377

Answers (1)

jezrael

Reputation: 862581

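I believe you need to replace only the 'Placeholder' values with random picks from the list. For reproducibility, seed NumPy with np.random.seed rather than random.seed, because np.random.choice draws from NumPy's own generator. To get the correct length for the generated array, use the sum of the boolean mask, which counts its True values: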

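import numpy as np

extra_sports = ['Football','Basketball','Tennis','Rowing']

np.random.seed(1)  # fix the seed so the result is reproducible
# boolean mask of the rows to replace
m = df['Sport'].eq('Placeholder')
# draw one random sport per masked row; m.sum() is the number of True values
df.loc[m, 'Sport'] = np.random.choice(extra_sports, size=m.sum())
print(df)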
     Names       Sport
0  Antonio  Basketball
1   Bianca  Basketball
2     Chad    Football
3   Damien      Rowing
4   Edward      Tennis
5  Frances    Football
6   George    Football
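
As a possible variation on the above, here is a minimal sketch that uses NumPy's newer Generator API (np.random.default_rng) instead of the global np.random.seed. The helper name fill_placeholders and its parameters are hypothetical, introduced only for illustration. The values drawn for a given seed will generally differ from those of np.random.seed(1), so no expected output is shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Names': ['Antonio', 'Bianca', 'Chad', 'Damien', 'Edward', 'Frances', 'George'],
    'Sport': ['Basketball', 'Placeholder', 'Football', 'Placeholder',
              'Tennis', 'Placeholder', 'Placeholder'],
})
extra_sports = ['Football', 'Basketball', 'Tennis', 'Rowing']


def fill_placeholders(frame, choices, seed, placeholder='Placeholder'):
    """Hypothetical helper: return a copy of `frame` in which every
    `placeholder` entry of the 'Sport' column is replaced by a random
    pick from `choices`."""
    # A local generator keeps the seeding self-contained instead of
    # changing NumPy's global random state.
    rng = np.random.default_rng(seed)
    out = frame.copy()
    # Boolean mask of the rows to replace; its sum is the number of draws needed.
    mask = out['Sport'].eq(placeholder)
    out.loc[mask, 'Sport'] = rng.choice(choices, size=mask.sum())
    return out


# The same seed gives the same replacements on every call.
first = fill_placeholders(df, extra_sports, seed=1)
second = fill_placeholders(df, extra_sports, seed=1)
print(first)
```

Working on a copy leaves the original df untouched, which makes it easy to compare runs with different seeds.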

Upvotes: 3
