Reputation: 193
Hi everyone! I'm trying to do something for my wife but am having some issues. I want to generate placeholder data and then replace the values column by column.
Here's what I did:
import numpy as np
import pandas as pd
datalist = ['Sex', 'Race', 'Age', 'FT']
df = pd.DataFrame(np.random.randint(0,1,size=(3101, 4)), columns=datalist) #I want four columns and 3100 rows.
df = df.replace(to_replace ="0", value ="Female", limit=1752, inplace=True) #I'm trying to turn 1752 of the rows under Sex to be Female, and the rest Male.
Before I could get to the male side, I tested the df and found this:
    Sex  Race  Age  FT
0  None     0    0   0
1  None     0    0   0
2  None     0    0   0
3  None     0    0   0
4  None     0    0   0
Why is Sex coming back as None? I've also tried turning off inplace, but then everything just stays 0. What am I doing wrong?
Thanks!
Upvotes: 0
Views: 90
Reputation: 120399
Simply:
df['Sex'] = ['Female']*1752 + ['Male']*(3101-1752)
At the end, you can shuffle your dataframe (sample returns a new frame, so assign the result back):
df = df.sample(frac=1)
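A minimal end-to-end sketch of this approach, assuming the 3101-row frame from the question. The constant names, the zero-filled placeholder data, and the random_state value are illustrative additions, not part of the original post.

```python
import numpy as np
import pandas as pd

N_ROWS = 3101     # row count used in the question's code
N_FEMALE = 1752   # number of rows that should read "Female"

columns = ['Sex', 'Race', 'Age', 'FT']
# Placeholder zeros for every column; only 'Sex' is filled in here.
df = pd.DataFrame(np.zeros((N_ROWS, len(columns)), dtype=int), columns=columns)

# Build all labels up front, then assign the whole column at once.
df['Sex'] = ['Female'] * N_FEMALE + ['Male'] * (N_ROWS - N_FEMALE)

# sample() returns a new, shuffled frame, so the result has to be assigned.
# random_state is an assumption added here to make the shuffle repeatable.
df = df.sample(frac=1, random_state=0).reset_index(drop=True)

sex_counts = df['Sex'].value_counts()
print(sex_counts)
```

Resetting the index is optional; it just renumbers the rows 0..3100 after the shuffle.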
Upvotes: 0
Reputation: 3
This should get you on your way, if I understand your question(s):
import numpy as np
import pandas as pd
datalist = ['Sex', 'Race', 'Age', 'FT']
numpy_data = np.random.choice([0,1],size=(3101, 4))
df = pd.DataFrame(data=numpy_data, columns=datalist)
df['Sex'] = df['Sex'].astype(str)
df['Sex'] = df['Sex'].replace(to_replace="0", value="Female") # replaces every "0"; limit is not a cap on how many cells get replaced
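For context on why the original attempt misbehaved, here is a small sketch of the three pitfalls that seem to be involved: np.random.randint(0, 1) can only produce 0 because the upper bound is exclusive, replace(..., inplace=True) returns None (so assigning its result back wipes out df), and the string "0" does not match the integer 0. The tiny frames and variable names below are made up for illustration. As far as I can tell from the pandas documentation, the limit argument of replace belongs to the fill-style behaviour and does not cap the number of replaced cells, so it is left out here.

```python
import numpy as np
import pandas as pd

# 1) The upper bound of randint is exclusive, so (0, 1) only ever yields 0.
only_zeros = np.random.randint(0, 1, size=1000)

# 2) With inplace=True the frame is modified and the call returns None.
frame = pd.DataFrame({'Sex': [0, 0, 0], 'Race': [0, 0, 0]})
returned = frame.replace(0, 5, inplace=True)

# 3) The string "0" does not match the integer 0, so nothing is replaced...
ints = pd.DataFrame({'Sex': [0, 0, 0]})
unchanged = ints.replace("0", "Female")

# ...unless the values are converted to strings first.
as_text = ints.astype(str).replace("0", "Female")

print(only_zeros.max(), returned)
print(unchanged)
print(as_text)
```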
Upvotes: 0
Reputation: 139
I think the loc method would be an efficient way to replace values in a column. Honestly, I'm not sure why you're trying to use the replace method, though.
df.loc[0:1752-1,'Sex']='Female'
df.loc[df.Sex!='Female','Sex']='Male'
print(df)
df['Sex'].value_counts()
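A self-contained sketch of the loc approach, under the assumption that the frame starts out as integers like in the question. Two details worth noting: label slices in loc include the end label, which is why the slice stops at 1752-1, and the column is cast to object first (my addition) so that storing strings in a former integer column does not depend on implicit upcasting.

```python
import numpy as np
import pandas as pd

N_ROWS, N_FEMALE = 3101, 1752  # numbers taken from the question

df = pd.DataFrame(np.zeros((N_ROWS, 4), dtype=int),
                  columns=['Sex', 'Race', 'Age', 'FT'])

# Assumption: cast to object up front so the column can hold strings.
df['Sex'] = df['Sex'].astype(object)

# loc slices by label and includes the end label, so this covers 1752 rows.
df.loc[0:N_FEMALE - 1, 'Sex'] = 'Female'

# Everything that is not 'Female' yet becomes 'Male'.
df.loc[df['Sex'] != 'Female', 'Sex'] = 'Male'

sex_counts = df['Sex'].value_counts()
print(sex_counts)
```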
Upvotes: 1