Reputation: 49
I want to convert an age range to a numerical age value. I wrote a function, def Age(x), with if statements to do the conversion, but it doesn't work and gives the wrong result. I attached images of the steps I took and the result. The dataset I used is BlackFriday. Please help me identify my mistakes. Thank you!
Upvotes: 2
Views: 2958
Reputation: 1
A simple function that replaces each age_range with the midpoint of the range:
Here are the age ranges we have:
temp_df['age_range'].unique()
array([70, '18-25', '26-35', '36-45', '46-55', '56-70'], dtype=object)
Function to modify age
def mod_age(df):
    # Replaces each age-range label with the integer midpoint of its range.
    # Modifies df in place; assumes the default integer index 0..n-1.
    for i in range(df.shape[0]):
        if df.loc[i, 'age_range'] == 70:
            df.loc[i, 'age_range'] = 70
        elif df.loc[i, 'age_range'] == '18-25':
            df.loc[i, 'age_range'] = (18 + 25) // 2
        elif df.loc[i, 'age_range'] == '26-35':
            df.loc[i, 'age_range'] = (26 + 35) // 2
        elif df.loc[i, 'age_range'] == '36-45':
            df.loc[i, 'age_range'] = (36 + 45) // 2
        elif df.loc[i, 'age_range'] == '46-55':
            df.loc[i, 'age_range'] = (46 + 55) // 2
        elif df.loc[i, 'age_range'] == '56-70':
            df.loc[i, 'age_range'] = (56 + 70) // 2
    age_range  family_size marital_status  sum
2          70            2         Single    4
25         40            4         Single    2
5          21            2        Married    4
32         50            3         Single    3
13         30            2         Single    5
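The same midpoint idea can also be written without a row-by-row loop. What follows is only a minimal sketch, not the code used above: the helper name range_to_mid and the new column age_mid are made up for illustration, and the sample frame just reuses the labels from the unique() output.

```python
import pandas as pd

# Hypothetical sample data reusing the labels from the unique() output above.
temp_df = pd.DataFrame(
    {"age_range": [70, "18-25", "26-35", "36-45", "46-55", "56-70"]}
)

def range_to_mid(value):
    """Return the integer midpoint of a 'low-high' label; pass plain numbers through."""
    if isinstance(value, str) and "-" in value:
        low, high = value.split("-")
        return (int(low) + int(high)) // 2
    return int(value)

# Series.map applies the helper to every element and returns a new Series,
# so the original age_range column is left untouched.
temp_df["age_mid"] = temp_df["age_range"].map(range_to_mid)
print(temp_df)
```

Writing to a separate column keeps the original labels around for checking, and parsing the label means no branch has to be added for each range.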
Upvotes: 0
Reputation: 88285
Given what is shown in the result of value_counts, it seems like a simple str.extract with a fillna for ages of 55+ will do:
df.Age.str.extract(r'(?<=-)(\d+)').fillna(56)
Let's consider the following example:
df = pd.DataFrame({'Age':['26-35','36-45', '55+']})
     Age
0  26-35
1  36-45
2    55+
df.Age.str.extract(r'(?<=-)(\d+)').fillna(56).rename(columns={0:'Age'})
  Age
0  35
1  45
2  56
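One thing to keep in mind is that str.extract returns the captured digits as strings, so after fillna(56) the column holds a mix of strings and the number 56. If a truly numeric column is needed, something like the sketch below may help. It reuses the example frame from above; the variable names upper and age_num are only for illustration, and 56 is the same fill value chosen above for '55+'.

```python
import pandas as pd

df = pd.DataFrame({"Age": ["26-35", "36-45", "55+"]})

# expand=False returns a Series instead of a one-column DataFrame.
upper = df["Age"].str.extract(r"(?<=-)(\d+)", expand=False)

# Convert the captured strings to numbers first, then fill the rows that
# had no match ('55+') and cast everything to int.
age_num = pd.to_numeric(upper).fillna(56).astype(int)
print(age_num)
```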
Upvotes: 1