Reputation: 1235
I have a pandas DataFrame and am looking for a way to replace the value for column = c with the mean of the values for column = a and column = b.
Being a novice at Python, I have looked at the replace function and at building a dict to perform the replacement. My failed attempt is shown below.
If anyone has any suggestions I would be thrilled.
df = pd.DataFrame({'column': ['a', ' b', 'a', 'b', 'b', 'a', 'b', 'c', 'd' ], 'value': range(0,9)})
dict = {"(df[df['column'] == 'c']['value'].values)": "df[df['column'] == 'a']['value'].mean()"}
df.replace(dict)
Upvotes: 1
Views: 476
Reputation: 862851
You can select the rows and set the column value with DataFrame.loc:
df.loc[df['column'] == 'c', 'value'] = df.loc[df['column'] == 'a', 'value'].mean()
print (df)
column value
0 a 0.000000
1 b 1.000000
2 a 2.000000
3 b 3.000000
4 b 4.000000
5 a 5.000000
6 b 6.000000
7 c 2.333333
8 d 8.000000
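The one-liner above can be unpacked into separate steps, which may make it easier to see what each part does. This is only a sketch on the question's sample data; the names source_mask, target_mask and a_mean are made up for illustration. It also casts value to float first, on the assumption that you are running a recent pandas version, since those warn about (and may reject) writing a float into an integer column.

```python
import pandas as pd

df = pd.DataFrame({'column': ['a', ' b', 'a', 'b', 'b', 'a', 'b', 'c', 'd'],
                   'value': range(0, 9)})

# Assumption: cast up front so a float mean can be stored without a dtype warning
df['value'] = df['value'].astype(float)

# Boolean masks: True for the rows to read from / to write to
source_mask = df['column'] == 'a'
target_mask = df['column'] == 'c'

# Mean of 'value' over the 'a' rows: (0 + 2 + 5) / 3
a_mean = df.loc[source_mask, 'value'].mean()

# Assign that scalar to 'value' in every 'c' row, modifying df in place
df.loc[target_mask, 'value'] = a_mean
```

Selecting with df.loc[mask, 'value'] in a single step, rather than chaining df[mask]['value'], is what lets the assignment reach the original frame instead of a temporary copy.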
Or, if you need to test for multiple values, use Series.isin:
df.loc[df['column'] == 'c', 'value'] = df.loc[df['column'].isin(['a','b']), 'value'].mean()
print (df)
column value
0 a 0.000000
1 b 1.000000
2 a 2.000000
3 b 3.000000
4 b 4.000000
5 a 5.000000
6 b 6.000000
7 c 3.333333
8 d 8.000000
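One thing to watch in the sample data: row 1 holds ' b' with a leading space, so isin(['a','b']) does not match it, and the mean above is taken over six rows (20 / 6) rather than seven. If that space is unintended (an assumption on my part), stripping whitespace before comparing is one possible fix. A small sketch, with raw_mean and clean_mean as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'column': ['a', ' b', 'a', 'b', 'b', 'a', 'b', 'c', 'd'],
                   'value': range(0, 9)})

# Assumption: cast up front so a float mean can be stored without a dtype warning
df['value'] = df['value'].astype(float)

# With the raw labels, ' b' is skipped: (0 + 2 + 5 + 3 + 4 + 6) / 6
raw_mean = df.loc[df['column'].isin(['a', 'b']), 'value'].mean()

# Strip surrounding whitespace so ' b' becomes 'b'
df['column'] = df['column'].str.strip()

# Now all seven 'a'/'b' rows count: (0 + 1 + 2 + 3 + 4 + 5 + 6) / 7
clean_mean = df.loc[df['column'].isin(['a', 'b']), 'value'].mean()

df.loc[df['column'] == 'c', 'value'] = clean_mean
```

If the leading space is meaningful in your real data, skip the strip step and list ' b' explicitly in isin instead.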
Upvotes: 2