Reputation: 313
I am trying to apply pandas groupby to a column which contains floats and strings. The DataFrame looks like:
name value
0 var_1 1.4
1 var_2 1110
3 var_2 900
4 var_3 'some_str'
5 var_1 2.7
I am trying to apply the groupby method so that the output dataframe looks something like:
name value
0 var_1 2.05
1 var_2 1005
2 var_3 'some_str'
i.e. to get the average of all values that are recorded multiple times, and preserve the non-numeric values as they are.
If the column was only made up of numeric types this would be simple enough to implement as:
new_df = df.groupby('name').mean().reset_index()
Is there an easy way to handle the mixed types, which make the method above inapplicable?
Upvotes: 1
Views: 475
Reputation: 863226
Use a try-except statement:
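# If needed, convert numeric strings to numbers and keep the other values as they are
df['value'] = pd.to_numeric(df['value'], errors='coerce').fillna(df['value'])

def f(x):
    try:
        # Works when every value in the group is numeric
        return x.mean()
    except (TypeError, ValueError):
        # Non-numeric group: join the strings instead
        return ','.join(x)

new_df = df.groupby('name')['value'].apply(f).reset_index()
print(new_df)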
name value
0 var_1 2.05
1 var_2 1005
2 var_3 'some_str'
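If you would rather not rely on an exception, here is a minimal self-contained sketch of the same idea that checks each group explicitly. The sample DataFrame is rebuilt from the question, and the helper name `mean_or_join` is my own (hypothetical, not part of pandas). It assumes a group should be averaged only when every value in it can be parsed as a number, so a group that contains a genuine NaN would be treated as non-numeric.

```python
import pandas as pd

# Sample data mirroring the question: numbers and strings in one column
df = pd.DataFrame({
    "name": ["var_1", "var_2", "var_2", "var_3", "var_1"],
    "value": [1.4, 1110, 900, "some_str", 2.7],
})

def mean_or_join(values):
    """Average the group if every entry is numeric, otherwise join as text."""
    # Entries that cannot be parsed as numbers become NaN
    numeric = pd.to_numeric(values, errors="coerce")
    if numeric.notna().all():
        return numeric.mean()
    # At least one non-numeric entry: keep the originals, comma-separated
    return ",".join(str(v) for v in values)

new_df = df.groupby("name")["value"].apply(mean_or_join).reset_index()
print(new_df)
```

The resulting `value` column has object dtype because it holds both floats and strings, so you may need to convert it again before doing further arithmetic on it.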
Upvotes: 2