Reputation: 1470
I'm new to pandas, and I want to create a new column in my pandas dataframe. I'd like to groupby one column, and then divide two other columns together.
This perfectly works:
df['new_col'] = (df.col2/df.col3)
However, when I groupby another column, what I have doesn't work:
df['new_col'] = df.groupby('col1')(df.col2/df.col3)
Does anyone know how I can rewrite the above code? Thanks.
Upvotes: 2
Views: 6363
Reputation: 164713
This may be what you are looking for:
import pandas as pd
df = pd.DataFrame([['A', 4, 3], ['B', 2, 4], ['C', 5, 1], ['A', 5, 1], ['B', 2, 7]],
columns=['Col1', 'Col2', 'Col3'])
# Col1 Col2 Col3
# 0 A 4 3
# 1 B 2 4
# 2 C 5 1
# 3 A 5 1
# 4 B 2 7
df['Col4'] = df['Col2'] / df['Col3']
df = df.sort_values('Col1')
# Col1 Col2 Col3 Col4
# 0 A 4 3 1.333333
# 3 A 5 1 5.000000
# 1 B 2 4 0.500000
# 4 B 2 7 0.285714
# 2 C 5 1 5.000000
Or if you need to perform a groupby.sum
first:
df = df.groupby('Col1', as_index=False).sum()
df['Col4'] = df['Col2'] / df['Col3']
# Col1 Col2 Col3 Col4
# 0 A 9 4 2.250000
# 1 B 4 11 0.363636
# 2 C 5 1 5.000000
Upvotes: 2
Reputation: 294358
Setup
df = pd.DataFrame(dict(
Col1=list('AAAABBBB'),
Col2=range(1, 9, 1),
Col3=range(9, 1, -1)
))
df
df.groupby('Col1').sum().eval('Col4 = Col2 / Col3')
Col1 Col2 Col3
0 A 1 9
1 A 2 8
2 A 3 7
3 A 4 6
4 B 5 5
5 B 6 4
6 B 7 3
7 B 8 2
Solution
Using pd.DataFrame.eval
We can use eval
to create new columns in a pipeline
df.groupby('Col1', as_index=False).sum().eval('Col4 = Col2 / Col3')
Col1 Col2 Col3 Col4
0 A 10 30 0.333333
1 B 26 14 1.857143
Upvotes: 6