Reputation: 5299
I have df like this. this value is projected values so that many columns exist.
Customer seg value0 value1
A a 10 60
A b 20 50
A c 30 40
B a 40 30
B b 50 20
B c 60 10
I would like to calculate value by referring to seg
columns.
a-b-c ( a minus b minus c)
in each customers
customer value0 value1
A -40 -30
B -70 0
How can I calculate each values by grouping customers.
df.groupby(customer)
Thanks
Upvotes: 2
Views: 56
Reputation: 1551
How about this:
In [42]: df
Out[42]:
Customer seg value0 value1
0 A a 10 60
1 A b 20 50
2 A c 30 40
3 B a 40 30
4 B b 50 20
5 B c 60 10
In [43]: df.pivot('seg', 'Customer').T.eval('a - b - c').unstack(level=0)
Out[43]:
value0 value1
Customer
A -40 -30
B -70 0
If you prefer groupby
, there is another solution:
In [44]: df.groupby('Customer').apply(lambda x:
x.set_index('seg')[['value0', 'value1']].T.eval('a - b - c'))
Upvotes: 2
Reputation: 28729
Another approach : Use numpy subtract, combined with reduce:
(df.groupby('Customer')
.agg(value0=('value0',np.subtract.reduce),
value1=('value1',np.subtract.reduce))
)
value0 value1
Customer
A -40 -30
B -70 0
Upvotes: 0
Reputation: 863741
Idea is multiple values for subtract by -1
and then aggregate sum
:
#filter only a,b,c rows
df1 = df[df['seg'].isin(['a','b','c'])]
a = np.where(df1['seg'].eq('a'), 1, -1)
df1.iloc[:, 2:] *= a[:, None]
print (df1)
Customer seg value0 value1
0 A a 10 60
1 A b -20 -50
2 A c -30 -40
3 B a 40 30
4 B b -50 -20
5 B c -60 -10
df2 = df1.groupby('Customer', as_index=False).sum()
print (df2)
Customer value0 value1
0 A -40 -30
1 B -70 0
Or if want multiple by numeric columns:
df1 = df[df['seg'].isin(['a','b','c'])]
c = df1.select_dtypes(np.number).columns
a = np.where(df1['seg'].eq('a'), 1, -1)
df1[c] *= a[:, None]
df2 = df1.groupby('Customer', as_index=False).sum()
print (df2)
Customer value0 value1
0 A -40 -30
1 B -70 0
Upvotes: 4