Reputation: 11
This is my question:
I have a file csv like this:
SELL,NUMBER,TYPE,MONTH
-1484829.72,25782,E,3
-1337196.63,26688,E,3
-1110271.83,15750,E,3
-1079426.55,16117,E,3
-964656.26,11344,D,1
-883818.81,10285,D,2
-836068.57,14668,E,3
-818612.27,13806,E,3
-765820.92,14973,E,3
-737911.62,8685,D,2
-728828.93,8975,D,1
-632200.31,12384,E
41831481.50,18425,E,2
1835587.70,33516,E,1
1910671.45,20342,E,6
1916569.50,24088,E,6
1922369.40,25101,E,1
2011347.65,23814,E,3
2087659.35,18108,D,3
2126371.86,34803,E,2
2165531.50,35389,E,3
2231818.85,37515,E,3
2282611.90,32422,E,6
2284141.50,21199,A,1
2288121.05,32497,E,6
I want to make a groupby TYPE and sum the columns SELLS and NUMBERS making a separation between negative and positive number
I make this command:
end_result= info.groupby(['TEXTOCANAL']).agg({
'SELLS': (('negative', lambda x : x[x < 0].sum()), ('positiv', lambda x : x[x > 0].sum())),
'NUMBERS': (('negative', lambda x : x[info['SELLS'] <0].sum()), ('positive', lambda x : x[info['SELLS'] > 0].sum())),
})
And the result is the following:
SELLS NUMBERS
negative positive negative positive
TYPE
A -1710.60 5145.25 17 9
B -95.40 3391.10 1 29
C -3802.25 36428.40 191 1063
D 0.00 30.80 0 7
E -19143.30 102175.05 687 1532
But i want to make this groupby adding the column MONTH
Something like that:
1 2
SELLS NUMBERS
negative positive negative positive negative positive negative positive
TYPE
A -1710.60 5145.25 17 9 -xxx.xx xx.xx xx xx
B -95.40 3391.10 1 29
C -3802.25 36428.40 191 1063
D 0.00 30.80 0 7
E -19143.30 102175.05 687 1532
Any idea?
Thanks in advance for your help
Upvotes: 1
Views: 61
Reputation: 16683
Similar answer to RichieV's answer. I was unaware of np.sign
, which is a neat trick.
Another way to do this is that you can .assign
a flag
column with np.where
to identify a positive
or negative
. Then, groupby all non-numerical columns and move the second and third fields to the the columns with .unstack([1,2]
).
info = (info.assign(flag=np.where((info['SELL'] > 0), 'postive', 'negative'))
.groupby(['TYPE','MONTH','flag'])['SELL', 'NUMBER'].sum()
.unstack([1,2]))
output (image since multi-indexes are messy).
Upvotes: 1
Reputation: 5183
This should work:
end_result = (
info.groupby(['TYPE', 'MONTH', np.sign(info.SELL)]) # groupby negative and positive SELL
['SELL', 'NUMBER'].sum() # select columns to be aggregated
# in this case is redundant to select columns
# since those are the only two columns left
# groupby moves TYPE and MONTH as index
.unstack([1, 2]) # reshape as you need it
.reorder_levels([0, 1, 3, 2]) # to have pos/neg as last level in MultiIndex
.rename({-1: 'negative', 1: 'positive'}, axis=1, level=-1)
)
Upvotes: 1