Reputation: 11

panda python how to make a groupby more columns

This is my question:

I have a file csv like this:

SELL,NUMBER,TYPE,MONTH
-1484829.72,25782,E,3
-1337196.63,26688,E,3
-1110271.83,15750,E,3
-1079426.55,16117,E,3
-964656.26,11344,D,1
-883818.81,10285,D,2
-836068.57,14668,E,3
-818612.27,13806,E,3
-765820.92,14973,E,3
-737911.62,8685,D,2
-728828.93,8975,D,1
-632200.31,12384,E
41831481.50,18425,E,2
1835587.70,33516,E,1
1910671.45,20342,E,6
1916569.50,24088,E,6
1922369.40,25101,E,1
2011347.65,23814,E,3
2087659.35,18108,D,3
2126371.86,34803,E,2
2165531.50,35389,E,3
2231818.85,37515,E,3
2282611.90,32422,E,6
2284141.50,21199,A,1
2288121.05,32497,E,6

I want to make a groupby TYPE and sum the columns SELLS and NUMBERS making a separation between negative and positive number

I make this command:

end_result= info.groupby(['TEXTOCANAL']).agg({
                                                'SELLS': (('negative', lambda x : x[x < 0].sum()), ('positiv', lambda x : x[x > 0].sum())),
                                                'NUMBERS': (('negative', lambda x : x[info['SELLS'] <0].sum()), ('positive', lambda x : x[info['SELLS'] > 0].sum())),
                                                })

And the result is the following:

                          SELLS                      NUMBERS
                   negative    positive     negative  positive
TYPE
A                  -1710.60    5145.25           17       9
B                  -95.40      3391.10           1        29
C                  -3802.25    36428.40          191      1063
D                   0.00       30.80             0        7
E                  -19143.30   102175.05         687      1532

But i want to make this groupby adding the column MONTH

Something like that:

                    1                             2
                                                         SELLS                    NUMBERS
                   negative    positive     negative  positive            negative    positive  negative  positive
TYPE
A                  -1710.60    5145.25           17       9         -xxx.xx    xx.xx    xx    xx
B                  -95.40      3391.10           1        29
C                  -3802.25    36428.40          191      1063
D                   0.00       30.80             0        7
E                  -19143.30   102175.05         687      1532

Any idea?

Thanks in advance for your help

Upvotes: 1

Answers (2)

David Erickson

Reputation: 16683

Similar answer to RichieV's answer. I was unaware of np.sign, which is a neat trick.

Another way to do this is that you can .assign a flag column with np.where to identify a positive or negative. Then, groupby all non-numerical columns and move the second and third fields to the the columns with .unstack([1,2]).

info = (info.assign(flag=np.where((info['SELL'] > 0), 'postive', 'negative'))
            .groupby(['TYPE','MONTH','flag'])['SELL', 'NUMBER'].sum()
            .unstack([1,2]))

output (image since multi-indexes are messy).

Upvotes: 1

RichieV

Reputation: 5183

This should work:

end_result = (
    info.groupby(['TYPE', 'MONTH', np.sign(info.SELL)]) # groupby negative and positive SELL
    ['SELL', 'NUMBER'].sum() # select columns to be aggregated
    # in this case is redundant to select columns
    # since those are the only two columns left
    # groupby moves TYPE and MONTH as index
    .unstack([1, 2]) # reshape as you need it
    .reorder_levels([0, 1, 3, 2]) # to have pos/neg as last level in MultiIndex
    .rename({-1: 'negative', 1: 'positive'}, axis=1, level=-1)
    )

Upvotes: 1

panda python how to make a groupby more columns

Answers (2)

Related Questions