Michael Scofield
Michael Scofield

Reputation: 147

How to add all the data of the dataframe by identifying the common column in dataframe?

I've a dataframe DF1:

   YEAR  JAN_EARN  FEB_EARN  MAR_EARN  APR_EARN  MAY_EARN  JUN_EARN  JUL_EARN  AUG_EARN  SEP_EARN  OCT_EARN  NOV_EARN  DEC_EARN
0  2017        20        21      22.0        23      24.0      25.0      26.0      27.0        28      29.0        30        31
1  2018        30        31      32.0        33      34.0      35.0      36.0      37.0        38      39.0        40        41
2  2019        40        41      42.0        43       NaN      45.0       NaN       NaN        48      49.0        50        51
3  2017        50        51      52.0        53      54.0      55.0      56.0      57.0        58      59.0        60        61
4  2017        60        61      62.0        63      64.0       NaN      66.0       NaN        68       NaN        70        71
5  2021        70        71      72.0        73      74.0      75.0      76.0      77.0        78      79.0        80        81
6  2018        80        81       NaN        83       NaN      85.0       NaN      87.0        88      89.0        90        91

group the rows by common row in "YEAR" column and add all the data of that column. I tried to check with this:

DF2['New'] = DF1.groupby(DF1.groupby('YEAR')).sum()

The Expected Output is like: DF2;

   YEAR  JAN_EARN  FEB_EARN ......
0  2017       130       133 ......
1  2018       110       112 ......
2  2019        40        41 ......
3  2021        70        71 ......

Thank You For Your Time :)

Upvotes: 0

Views: 46

Answers (1)

Keyser Soze
Keyser Soze

Reputation: 272

You were halfway through there, just rectify some small details as following :

  • Don't assign a groupby object to a newly defined column, replace your line of 'Df2['New'] = ...' with :

    DF2 = DF1.groupby('YEAR' , as_index = False).sum().reset_index(drop = True)
    

If you wish to see all the columns relative to each year, create a list with the range of years your df has then apply a mask for each element in that list. You will obtain one dataframe per year then concatenate them with axis = 0.

Another way of doing so would be sorting DF1's years by chronological order then slicing. I'm afraid we misunderstood your question, if that's the case please develop more so we can help.

Upvotes: 2

Related Questions