Split a dataframe and sum [pandas]

Question

I have the following dataframe (dummy data):

            score   GDP
country     
Bangladesh  6      12
Bolivia     4      10
Nigeria     3      9
Pakistan    2      3
Ghana       1      3
India       1      3
Algeria     1      3

I want to split it into two groups based on GDP and sum the score of each group. Countries with a GDP below 9 should count as poor and the rest as rich, giving:

         sum_score
country
rich            13
poor             5

sacuL · Accepted Answer

You can use np.where to label each row as rich or poor, then group by that label and sum the score:

import numpy as np
df['country_cat'] = np.where(df.GDP < 9, 'poor', 'rich')  # label each row by GDP
df.groupby('country_cat')['score'].sum()                  # total score per label

country_cat
poor     5
rich    13
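
For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the dummy data from the question and runs the same two steps. The construction of df and the variable name result are my own assumptions; only the np.where and groupby lines come from the answer above.

```python
import numpy as np
import pandas as pd

# Rebuild the dummy data from the question (construction is an assumption).
df = pd.DataFrame(
    {
        "country": ["Bangladesh", "Bolivia", "Nigeria", "Pakistan",
                    "Ghana", "India", "Algeria"],
        "score": [6, 4, 3, 2, 1, 1, 1],
        "GDP": [12, 10, 9, 3, 3, 3, 3],
    }
).set_index("country")

# Label each row: GDP below 9 is 'poor', everything else is 'rich'.
df["country_cat"] = np.where(df["GDP"] < 9, "poor", "rich")

# Sum the score within each label.
result = df.groupby("country_cat")["score"].sum()
print(result)
```

Note that a GDP of exactly 9 (Nigeria) lands in the rich group, because the condition is a strict less-than.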

You can also do the same in one step, by not creating the extra column for the category (but IMO the code becomes less readable):

df.groupby(np.where(df.GDP < 9, 'poor', 'rich'))['score'].sum()
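
If you need the result in exactly the shape shown in the question (a sum_score column, with rich listed first), something like the following should work. This is only a sketch: the data construction and the names labels and out are assumptions of mine, and the rename/reorder steps are an addition on top of the one-step groupby.

```python
import numpy as np
import pandas as pd

# Dummy data from the question (construction is an assumption).
df = pd.DataFrame(
    {
        "country": ["Bangladesh", "Bolivia", "Nigeria", "Pakistan",
                    "Ghana", "India", "Algeria"],
        "score": [6, 4, 3, 2, 1, 1, 1],
        "GDP": [12, 10, 9, 3, 3, 3, 3],
    }
).set_index("country")

# Group directly by the label array, without adding a column to df.
labels = np.where(df["GDP"] < 9, "poor", "rich")
out = df.groupby(labels)["score"].sum().rename("sum_score").to_frame()

# groupby sorts the keys alphabetically, so reorder to put 'rich' first
# and name the index as in the desired output.
out = out.loc[["rich", "poor"]]
out.index.name = "country"
print(out)
```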
