Fares K. A.
Fares K. A.

Reputation: 1285

Sum of specific rows in a dataframe (Pandas)

I'm given a set of the following data:

week  A      B      C      D      E
1     243    857    393    621    194
2     644    576    534    792    207
3     946    252    453    547    436
4     560    100    864    663    949
5     712    734    308    385    303

I’m asked to find the sum of each column for specified rows/a specified number of weeks, and then plot those numbers onto a bar chart to compare A-E.

Assuming I have the rows I need (e.g. df.iloc[2:4,:]), what should I do next? My assumption is that I need to create a mask with a single row that includes the sum of each column, but I'm not sure how I go about doing that.

I know how to do the final step (i.e. .plot(kind='bar'), I just need to know what the middle step is to obtain the sums I need.

Upvotes: 5

Views: 31072

Answers (2)

jezrael
jezrael

Reputation: 862601

You can use for select by positions iloc, sum and Series.plot.bar:

df.iloc[2:4].sum().plot.bar()

graph1

Or if want select by names of index (here weeks) use loc:

df.loc[2:4].sum().plot.bar()

graph2

Difference is iloc exclude last position:

print (df.loc[2:4])
        A    B    C    D    E
week                         
2     644  576  534  792  207
3     946  252  453  547  436
4     560  100  864  663  949

print (df.iloc[2:4])
        A    B    C    D    E
week                         
3     946  252  453  547  436
4     560  100  864  663  949

And if need also filter columns by positions:

df.iloc[2:4, :4].sum().plot.bar()  

And by names (weeks):

df.loc[2:4, list('ABCD')].sum().plot.bar()

Upvotes: 6

sacuL
sacuL

Reputation: 51335

All you need to do is call .sum() on your subset of the data:

df.iloc[2:4,:].sum()

Returns:

week       7
A       1506
B        352
C       1317
D       1210
E       1385
dtype: int64

Furthermore, for plotting, I think you can probably get rid of the week column (as the sum of week numbers is unlikely to mean anything):

df.iloc[2:4,1:].sum().plot(kind='bar')
# or
df[list('ABCDE')].iloc[2:4].sum().plot(kind='bar')

plot

Upvotes: 2

Related Questions