Reputation: 2031
my_df = DataFrame(np.arange(1,13).reshape(4,3), columns=list('abc'))
my_df.sum(axis="rows")
O/P is
a 22
b 26
c 30
// I expect it to sum by rows thereby giving
0 6
1 15
2 24
3 33
my_df.sum(axis="columns") //helps achieve this
Why does it work counterintutively? In a similar context, drop method works as it should i.e when i write
my_df.drop(['a'],axis="columns")
// This drops column "a".
Am I missing something? Please enlighten.
Upvotes: 2
Views: 67
Reputation: 18906
It is a naming convention. The sum of the columns gives a row-wise sum. You are looking for axis='columns')
.
Ok that was interesting. In pandas normally 0 is for columns and 1 is for rows. However looking in the docs we find that the allowed params are:
axis : {index (0), columns (1)}
You are passing a param that does not exist which results in the default. This can thus be read as: The sum of the columns returns the row sum. The sum of the index returns the column sum. What you want to use it axis=1
or axis='columns'
which results in your desired output:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(1,13).reshape(4,3), columns=list('abc'))
print(df.sum(axis=1))
Returns:
0 6
1 15
2 24
3 33
dtype: int64
Upvotes: 1