Onilol
Onilol

Reputation: 1339

Pandas sum rows by step

I have the following table:

           A  A_pct     B   B_pct

Player1  1.0   12.5  15.0   18.75
Player2  7.0   87.5  65.0   81.25
Total    8.0  100.0  80.0  100.00

I'm trying to add a column at the end with the sum of all the columns that don't have a _pct suffix.

I could add the column by using sum on the non-pct dataframe but I ended with a NaN value at the end:

           A  A_pct     B   B_pct  Total

Player1  1.0   12.5  15.0   18.75   16.0
Player2  7.0   87.5  65.0   81.25   72.0
Total    8.0  100.0  80.0  100.00    NaN

which I can fix with df.['Total'].fillna(100, inplace=True) but that seems cumbersome..

Does an option to sum by step exist? Something like sum([i for i in df.columns[::2]]

Upvotes: 0

Views: 906

Answers (1)

Dillon
Dillon

Reputation: 999

This will select all columns without the '_pct' and sum by row

df['Total'] = df[df.columns[~df.columns.str.contains('_pct')]].sum(axis=1)

df
Out[]:
           A  A_pct     B   B_pct  Total
Player1  1.0   12.5  15.0   18.75   16.0
Player2  7.0   87.5  65.0   81.25   72.0
Total    8.0  100.0  80.0  100.00   88.0

Step by step

# Get the names of all columns withoput the '_pct' string
columns_names_without_pct = df.columns[~df.columns.str.contains('_pct')]

# Select only the part of the dataframe that contains these columns
df_without_pct = df[columns_names_without_pct]

# Sum along axis 1, the horizontal axis
df_without_pct.sum(axis=1)

# Set this to a new column called  'Total'
df['Total'] = df_without_pct.sum(axis=1)

Upvotes: 2

Related Questions