Reputation: 583

Append new column to DF after sum?

I have a sample dataframe below:

sn C1-1 C1-2 C1-3 H2-1 H2-2 K3-1 K3-2
1   4     3    5    4    1    4    2
2   2     2    0    2    0    1    2
3   1     2    0    0    2    1    2

I will like to sum based on the prefix of C1, H2, K3 and output three new columns with the total sum. The final result is this:

sn total_c1 total_h2 total_k3
1      12      5        6
2      4       2        3
3      3       2        3

What I have tried on my original df:

lst = ["C1", "H2", "K3"]
lst2 = ["total_c1", "total_h2", "total_k3"]

for k in lst:
    idx = df.columns.str.startswith(i)
    for j in lst2:
        df[j] = df.iloc[:,idx].sum(axis=1)
        df1 = df.append(df, sort=False)

But I kept getting error

IndexError: Item wrong length 35 instead of 36.

I can't figure out how to append the new total column to produce my end result in the loop.

Any help will be appreciated (or better suggestion as oppose to loop). Thank you.

Upvotes: 1

Answers (3)

sammywemmy

Reputation: 28699

Another option, where we create a dictionary to groupby the columns:

mapping = {entry: f"total_{entry[:2]}" for entry in df.columns[1:]}
result = df.groupby(mapping, axis=1).sum()
result.insert(0, "sn", df.sn)
result

   sn   total_C1    total_H2    total_K3
0   1      12          5          6
1   2      4           2          3
2   3      3           2          3

Upvotes: 0

BENY

Reputation: 323306

Let us try ,split then groupby with it with axis=1

out = df.groupby(df.columns.str.split('-').str[0],axis=1).sum().set_index('sn').add_prefix('Total_').reset_index()
Out[84]: 
   sn  Total_C1  Total_H2  Total_K3
0   1        12         5         6
1   2         4         2         3
2   3         3         2         3

Upvotes: 0

Quang Hoang

Reputation: 150765

You can use groupby:

# columns of interest
cols = df.columns[1:]

col_groups = cols.str.split('-').str[0]
out_df = df[['sn']].join(df[cols].groupby(col_groups, axis=1)
                            .sum()
                            .add_prefix('total_')
                        )

Output:

   sn  total_C1  total_H2  total_K3
0   1        12         5         6
1   2         4         2         3
2   3         3         2         3

Upvotes: 1

Append new column to DF after sum?

Answers (3)

Related Questions