Reputation: 3713
If I have a data frame with nested headers like this:
John Joan
Smith, Jones, Smith,
Index1 234 432 324
Index2 2987 234 4354
...how do I create a new column that sums the values of each row?
I tried df['sum']=df['John']+df['Joan']
but that resulted in this error:
ValueError: Wrong number of items passed 3, placement implies 1
Upvotes: 1
Views: 1607
Reputation: 294258
If I understand you correctly:
...how do I create a new column that sums the values of each row?
The sum of each row is just
df.sum(axis=1)
The trick is getting to be a new column. You need to ensure the column you add has 2 levels of column heading.
df.loc[:, ('sum', 'sum')] = df.sum(axis=1)
I'm not happy with it, but it works.
Joan John sum
Smith, Jones, Smith, sum
Index1 324 432 234 990
Index2 4354 234 2987 7575
Upvotes: 1
Reputation: 109546
Dance Party, haven't heard from you in a while.
You want to groupby, but specify a level and axis. axis=1
means you want to sum the rows instead of the columns. level=0
is the top row of the columns.
df = pd.DataFrame({
('John', 'Smith,'): [234, 2987],
('John', 'Jones,'): [432, 234],
('Joan', 'Smith,'): [324, 4354]}, index=['Index1', 'Index2'])
>>> df.groupby(level=0, axis=1).sum()
Joan John
Index1 324 666
Index2 4354 3221
Upvotes: 1