Reputation: 4928
I am using pandas 1.14.
I have a DataFrame that looks like this:
              col1  col2  ....
A  B C D  E
11 1 1 1  1      2     3
          3      3     4
       30 3     10     2
...                ...
22 3 4 5  6      3     1
df.index
outputs
MultiIndex([('11', '1', '1', '1', '1'),
('11', '1', '1', '1', '3'),
('11', '1', '1', '30', '3'),
...
('22', '3', '4', '5', '6')],
names=["A","B","C", "D", "E"], length=10000)
df.columns
outputs
Index(["col1", "col2", ...], dtype="object")
What I want to do is add both columns and divide by 2. With a single-index DataFrame I would usually do:
df["new"] = (df["col1"] + df["col2"])/2
How can I do this with a MultiIndex DataFrame?
My desired DataFrame should look like this:
              col1  col2  new
A  B C D  E
11 1 1 1  1      2     3  2.5
          3      3     4  3.5
       30 3     10     2    6
...                ...
22 3 4 5  6      3     1    2
Thanks in advance!
Upvotes: 0
Views: 119
Reputation: 31146
No special treatment is needed; the standard techniques work. My preference is to always use assign():
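import pandas as pd

df = pd.DataFrame({"A":[11],"B":[1],"C":[1],"D":[1],"E":[1],"col1":[2],"col2":[3]})
df = df.set_index(["A","B","C","D","E"])
# select col1 and col2 explicitly so any other columns stay out of the sum
df = df.assign(new=lambda dfa: dfa[["col1", "col2"]].sum(axis=1)/2)
print(df.to_string())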
            col1  col2  new
A  B C D E
11 1 1 1 1     2     3  2.5
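One caveat with a row-wise sum: `sum(axis=1)` over a whole frame adds every column, so if the real frame has more columns than col1 and col2 (the `....` in the question suggests it might), the two columns need to be selected first. A minimal sketch, assuming a hypothetical extra column `col3` and made-up values:

```python
import pandas as pd

# Hypothetical frame: col3 is an assumed extra column that must not enter the average.
df = pd.DataFrame(
    {
        "A": [11, 11], "B": [1, 1], "C": [1, 1], "D": [1, 1], "E": [1, 3],
        "col1": [2, 3], "col2": [3, 4], "col3": [100, 200],
    }
).set_index(["A", "B", "C", "D", "E"])

# Restrict the row-wise sum to the two columns of interest.
df = df.assign(new=lambda dfa: dfa[["col1", "col2"]].sum(axis=1) / 2)

print(df.to_string())
```

The MultiIndex plays no part in the calculation here; `assign` returns a new frame with the same index and one extra column.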
Upvotes: 0
Reputation: 6337
I ran an experiment, and your approach should work.
import pandas as pd

df = pd.DataFrame({'a':[1,2,3,4], 'b':[2,3,4,5]}, index=[['1', '1', '2', '2'], ['1','2','1','2']])
df
>>>
     a  b
1 1  1  2
  2  2  3
2 1  3  4
  2  4  5
Your approach:
df['new'] = (df['a'] + df['b']) / 2
df
>>>
     a  b  new
1 1  1  2  1.5
  2  2  3  2.5
2 1  3  4  3.5
  2  4  5  4.5
Upvotes: 0
Reputation: 2887
Your solution should work for MultiIndexes as well:
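In [13]: index = pd.MultiIndex.from_tuples([('11','1','1','1','1'), ('11','1','1','1','3'), ('11','1','1','30','3'), ('22','3','4','5','6')], names=['A','B','C','D','E'])  # assumed: rebuilt from the df.index shown in the question
In [14]: df = pd.DataFrame([[2,3],[3,4],[10,2],[3,1]], columns=['col1', 'col2'], index=index)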
In [15]: df
Out[15]:
              col1  col2
A  B C D  E
11 1 1 1  1      2     3
          3      3     4
       30 3     10     2
22 3 4 5  6      3     1
In [16]: df['new'] = (df['col1'] + df['col2'])/2
In [17]: df
Out[17]:
              col1  col2  new
A  B C D  E
11 1 1 1  1      2     3  2.5
          3      3     4  3.5
       30 3     10     2  6.0
22 3 4 5  6      3     1  2.0
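As for why this works: arithmetic between two columns of the same frame returns a Series that carries the frame's own index, so the assignment lines up row by row regardless of how many levels the index has. A small sketch to check that, assuming the index is rebuilt from the tuples shown in the question (the variable names `avg`, `same_index` and `value` are mine):

```python
import pandas as pd

# Assumed reconstruction of the question's index (string labels, five levels).
index = pd.MultiIndex.from_tuples(
    [
        ("11", "1", "1", "1", "1"),
        ("11", "1", "1", "1", "3"),
        ("11", "1", "1", "30", "3"),
        ("22", "3", "4", "5", "6"),
    ],
    names=["A", "B", "C", "D", "E"],
)
df = pd.DataFrame(
    [[2, 3], [3, 4], [10, 2], [3, 1]], columns=["col1", "col2"], index=index
)

# Column arithmetic yields a Series indexed by the same MultiIndex.
avg = (df["col1"] + df["col2"]) / 2
same_index = avg.index.equals(df.index)

# Assigning it back therefore aligns label by label.
df["new"] = avg

# Look up a single row by its full index tuple.
value = df.loc[("11", "1", "1", "30", "3"), "new"]
print(same_index, value)
```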
Upvotes: 1