Reputation: 35331
When I run code like this:
import pandas as pd
A = pd.DataFrame([('a', -1.374201, 35),
                  ('b', 1.415697, 29),
                  ('a', 0.233841, 18),
                  ('b', 1.550599, 30),
                  ('a', -0.178370, 63),
                  ('b', -1.235956, 42),
                  ('a', 0.088046, 2),
                  ('b', 0.074238, 84)], columns='key value other'.split())
B = A.groupby('key')['value'].mean()
C = pd.DataFrame([('a', 0.469924, 44),
                  ('b', 1.231064, 68),
                  ('a', -0.979462, 73),
                  ('b', 0.322454, 97)], columns='key value other'.split())
D = C.set_index('key')
D['value'] -= B
...the last line fails with the error:
Exception: Reindexing only valid with uniquely valued Index objects
What am I doing wrong?
Upvotes: 1
Views: 3038
Reputation: 68216
If I follow your example correctly (thanks for adding it, BTW), I believe what you need is as simple as:
D.sub(B, axis='index')
Which gives me:
In [29]: D.sub(B, axis='index')
Out[29]:
        value      other
key
a    0.777595  44.307671
a   -0.671791  73.307671
b    0.779919  67.548856
b   -0.128690  96.548856
As you can see, this also subtracts the group means from the 'other' column, which is probably not what you want. If that's a problem, you're back in the same duplicate index situation, unfortunately.
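If only 'value' should change, one possible workaround is to skip index alignment entirely and look up each row's group mean from the key column before setting the index. This is just a sketch using Series.map with the same A, B, C and D as in the question; I haven't checked it against the pandas version that raised the error above.

```python
import pandas as pd

A = pd.DataFrame([('a', -1.374201, 35),
                  ('b', 1.415697, 29),
                  ('a', 0.233841, 18),
                  ('b', 1.550599, 30),
                  ('a', -0.178370, 63),
                  ('b', -1.235956, 42),
                  ('a', 0.088046, 2),
                  ('b', 0.074238, 84)], columns='key value other'.split())
# Per-key mean of 'value', indexed by the unique keys 'a' and 'b'
B = A.groupby('key')['value'].mean()

C = pd.DataFrame([('a', 0.469924, 44),
                  ('b', 1.231064, 68),
                  ('a', -0.979462, 73),
                  ('b', 0.322454, 97)], columns='key value other'.split())

# Look up the group mean for every row via its key. C still has a
# unique default index here, so no reindexing on duplicates is needed.
C['value'] = C['value'] - C['key'].map(B)

# 'other' is untouched; only 'value' has been shifted
D = C.set_index('key')
print(D)
```

This keeps the original row order of C and leaves 'other' alone, at the cost of doing the subtraction before the duplicate-valued index is created.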
Upvotes: 3