ℕʘʘḆḽḘ

Reputation: 19395

Strange ix selection in pandas with duplicate indices

There is something I don't understand about the ix selector in pandas.

Consider the following DataFrame:

import pandas as pd
dfnu = pd.DataFrame({'A': [7, 1, 2, 3, 4], 'B': [7, 8, 9, 1, 1]}, index=list('AABCD'))

Now look at this output:

dfnu['A']<2
Out[128]: 
A    False
A     True
B    False
C    False
D    False
Name: A, dtype: bool


dfnu['test']=dfnu.ix[dfnu['A']<2,'A']
dfnu
Out[127]: 
   A  B  test
A  7  7     1
A  1  8     1
B  2  9   NaN
C  3  1   NaN
D  4  1   NaN

What is going on here? Why on earth is test equal to 1 in the first row?

Upvotes: 0

Views: 55

Answers (2)

MaxU - stand with Ukraine

Reputation: 210882

Assign through the indexer instead, so that the value is set only on the rows the mask selects:

dfnu.ix[dfnu.A < 2, 'test'] = 1

Output

In [289]: dfnu
Out[289]:
   A  B
A  7  7
A  1  8
B  2  9
C  3  1
D  4  1

In [290]: dfnu.ix[dfnu.A < 2, 'test'] = 1

In [291]: dfnu
Out[291]:
   A  B  test
A  7  7   NaN
A  1  8   1.0
B  2  9   NaN
C  3  1   NaN
D  4  1   NaN

This gives you the result you wanted.
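Note that .ix was deprecated in pandas 0.20 and removed in 1.0, so on a recent version the same assignment would be written with .loc. A minimal sketch, assuming a current pandas install (the name mask is introduced here only for illustration):

```python
import pandas as pd

dfnu = pd.DataFrame({'A': [7, 1, 2, 3, 4], 'B': [7, 8, 9, 1, 1]},
                    index=list('AABCD'))

# Boolean mask: True only for the second row (A == 1).
mask = dfnu['A'] < 2

# Setting through .loc with a boolean mask works row by row,
# so the duplicate 'A' labels are not matched against each other.
dfnu.loc[mask, 'test'] = 1

print(dfnu)
```

Rows not covered by the mask are left as NaN in the new column.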

Upvotes: 1

BrenBarn

Reputation: 251438

Since there is only one row with A<2, dfnu.ix[dfnu['A']<2, 'A'] has only one value:

>>> dfnu.ix[dfnu['A']<2, 'A']
A    1
Name: A, dtype: int64

When you assign this back into dfnu, the values are matched on the index. In other words, because the one row shown above has A as the index, its value (1) is assigned to every row in the original DataFrame that has A as the index. This is also why you get NaN for the other rows; since they don't have A as the index, no value is assigned for them.
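The alignment can be seen in isolation with a small sketch. It uses .loc in place of .ix (which was removed in pandas 1.0), and the names mask, selected and test_positional are illustrative, not from the question. The second assignment shows one way to avoid label matching, by passing the mask as a plain NumPy array so it is applied by position:

```python
import pandas as pd

dfnu = pd.DataFrame({'A': [7, 1, 2, 3, 4], 'B': [7, 8, 9, 1, 1]},
                    index=list('AABCD'))
mask = dfnu['A'] < 2

# A one-row Series: label 'A', value 1.
selected = dfnu.loc[mask, 'A']

# Column assignment aligns on index labels, so the single value
# labelled 'A' is broadcast to both rows labelled 'A'.
dfnu['test'] = selected

# Series.where with an array condition is positional: values are kept
# where the mask is True and replaced by NaN elsewhere.
dfnu['test_positional'] = dfnu['A'].where(mask.to_numpy())

print(dfnu)
```

With this frame, test is filled in for both 'A' rows, while test_positional is filled in only for the row where A is actually less than 2.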

Upvotes: 1
