perpetuallyperplexed

Reputation: 15

Boolean indexing in pandas dataframes

I'm trying to apply boolean indexing to a pandas DataFrame.

nm - stores the names of players

ag - stores the player ages

sc - stores the scores

capt - stores boolean index values

import pandas as pd
nm = pd.Series(['p1', 'p2', 'p3', 'p4'])
ag = pd.Series([12, 17, 14, 19])
sc = pd.Series([120, 130, 150, 100])
capt = [True, False, True, True]
Cricket = pd.DataFrame({"Name": nm, "Age": ag, "Score": sc}, index=capt)
print(Cricket)

Output:

       Name  Age  Score
True   NaN  NaN    NaN
False  NaN  NaN    NaN
True   NaN  NaN    NaN
True   NaN  NaN    NaN

Whenever I run the code above, I get a DataFrame filled with NaN values. The only case in which this seems to work is when capt doesn't have repeating elements.

That is, when capt=[False, True] (and reasonable values are given for nm, ag and sc), this code works as expected.

I'm running Python 3.8.5 and pandas 1.1.1. Is this deprecated functionality?

Desired output:

        Name  Age  Score
True    p1   12    120
False   p2   17    130
True    p3   14    150
True    p4   19    100

Upvotes: 1

Views: 80

Answers (1)

jezrael

Reputation: 862681

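Set the index of each Series to capt to avoid a mismatch between the default RangeIndex of each Series and the new index values from capt. The DataFrame constructor aligns each Series to the requested index by label, and labels it cannot find are filled with NaN: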

capt = [True, False, True, True]
nm = pd.Series(['p1', 'p2', 'p3', 'p4'], index=capt)
ag = pd.Series([12, 17, 14, 19], index=capt)
sc = pd.Series([120, 130, 150, 100], index=capt)

Cricket = pd.DataFrame({"Name": nm, "Age": ag, "Score": sc})
print(Cricket)
      Name  Age  Score
True    p1   12    120
False   p2   17    130
True    p3   14    150
True    p4   19    100
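
If the goal is only to label the existing rows with capt, one possible alternative (a sketch, not part of the original answer) is to pass values that carry no index of their own, such as NumPy arrays or plain lists. With no index on the values there is nothing to align, so they are assigned to the new index by position:

```python
import pandas as pd

capt = [True, False, True, True]
nm = pd.Series(['p1', 'p2', 'p3', 'p4'])
ag = pd.Series([12, 17, 14, 19])
sc = pd.Series([120, 130, 150, 100])

# .to_numpy() drops each Series' RangeIndex, so the constructor
# places the values on the new index by position, not by label.
Cricket = pd.DataFrame(
    {"Name": nm.to_numpy(), "Age": ag.to_numpy(), "Score": sc.to_numpy()},
    index=capt,
)
print(Cricket)
```

This keeps the original Series untouched, which may be preferable if they are reused elsewhere with their default index.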

Detail:

print(pd.Series(['p1','p2', 'p3', 'p4'])) 
0    p1
1    p2
2    p3
3    p4
dtype: object

print(pd.Series(['p1','p2', 'p3', 'p4'], index=capt)) 
True     p1
False    p2
True     p3
True     p4
dtype: object
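
The same label alignment can be illustrated with string labels, which avoids any confusion between booleans and integers. This is a minimal sketch; the labels 'a' to 'd' and the variable names are made up for the illustration:

```python
import pandas as pd

s = pd.Series(['p1', 'p2', 'p3', 'p4'])   # default labels 0, 1, 2, 3
labels = ['a', 'b', 'c', 'd']             # hypothetical new labels

# None of 'a'..'d' exist in s, so alignment finds nothing and fills NaN.
mismatched = pd.DataFrame({"Name": s}, index=labels)

# Giving the Series the same labels up front makes the alignment succeed.
matched = pd.DataFrame({"Name": pd.Series(s.to_numpy(), index=labels)})

print(mismatched)
print(matched)
```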

Boolean indexing, by contrast, is filtering: a boolean list passed inside [] keeps only the rows where the value is True:

capt = [True, False, True, True]
nm = pd.Series(['p1', 'p2', 'p3', 'p4'])
ag = pd.Series([12, 17, 14, 19])
sc = pd.Series([120, 130, 150, 100])

Cricket = pd.DataFrame({"Name": nm, "Age": ag, "Score": sc})
print(Cricket)
  Name  Age  Score
0   p1   12    120
1   p2   17    130
2   p3   14    150
3   p4   19    100

print(Cricket[capt])
  Name  Age  Score
0   p1   12    120
2   p3   14    150
3   p4   19    100
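
A boolean mask can also be stored as a column or computed from a condition. The following is a sketch under assumptions: the column name "Captain" and the age threshold are invented for the example and do not come from the question.

```python
import pandas as pd

Cricket = pd.DataFrame({
    "Name": ['p1', 'p2', 'p3', 'p4'],
    "Age": [12, 17, 14, 19],
    "Score": [120, 130, 150, 100],
})

# Keep the flags as data rather than as the index (hypothetical column name).
Cricket["Captain"] = [True, False, True, True]

# Filter with the stored boolean column.
captains = Cricket[Cricket["Captain"]]

# Filter with a mask computed from a condition, selecting two columns via .loc.
older = Cricket.loc[Cricket["Age"] > 13, ["Name", "Age"]]

print(captains)
print(older)
```

Storing the flags in a column keeps the index unique, which tends to make later lookups and joins easier to reason about.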

Upvotes: 1
