Reputation: 15
I'm trying to apply boolean indexing to a pandas DataFrame.
nm - stores the player names
ag - stores the player ages
sc - stores the player scores
capt - stores the boolean index values
import pandas as pd
nm=pd.Series(['p1','p2', 'p3', 'p4'])
ag=pd.Series([12,17,14, 19])
sc=pd.Series([120, 130, 150, 100])
capt=[True, False, True, True]
Cricket=pd.DataFrame({"Name":nm,"Age":ag ,"Score":sc}, index=capt)
print(Cricket)
Output:
Name Age Score
True NaN NaN NaN
False NaN NaN NaN
True NaN NaN NaN
True NaN NaN NaN
Whenever I run the code above, I get a DataFrame filled with NaN values. The only case in which this seems to work is when capt has no repeated elements,
i.e. when capt=[False, True]
(and nm, ag and sc are given matching values), the code works as expected.
I'm running Python 3.8.5 and pandas 1.1.1. Is this deprecated functionality?
Desired output:
Name Age Score
True p1 12 120
False p2 17 130
True p3 14 150
True p4 19 100
Upvotes: 1
Views: 80
Reputation: 862681
Set the index of each Series to capt. This avoids a mismatch between the default RangeIndex of each Series and the new index values from capt:
capt=[True, False, True, True]
nm=pd.Series(['p1','p2', 'p3', 'p4'], index=capt)
ag=pd.Series([12,17,14, 19], index=capt)
sc=pd.Series([120, 130, 150, 100], index=capt)
Cricket=pd.DataFrame({"Name":nm,"Age":ag ,"Score":sc})
print(Cricket)
Name Age Score
True p1 12 120
False p2 17 130
True p3 14 150
True p4 19 100
Detail:
print(pd.Series(['p1','p2', 'p3', 'p4']))
0 p1
1 p2
2 p3
3 p4
dtype: object
print(pd.Series(['p1','p2', 'p3', 'p4'], index=capt))
True p1
False p2
True p3
True p4
dtype: object
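The printouts above show the two different sets of labels. When a dict of Series is passed together with index=, the DataFrame constructor aligns each Series on its labels instead of assigning values by position. A minimal sketch of one way around this, assuming the Series are allowed to keep their default RangeIndex: pass the raw values with to_numpy(), so there are no labels to align and the new index is simply attached in order.

```python
import pandas as pd

capt = [True, False, True, True]
nm = pd.Series(['p1', 'p2', 'p3', 'p4'])
ag = pd.Series([12, 17, 14, 19])
sc = pd.Series([120, 130, 150, 100])

# Plain arrays carry no index, so no label alignment takes place;
# the values are placed positionally under the index given by capt.
Cricket = pd.DataFrame(
    {"Name": nm.to_numpy(), "Age": ag.to_numpy(), "Score": sc.to_numpy()},
    index=capt,
)
print(Cricket)
```

Setting index=capt on each Series, as in the answer's first snippet, gives the same frame; this variant is only convenient when the Series already exist and should not be rebuilt.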
Boolean indexing, on the other hand, filters rows. Only the rows where the mask is True are kept:
capt=[True, False, True, True]
nm=pd.Series(['p1','p2', 'p3', 'p4'])
ag=pd.Series([12,17,14, 19])
sc=pd.Series([120, 130, 150, 100])
Cricket=pd.DataFrame({"Name":nm,"Age":ag ,"Score":sc})
print(Cricket)
Name Age Score
0 p1 12 120
1 p2 17 130
2 p3 14 150
3 p4 19 100
print(Cricket[capt])
Name Age Score
0 p1 12 120
2 p3 14 150
3 p4 19 100
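If the flags need to stay attached to the rows, one option is to store them as an ordinary column and filter on it. The sketch below assumes a column name "Captain", which is hypothetical and not part of the question. It also shows a mask computed from a condition and combined with a column selection through .loc.

```python
import pandas as pd

capt = [True, False, True, True]
Cricket = pd.DataFrame({
    "Name": ["p1", "p2", "p3", "p4"],
    "Age": [12, 17, 14, 19],
    "Score": [120, 130, 150, 100],
})

# Keep the flags as a regular column ("Captain" is an assumed name).
Cricket["Captain"] = capt

# A boolean column works as a mask: only rows where it is True are kept.
captains = Cricket[Cricket["Captain"]]
print(captains)

# A mask can also come from a comparison; .loc combines the row mask
# with a list of columns to return.
older = Cricket.loc[Cricket["Age"] > 13, ["Name", "Score"]]
print(older)
```

Keeping the flags in a column leaves the default unique RangeIndex in place, so later label-based lookups are not affected by duplicate True/False labels.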
Upvotes: 1