Reputation: 93
I have a data science project about a course that students took in 2016. One column shows the date on which each student upgraded their course; if the course was not upgraded, the value is null. I want to create a new DataFrame containing only this column, with the values replaced by "Yes" or "No". The following code works, except that I get this warning: "SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame." Below are a sample dataset, the code, and the output I got. If someone can suggest a more efficient way, with an explanation, that would be great.
import pandas as pd
registration = pd.DataFrame({'upgraded':['2016-08-12 19:42:07+00:00', '2016-08-14 11:51:21+00:00',
                                         '2016-07-22 17:24:59+00:00', None, None, '2016-07-12 10:33:02+00:00']})
upgraded_1 = registration[['upgraded']]
for i in range(len(upgraded_1['upgraded'])):
    if pd.isnull(upgraded_1['upgraded'][i]):
        upgraded_1['upgraded'][i] = "No"
    else:
        upgraded_1['upgraded'][i] = "Yes"
Output:
upgraded_1
0 Yes
1 Yes
2 Yes
3 No
4 No
5 Yes
Upvotes: 3
Views: 667
Reputation: 78690
You can achieve this with the isna method and numpy.where (think of it as numpy.if_then_else).
>>> import numpy as np
>>> pd.DataFrame(np.where(registration.isna(), 'No', 'Yes'))
0
0 Yes
1 Yes
2 Yes
3 No
4 No
5 Yes
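As for the warning itself: it most likely comes from the chained indexing in upgraded_1['upgraded'][i] = ..., which writes into an object pandas cannot guarantee is not a copy. Building the new frame in one step avoids that. Below is a minimal sketch, assuming you want to keep the column name upgraded (the one-liner above yields a column named 0); the name upgraded_2 is only illustrative and not from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
registration = pd.DataFrame({
    'upgraded': ['2016-08-12 19:42:07+00:00', '2016-08-14 11:51:21+00:00',
                 '2016-07-22 17:24:59+00:00', None, None,
                 '2016-07-12 10:33:02+00:00']
})

# Build a brand-new frame rather than assigning into a slice of
# `registration`, so no chained assignment takes place.
upgraded_1 = pd.DataFrame(
    {'upgraded': np.where(registration['upgraded'].notna(), 'Yes', 'No')},
    index=registration.index,
)

# pandas-only alternative (illustrative name): map the boolean mask to labels.
upgraded_2 = (
    registration['upgraded']
    .notna()
    .map({True: 'Yes', False: 'No'})
    .to_frame()
)

print(upgraded_1)
```

Both versions are vectorized, so they should also be faster than the row-by-row loop on a large dataset, and neither modifies registration.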
Upvotes: 3