Faraz Ali Khan

Reputation: 93

Create a pandas DataFrame column depending on whether a value is null

I have a data science project about a course that students took in 2016. One column shows the date on which each student upgraded their course; if the course was not upgraded, the value is null. I want to create a new DataFrame consisting of only this upgraded column, with each value replaced by "Yes" or "No". The code below works, except that I get the following warning: "SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame." I have included a sample dataset, the code, and the output I got. If someone could show me a more efficient way, with an explanation, that would be great.

import pandas as pd

registration = pd.DataFrame({'upgraded':['2016-08-12 19:42:07+00:00', '2016-08-14 11:51:21+00:00',
    '2016-07-22 17:24:59+00:00', None, None, '2016-07-12 10:33:02+00:00']})

upgraded_1 = registration[['upgraded']]
for i in range(len(upgraded_1['upgraded'])):
    if pd.isnull(upgraded_1['upgraded'][i]):
        upgraded_1['upgraded'][i] = "No"
    else:
        upgraded_1['upgraded'][i] = "Yes"

Output:

 upgraded_1
    0   Yes
    1   Yes
    2   Yes
    3   No
    4   No
    5   Yes

Upvotes: 3

Views: 667

Answers (1)

timgeb

Reputation: 78690

You can achieve this with the DataFrame's isna method and numpy.where (think of numpy.where as a vectorized if-then-else).

>>> import numpy as np
>>> pd.DataFrame(np.where(registration.isna(), 'No', 'Yes'))
     0
0  Yes
1  Yes
2  Yes
3   No
4   No
5  Yes
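
As for the warning itself: the loop in the question assigns through chained indexing (upgraded_1['upgraded'][i] = ...) on a frame that was sliced out of registration, so pandas cannot tell whether the write lands on a view or on a copy. Building a new frame in one step sidesteps that. The following is only a sketch of one way to do it while keeping the column name; the names upgraded_1 and upgraded_2 are illustrative, and the second variant is an alternative I am adding, not something the question requires.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
registration = pd.DataFrame({'upgraded': ['2016-08-12 19:42:07+00:00', '2016-08-14 11:51:21+00:00',
    '2016-07-22 17:24:59+00:00', None, None, '2016-07-12 10:33:02+00:00']})

# Build a brand-new frame rather than writing element by element into a
# slice of `registration`, so no chained assignment takes place.
upgraded_1 = pd.DataFrame(
    {'upgraded': np.where(registration['upgraded'].isna(), 'No', 'Yes')},
    index=registration.index,
)

# pandas-only alternative: map the boolean mask to labels, then turn the
# resulting Series back into a single-column frame.
upgraded_2 = registration['upgraded'].notna().map({True: 'Yes', False: 'No'}).to_frame()
```

Both variants work on the whole column at once instead of looping over rows, which is also where the efficiency gain comes from.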

Upvotes: 3
