Reputation: 27
I want to create a new column in a pandas dataframe that evaluates to either True, False, or NaN depending on values found in two other columns in the same dataframe, which also only contain either True, False, or NaN values. Specifically, as shown below, row values in the new column should be: (a) True if either of the reference column values is True, (b) False if both reference column values are False or if they are a combination of False and NaN, or (c) NaN if both reference column values are NaN.
col_A + col_B => new_col (desired values)
Initially, as shown below, I tried assigning values to the new column where either col_A or col_B were True, but understandably it returned False whenever one or both columns contained NaN.
df[new_col] = df[col_A] | df[col_B]
I also figured out how to create a column that returns True for rows where both col_A and col_B are NaN, but I'm still struggling to figure out the next step.
df[new_col] = pd.isnull(df[col_A]) & pd.isnull(df[col_B])
I feel like I could eventually get the correct output using a for loop/if statements, but my understanding is that this would be super inefficient and it seems like there should be a more efficient/straightforward way to get the result I'm looking for.
Upvotes: 1
Views: 2372
Reputation: 3184
Use .any
Set up the dataframe:
dict = {
'col_A': [True, True, True, False, False, False, np.NaN, np.NaN, np.NaN],
'col_B': [True, False, np.NaN, False, True, np.NaN, np.NaN,True, False]
}
df = pd.DataFrame(dict)
print(df)
col_A col_B
0 True True
1 True False
2 True NaN
3 False False
4 False True
5 False NaN
6 NaN NaN
7 NaN True
8 NaN False
Create a new column and set all values to false. We will fill in the other values.
df['new_col'] = False
Use .any() to fill in the remaining True and NaN values.
df.loc[df[['col_A', 'col_B']].any(1), 'new_col'] = True
print(df)
col_A col_B new_col
0 True True True
1 True False True
2 True NaN True
3 False False False
4 False True True
5 False NaN False
6 NaN NaN False
7 NaN True True
8 NaN False False
Upvotes: 1