Reputation: 457
I have two columns like this:
NaN 35078
33515 33515
NaN 123841
NaN 124074
NaN 5699
5700 5703
NaN 7506
NaN 9142
9100 9146
NaN 10736
14557 14557
NaN 17229
What I want to do is to compare both columns and get another new column with True or False. The condition is that the values of each column for each row should be different, but without the NaN values.
I should get something like this:
False
False
False
False
False
True
False
False
True
False
False
False
What I have is df.loc[:, 'column1'] != df.loc[:, 'column2']
, but the NaN values are not ignored. Thanks!
Upvotes: 2
Views: 773
Reputation: 150805
If I understand correctly, (None < None)
is False
so straight comparison works:
df = pd.DataFrame({0: {0: None, 1: 33515.0, 2: None, 3: None,
4: None, 5: 5700.0, 6: None, 7: None,
8: 9100.0, 9: None, 10: 14557.0, 11: None},
1: {0: 35078, 1: 33515, 2: 123841, 3: 124074,
4: 5699, 5: 5703, 6: 7506, 7: 9142, 8: 9146,
9: 10736, 10: 14557, 11: 17229}})
df[0] < df[1]
yields:
0 False
1 False
2 False
3 False
4 False
5 True
6 False
7 False
8 True
9 False
10 False
11 False
dtype: bool
Upvotes: 0
Reputation: 863521
Chain another boolean mask with &
for bitwise AND
with Series.notna
:
mask = (df['column1'] != df['column2']) & (df['column1'].notna())
print (mask)
0 False
1 False
2 False
3 False
4 False
5 True
6 False
7 False
8 True
9 False
10 False
11 False
dtype: bool
If need test both columns:
mask = (df['column1'] != df['column2']) & (df[['column1', 'column1']].notna().all(axis=1))
Upvotes: 5