ALollz

Reputation: 59519

Why does Category dtype not handle != NaN comparisons correctly?

There seems to be inconsistent behavior in the != comparison depending upon whether the item belongs to the categories. If the value is in the categories, != NaN returns False, seemingly inconsistent with how a normal != NaN comparison would evaluate. When the value is not in the categories, the behavior is as expected.

import pandas as pd
import numpy as np

# Standard evaluation
'11' != np.NaN
#True

'A' != np.NaN
#True

s = pd.Series([np.NaN, '11']).astype('category')

s.ne('11')
#0    False   # <- What?
#1    False
#dtype: bool

s.ne('A')
#0    True
#1    True
#dtype: bool

# Without the category type the behavior is correct
pd.Series([np.NaN, '11']).ne('11')
#0     True
#1    False
#dtype: bool

Is this a bug, or is this for some reason the expected NaN behavior within categories? I am on pd.__version__ = 0.25.0, but the same behavior also appears on 1.0.

Upvotes: 0

Views: 204

Answers (1)

yatu

Reputation: 88226

The reason seems to be the way NaNs are treated when working with the category dtype. With Categorical data, values that are not included in the categories are replaced by NaN, i.e. a NaN is treated just like a non-existent category. We can check this by creating a Categorical as follows, explicitly specifying the existing categories:

c = pd.Categorical(values=['1','2',np.nan,'3','4'], categories=['1','2','3'])

print(c)
[1, 2, NaN, 3, NaN]
Categories (3, object): [1, 2, 3]
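As a small sanity check (a sketch of my own, not part of the original example), both the explicit NaN and the coerced '4' end up as missing values, while the categories themselves never contain NaN:

```python
import numpy as np
import pandas as pd

# '4' is not among the declared categories, so it is coerced to NaN
c = pd.Categorical(values=['1', '2', np.nan, '3', '4'],
                   categories=['1', '2', '3'])

print(list(c.categories))   # the categories never include NaN
print(c.isna().tolist())    # both the explicit NaN and the coerced '4' are missing
```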

And by checking the docs we see that:

Missing values should not be included in the Categorical's categories, only in the values. Instead, it is understood that NaN is different, and is always a possibility.

So missing values are considered to always be a possibility when compared to a value from an existing category, which is why the != comparison does not evaluate to True for them.

Using the above example, we can see the same behavior for the original missing value NaN and for '4', whose category does not exist and which was therefore coerced to NaN:

c != '3'
array([ True,  True, False, False, False])
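If the standard NaN semantics are needed for the comparison, one possible workaround (my own sketch, not taken from the answer) is to cast the series back to object dtype before comparing:

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, '11']).astype('category')

# Casting back to object restores the usual NaN != value behavior
result = s.astype(object).ne('11')
print(result.tolist())   # NaN != '11' is True again, '11' != '11' is False
```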

Upvotes: 2
