user2646234

Reputation: 332

How to do boolean algebra on missing values?

I want to replicate boolean NA values as they behave in R:

NA is a valid logical object. Where a component of x or y is NA, the result will be NA if the outcome is ambiguous. In other words NA & TRUE evaluates to NA, but NA & FALSE evaluates to FALSE. http://stat.ethz.ch/R-manual/R-devel/library/base/html/Logic.html

I have seen None recommended for missing values, but Python treats None as false in boolean expressions, so None or False evaluates to False. The result should instead be None, since no conclusion can be drawn when a value is missing.
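To make the problem concrete, here is a minimal check of how the built-in logical operators treat None. It uses only built-ins, and the `results` dictionary is just an illustrative name.

```python
# How Python's built-in logical operators treat None (no custom classes involved).
results = {
    "None or False": None or False,   # `or` returns its last operand when all are falsy
    "None and True": None and True,   # `and` stops at the first falsy operand
    "None or True": None or True,
    "not None": not None,             # R would give NA for !NA
    "bool(None)": bool(None),
}

for expr in sorted(results):
    print("%s -> %r" % (expr, results[expr]))
```

Only `None and True` and `None or True` happen to match R's rules; the other cases silently collapse the missing value into a definite True or False.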

How do I achieve this in Python?

EDIT: The accepted answer computes correctly with the bitwise operators (&, | and ^), but getting the same behavior from the logical operators not, or and and seems to require a change to the Python language itself.
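The keywords not, and and or cannot be overloaded: they only ask each operand for its truth value, so they can never return a third state. One workaround is to use plain functions instead. The following is a minimal sketch, assuming every input is exactly True, False or None (with None standing in for NA); the names `na_and`, `na_or` and `na_not` are made up for this illustration.

```python
def na_and(a, b):
    """Three-valued AND: a definite False wins, otherwise None (NA) propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def na_or(a, b):
    """Three-valued OR: a definite True wins, otherwise None (NA) propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def na_not(a):
    """Three-valued NOT: the negation of an unknown value is still unknown."""
    return None if a is None else not a


print(na_and(None, True))   # ambiguous, so the result stays missing
print(na_and(None, False))  # False regardless of the missing value
print(na_or(None, False))   # ambiguous again
```

Unlike the built-in keywords, these functions evaluate both arguments before deciding, so they do not short-circuit.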

Upvotes: 14

Views: 1512

Answers (3)

Chris Barker

Reputation: 2389

As others have said, you can define your own class.

class NA_(object):
    instance = None # Singleton (so `val is NA` will work)
    def __new__(cls):
        if NA_.instance is None:
            NA_.instance = super(NA_, cls).__new__(cls)
        return NA_.instance
    def __str__(self): return "NA"
    def __repr__(self): return "NA_()"
    def __and__(self, other):
        if self is other or other:
            return self
        else:
            return other
    __rand__ = __and__
    def __or__(self, other):
        if self is other or other:
            return other
        else:
            return self
    __ror__ = __or__
    def __xor__(self, other):
        return self
    __rxor__ = __xor__
    def __eq__(self, other):
        return self is other
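    # No __req__ exists: for ==, Python retries __eq__ with the operands swapped.
    def __nonzero__(self):  # Python 2 truth hook
        raise TypeError("bool(NA) is undefined.")
    __bool__ = __nonzero__  # same hook under its Python 3 name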
NA = NA_()

Use:

>>> print NA & NA
NA
>>> print NA & True
NA
>>> print NA & False
False
>>> print NA | True
True
>>> print NA | False
NA
>>> print NA | NA
NA
>>> print NA ^ True
NA
>>> print NA ^ NA
NA
>>> if NA: print 3
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 28, in __nonzero__
TypeError: bool(NA) is undefined.
>>> if NA & False: print 3
...
>>>
>>> if NA | True: print 3
...
3
>>>

Upvotes: 9

Blair

Reputation: 6693

You can do this by creating a class and overriding the methods behind the bitwise operators, such as __and__.

>>> class NA_type(object):
        def __and__(self,other):
                if other == True:
                        return self
                else:
                        return False
        def __str__(self):
                return 'NA'


>>> 
>>> NA = NA_type()
>>> print NA & True
NA
>>> print NA & False
False
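Two cases are not covered by the class above: `NA & NA` returns False (because `other == True` is false for an NA instance), and `True & NA` raises a TypeError because there is no `__rand__`. A hedged sketch of one way to cover both, using a hypothetical class name `NAType` rather than the answer's own code:

```python
class NAType(object):
    """Hypothetical variant of the class above that handles both operand orders."""

    def __and__(self, other):
        # NA & NA and NA & True are both ambiguous, so NA propagates.
        if isinstance(other, NAType) or other == True:
            return self
        # Anything else is treated as False, which decides the result.
        return False

    # Used when NA is the right-hand operand, e.g. True & NA.
    __rand__ = __and__

    def __str__(self):
        return 'NA'


NA = NAType()
print(NA & NA)
print(True & NA)
print(False & NA)
```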

Upvotes: 6

freakish

Reputation: 56467

You can define a custom class (perhaps a singleton) with a custom __and__ method, plus whatever other operator methods you need. See:

http://docs.python.org/2/reference/datamodel.html#emulating-numeric-types
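The same section of the data model also lists `__invert__`, which backs the `~` operator and can serve as a stand-in for negation. A minimal sketch, assuming a hypothetical `NAType` class that only implements this one method:

```python
class NAType(object):
    """Hypothetical missing-value marker; only negation is shown here."""

    def __invert__(self):
        # The negation of an unknown value is still unknown.
        return self

    def __repr__(self):
        return 'NA'


NA = NAType()
print(repr(~NA))

# Caveat: bool is a subclass of int, so ~ on a plain bool is integer inversion.
print(~True)
```

Because `~True` is -2 rather than False, `~` only behaves like a logical NOT on objects that define `__invert__` themselves.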

Upvotes: 0
