CODEWITHSUNDEEP

Reputation: 1865

How do NAs behave in R?

The behaviour of NA in R is often confusing. For example, NA + 1 returns NA, but NA^0 returns 1. This post has two objectives:

To answer: why does NA behave like this?
And, more personally, to share with a wider audience what I have come to understand over time, and to validate that understanding.

Upvotes: 2

Views: 65

Answers (2)

Lajos Arpad

Reputation: 76454

Let's view this philosophically. NA is the missing value indicator: it means that the value is not known. If you do not know a number x, then x + 1 is unknown as well.

However, for any number x, x^0 = 1, so the output can be computed without knowing x at all.

Upvotes: 2

Reputation: 1865

The documentation defines NA as:

NA is a logical constant of length 1 which contains the missing value indicator. NA can be coerced to any other vector type except raw. There are also constants NA_integer_, NA_real_, NA_complex_ and NA_character_ of the other atomic vector types which support missing values

In simple words, it just tells you that a value is missing. The type of NA is logical, which you can ascertain with:

typeof(NA)
#> [1] "logical"

^{Created on 2020-06-19 by the reprex package (v0.3.0)}

Note that this does not mean the missing value itself is of type logical. It could be of any type; after all, it's missing.
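As a small sketch of that point (the vectors here are made up for illustration), the typed NA constants each report their own type, and a plain NA placed in a vector takes on the type of its neighbours:

```r
# Each typed NA constant reports its own storage type
typeof(NA_integer_)
#> [1] "integer"
typeof(NA_real_)
#> [1] "double"
typeof(NA_character_)
#> [1] "character"

# Inside a vector, the logical NA is coerced to match the other elements
typeof(c(NA, 1L))
#> [1] "integer"
typeof(c(NA, "a"))
#> [1] "character"
```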

Computations with NA usually return NA, with a few exceptions.

Maths with NA

Numeric calculations with NA usually return NA.

NA + 1
#> [1] NA
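The same propagation shows up in summary functions. A minimal illustration, assuming a made-up vector x with one missing entry; na.rm = TRUE is the usual way to drop the missing values before computing:

```r
x <- c(1, 2, NA)  # hypothetical data with one missing value

# One NA is enough to make the whole result unknown
sum(x)
#> [1] NA
mean(x)
#> [1] NA

# Dropping the missing values first gives a number again
sum(x, na.rm = TRUE)
#> [1] 3
mean(x, na.rm = TRUE)
#> [1] 1.5
```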


But NA^0 does not:

NA^0
#> [1] 1
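A related case, offered as a sketch rather than a full list of exceptions: 1 raised to a missing power is also 1, while NA * 0 stays NA. One plausible reading of the latter is that the missing value might be Inf, and Inf * 0 is NaN rather than 0.

```r
# 1 raised to any power is 1, so the missing exponent does not matter
1^NA
#> [1] 1

# Multiplying by zero is not value-independent for doubles
NA * 0
#> [1] NA
Inf * 0
#> [1] NaN
```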


To understand this better, remember that everything that happens in R is a function call. So you could rewrite the above as follows.

`^`(NA,0)
#> [1] 1
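The prefix form works for any operator. A short sketch (the exponents 0, 1 and 2 are arbitrary choices for illustration) showing that the infix and prefix forms agree, and that only the zero exponent escapes NA:

```r
# Addition written as an ordinary function call
`+`(NA, 1)
#> [1] NA

# Infix and prefix forms give the same result
identical(NA^0, `^`(NA, 0))
#> [1] TRUE

# Only the exponent 0 yields a known value
sapply(c(0, 1, 2), function(p) `^`(NA, p))
#> [1]  1 NA NA
```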

Base functions generally attempt to coerce one atomic vector to the type of the other when the two are of different types. One example is the following, in which the left-hand side 1, which is of numeric type, gets coerced to character type.

1 == "1"
#> [1] TRUE
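To make the direction of the coercion visible, here is a small sketch with made-up values. Combining types moves towards the more general one (logical, then integer, then double, then character), and the comparison with "1.0" is FALSE because it is carried out on strings:

```r
# Coercion moves towards the more general type
typeof(c(TRUE, 1L))
#> [1] "integer"
typeof(c(1L, 2.5))
#> [1] "double"
typeof(c(2.5, "a"))
#> [1] "character"

# 1 becomes the string "1", which is not the string "1.0"
1 == "1.0"
#> [1] FALSE
```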


When this happens, the logical NA gets coerced to a numeric missing value such as NA_integer_ or NA_real_, and then the identity applies: any number raised to the power zero is 1. As expected, trying the same with NA_character_ throws an error.

NA_integer_^0
#> [1] 1
NA_real_^0
#> [1] 1
NA_complex_^0
#> [1] 1+0i
NA_character_^0
#> Error in NA_character_^0: non-numeric argument to binary operator
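The coercion can be observed through the type of the result. A minimal check, assuming nothing beyond base R: the logical NA follows the type of the other operand, and the result of ^ is a double.

```r
# The result type follows the non-missing operand
typeof(NA + 1L)
#> [1] "integer"
typeof(NA + 1)
#> [1] "double"

# Exponentiation returns a double
typeof(NA^0)
#> [1] "double"
```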


Logic with NA

We often want to check whether a particular variable (say x) is NA, and intuitively we run the following, expecting it to return either TRUE or FALSE.

x <- 5  # example value, so that the snippet runs on its own
x == NA
#> [1] NA


This might seem unexpected, but it makes sense: NA stands for an unknown value, so whatever x holds (a vector, an integer, another NA), whether it equals that unknown value cannot be determined. The next obvious question is what happens if we compare NA == NA. That too results in NA: two missing values might be equal or might differ, so the result is still ambiguous. The same goes for other comparison operators such as >, < and !=, and usually for the logical operators &, |, && and ||. To check whether a value is NA, use is.na().

NA == NA
#> [1] NA
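A short sketch of the difference between == and is.na(), using a made-up vector x. anyNA() and which() are base R helpers that are often handy alongside it:

```r
x <- c(1, NA, 3)  # hypothetical data with one missing value

# Comparing with NA gives NA for every element
x == NA
#> [1] NA NA NA

# is.na() answers the question that was actually meant
is.na(x)
#> [1] FALSE  TRUE FALSE

# Is there any missing value, and where?
anyNA(x)
#> [1] TRUE
which(is.na(x))
#> [1] 2
```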


But for certain identities, which are TRUE (or FALSE) irrespective of the missing value, the result is TRUE (or FALSE). For example:

The result of the expression a AND b is TRUE if and only if both a and b are TRUE. In all other cases it's FALSE. Since one operand is already FALSE, the missing operand cannot change the result.

FALSE && NA
#> [1] FALSE


The result of the expression a OR b is FALSE if and only if both a and b are FALSE. In all other cases it's TRUE. Since one operand is already TRUE, the result is TRUE whatever the missing operand is.

TRUE || NA
#> [1] TRUE
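Putting both rules side by side, as a sketch: the result is known only when the non-missing operand already decides it, and any() and all() follow the same logic over a vector.

```r
# AND: a FALSE operand decides the result, a TRUE operand does not
NA & FALSE
#> [1] FALSE
NA & TRUE
#> [1] NA

# OR: a TRUE operand decides the result, a FALSE operand does not
NA | TRUE
#> [1] TRUE
NA | FALSE
#> [1] NA

# any() and all() behave the same way across a vector
any(c(NA, TRUE))
#> [1] TRUE
all(c(NA, FALSE))
#> [1] FALSE
any(c(NA, FALSE))
#> [1] NA
```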


Upvotes: 2
