Why does is.na() change its argument?

Question

I just discovered the following behaviour of the is.na() function which I don't understand:

df <- data.frame(a = 5:1, b = "text")
df
##   a    b
## 1 5 text
## 2 4 text
## 3 3 text
## 4 2 text
## 5 1 text
is.na(df)
##          a     b
## [1,] FALSE FALSE
## [2,] FALSE FALSE
## [3,] FALSE FALSE
## [4,] FALSE FALSE
## [5,] FALSE FALSE
is.na(df) <- "0"
df
##   a    b  0
## 1 5 text NA
## 2 4 text NA
## 3 3 text NA
## 4 2 text NA
## 5 1 text NA

My question
Why does is.na() change its argument (and in this case add an extra column to the data frame)? Its behaviour seems especially puzzling (or at least unexpected) here because the result of the query is FALSE for all instances.

NB
This question is not about subsetting and changing the NA values in a data frame; I know how to do that (df[is.na(df)] <- "0"). This question is about the behaviour of the is.na function! Why does assigning to an is.something function change the argument itself? This is unexpected.

Axeman · Accepted Answer

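The actual function being used here is not is.na() but the replacement (assignment) function `is.na<-`. R evaluates a call of the form is.na(df) <- "0" as df <- `is.na<-`(df, value = "0"), so df is reassigned by design. Data frames have no method of their own for `is.na<-`, so the default method `is.na<-.default` is dispatched. Printing that function to the console, we see: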

function (x, value) 
{
    x[value] <- NA
    x
}

So clearly, value is supposed to be an index here. If you index a data.frame like df["0"], it will try to select the column named "0". If you assign something to df["0"], the column will be created and filled with (in this case) NA.
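
The equivalence described above can be checked with a minimal sketch. The names df1 and df2 are illustrative only; the sketch assumes the default method is the one dispatched for data frames, as stated in this answer.

```r
df <- data.frame(a = 5:1, b = "text")

# What `is.na(df) <- "0"` amounts to: an explicit call to the
# replacement function, whose result would be assigned back to df.
df1 <- `is.na<-`(df, value = "0")

# What the default method does internally: index by "0" and assign NA.
# On a data frame, a character index names a column, so a new column
# called "0" is created.
df2 <- df
df2["0"] <- NA

names(df1)           # column names after the replacement call
identical(df1, df2)  # compare the two results
```

Both routes should yield the same data frame, with a third column named "0" filled with NA, matching the output shown in the question.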

To clarify, `is.na<-` sets the values at the given index to NA; it does not replace NA values with something else.
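
As a small illustration of that distinction, here is a sketch on a plain numeric vector (the vector x is made up for this example):

```r
x <- c(10, 20, 30, 40)

# `is.na<-` takes an index on the right-hand side and sets those
# elements to NA.
is.na(x) <- c(2, 4)
x
## [1] 10 NA 30 NA

# Replacing NA values with something else is ordinary subassignment,
# using the logical vector returned by is.na() as the index.
x[is.na(x)] <- 0
x
## [1] 10  0 30  0
```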
