John legend2

Reputation: 920

failure to detect NA

I am trying to detect NA values and ran into the problem below.

library(dplyr)

dta = data.frame(group0 = c(1,1,1,2,2,2,3,3),
  date0 = as.Date(c("2018-09-01", "2018-09-02", "2018-09-03", "2018-09-04",
                    "2018-10-01", "2018-10-02", "2018-10-02", "2018-10-03")),
  type0 = c("A","A","B","A","B","B","B","B"))

With this data, I try to get the minimum date for each group, restricted to rows where type0 is "A":

    dta2 = dta %>% group_by(group0) %>% summarise(tmp_date0 = min(date0[type0 == "A"]))

Then, I have this

> dta2
# A tibble: 3 x 2
  group0 tmp_date0 
   <dbl> <date>    
1      1 2018-09-01
2      2 2018-09-04
3      3 NA        

When I run this

> is.na(dta2$tmp_date0)
[1] FALSE FALSE FALSE

Why is the third one FALSE?

Upvotes: 1

Views: 40

Answers (1)

akrun

Reputation: 887128

The value comes from Inf: in group 3 no row has type0 equal to "A", so the subset date0[type0 == "A"] is a zero-length vector, and min of a zero-length vector returns Inf (with a warning), just as it does for logical(0)

min(logical(0))
#[1] Inf

Because the column is of class Date, that Inf keeps the Date class and is only printed as NA; it is not a real NA

as.Date(Inf) 
# NA

dput(as.Date(Inf))
#structure(Inf, class = "Date")

as.Date(Inf) %>%
    is.na
#[1] FALSE

It is displayed as NA, but the stored value is still Inf, as dput shows

dput(dta2$tmp_date0)
#structure(c(17775, 17778, Inf), class = "Date")

A check with is.finite confirms it

is.finite(dta2$tmp_date0)
#[1]  TRUE  TRUE FALSE

To prevent min from acting on a zero-length vector, one option is to use an if/else condition

dta3 <- dta %>% 
    group_by(group0) %>%
    summarise(tmp_date0 = if(any(type0 == 'A')) min(date0[type0 == 'A']) else as.Date(NA))

Now is.na correctly picks up the missing value

is.na(dta3$tmp_date0)
#[1] FALSE FALSE  TRUE
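
The same guard can be wrapped in a small helper so the condition is not repeated. This is only a minimal base R sketch: safe_min_date is a hypothetical name, not a function from dplyr or the original answer, and it assumes the input is a Date vector.

```r
# Hypothetical helper (an assumption for illustration, not part of any package):
# returns a true NA Date for an empty input instead of letting min() return Inf.
safe_min_date <- function(x) {
  if (length(x) == 0) as.Date(NA) else min(x)
}

# Same shape as group 3 in the question: no rows with type "A".
dates <- as.Date(c("2018-10-02", "2018-10-03"))
types <- c("B", "B")

res_empty <- safe_min_date(dates[types == "A"])  # subset is zero-length
res_found <- safe_min_date(dates[types == "B"])  # subset has two dates

is.na(res_empty)  # TRUE: a real NA, so is.na detects it
is.na(res_found)  # FALSE
```

Inside summarise this would be called as safe_min_date(date0[type0 == "A"]), under the same assumptions.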

Upvotes: 2
