LeGeniusII

Reputation: 960

NA to dates with data.table

I have a data.table in which one date column contains only NA values:

require(data.table)
require(lubridate)
testDT <- data.table(dateA = c(NA,NA), dateB = c(ymd("20110101"),ymd("20100101")))
testDT
#       dateA      dateB
#    1:    NA 2011-01-01
#    2:    NA 2010-01-01

I would like to do the following operation: if dateA is NA, then use the same value as in dateB. I've tried the following command:

> testDT[is.na(dateA), dateA := dateB]
Warning message:
In `[.data.table`(testDT, is.na(dateA), `:=`(dateA, dateB)) :
  Coerced 'double' RHS to 'logical' to match the column's type; may have truncated precision. Either change the target column ['dateA'] to 'double' first (by creating a new 'double' vector length 2 (nrows of entire table) and assign that; i.e. 'replace' column), or coerce RHS to 'logical' (e.g. 1L, NA_[real|integer]_, as.*, etc) to make your intent clear and for speed. Or, set the column type correctly up front when you create the table and stick to it, please.

As you can see, there was a warning and the result is weird:

> testDT
   dateA      dateB
1:  TRUE 2011-01-01
2:  TRUE 2010-01-01

Why doesn't it work?

P.S. I know we can use:

> testDT[,dateA := ifelse(is.na(dateA), dateB, dateA)]
> testDT
   dateA      dateB
1: 14975 2011-01-01
2: 14610 2010-01-01
> testDT[,dateA := as.Date(dateA, origin = "1970-01-01")]
> testDT
        dateA      dateB
1: 2011-01-01 2011-01-01
2: 2010-01-01 2010-01-01

Upvotes: 1

Views: 403

Answers (2)

Emmanuel-Lin

Reputation: 1943

Since dateA contains only NAs, R has no other type to go on and creates it as a logical column.
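
To illustrate the point, here is a minimal base-R sketch. The variable names `untyped` and `typed` are only for illustration and are not from the question:

```r
# A bare NA is a logical constant, so a vector of only NAs is logical
untyped <- c(NA, NA)
class(untyped)   # "logical"

# Wrapping the same vector in as.Date() gives missing values of class Date
typed <- as.Date(c(NA, NA))
class(typed)     # "Date"
```

A column built from `typed` would already have the Date class, so there would be nothing to coerce later.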

If you add one element that isn't NA, the assignment works without the warning:

Your example with one more element

require(data.table)
require(lubridate)
testDT <- data.table(dateA = c(NA,NA, ymd("20110101")), dateB = c(ymd("20110101"),ymd("20100101"), ymd("20100101")))

testDT[is.na(dateA), dateA := dateB]

The result (dateA is numeric here rather than Date, because c() with a leading NA drops the Date class):

> testDT
   dateA      dateB
1: 14975 2011-01-01
2: 14610 2010-01-01
3: 14975 2010-01-01

So why do you have only NAs?

Upvotes: 0

Jaap

Reputation: 83235

You get that warning message because the dateA-column doesn't have the right class (as already mentioned by @Emmanuel-Lin):

> str(testDT)
Classes ‘data.table’ and 'data.frame':    2 obs. of  2 variables:
 $ dateA: logi  NA NA
 $ dateB: Date, format: "2011-01-01" "2010-01-01"
 - attr(*, ".internal.selfref")=<externalptr>

A possible solution is to convert the dateA-column to a date class first, with either as.Date or the built-in date functions of data.table:

# convert 'dateA'-column to 'Date'- class first
testDT[, dateA := as.Date(dateA)]   # alternatively: as.IDate(dateA)

# fill the 'NA' values in the 'dateA'-column
testDT[is.na(dateA), dateA := dateB][]

which gives:

> testDT
        dateA      dateB
1: 2011-01-01 2011-01-01
2: 2010-01-01 2010-01-01
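
The warning itself also suggests setting the column type correctly when the table is created. A minimal sketch of that, assuming base as.Date() in place of lubridate's ymd() so that only data.table is needed:

```r
library(data.table)

# Assumption: as.Date() replaces ymd() from the question to avoid the lubridate dependency
testDT <- data.table(
  dateA = as.Date(c(NA, NA)),                      # typed missing dates, not logical NA
  dateB = as.Date(c("2011-01-01", "2010-01-01"))
)

# Both columns are Date now, so the update needs no coercion
testDT[is.na(dateA), dateA := dateB]
class(testDT$dateA)   # "Date"
```

With the class set up front, the original one-line update can be used unchanged.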

Upvotes: 2
