Reputation: 3081
I have a rather basic question. Can someone explain why the first subset below works while the second does not, and why the Date data type matters?
library(data.table)
test.table <- data.table(Dates =
as.Date(c("2020-08-31", "2020-01-31", "2020-08-31", "2010-01-01")))
test.table[Dates == "2020-08-31"]
test.table[Dates %in% c("2020-08-31")]
Upvotes: 0
Views: 42
Reputation: 132706
This is not specific to data.table. The documentation in help("%in%") says this:
Factors, raw vectors and lists are converted to character vectors, and then x and table are coerced to a common type (the later of the two types in R's ordering, logical < integer < numeric < complex < character) before matching.
The common type between a Date variable and a character variable is "character". Since the documentation refers to types and not to classes, as.character.Date is not involved. I assume the internal double values of the Date variable (days since 1970-01-01) are coerced to character and then compared.
You should never rely on the automatic coercion for comparisons. Always use explicit coercion:
Dates %in% as.Date("2020-08-31")
Dates == as.Date("2020-08-31")
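To make the type-versus-class distinction concrete, here is a minimal base R sketch. The variable name d is introduced only for illustration. It shows what a Date holds internally and what a type-level coercion to character would produce. No output is claimed for the implicit Dates %in% "2020-08-31" comparison itself, since that depends on how match() handles classed objects.

```r
# Sketch in base R; `d` is a name introduced here for illustration.
d <- as.Date("2020-08-31")

class(d)    # "Date"
typeof(d)   # "double": the underlying storage type
unclass(d)  # 18505, the number of days since 1970-01-01

# Coercing the bare double to character (no Date method involved):
as.character(unclass(d))  # "18505", which is not "2020-08-31"

# Explicit coercion puts both sides on the same class:
d %in% as.Date("2020-08-31")  # TRUE
d == as.Date("2020-08-31")    # TRUE
```

With both operands converted to Date explicitly, the result does not depend on any coercion rule.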
Upvotes: 3
Reputation: 2096
Regarding
test.table[Dates %in% c("2020-08-31")]
c("2020-08-31") is a character vector, while test.table$Dates is of class Date. Therefore, they do not match when using %in%.
If you convert Dates to character, or c("2020-08-31") to Date, you will get a match either way:
test.table[as.character(Dates) %in% c("2020-08-31")]
test.table[Dates %in% as.Date("2020-08-31")]
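As a quick check that the two conversions select the same rows, the sketch below rebuilds the question's table and compares row positions. The names rows_chr and rows_date are made up for this example.

```r
library(data.table)

# Same table as in the question
test.table <- data.table(Dates =
  as.Date(c("2020-08-31", "2020-01-31", "2020-08-31", "2010-01-01")))

# Row positions selected by each explicit conversion
rows_chr  <- test.table[, which(as.character(Dates) %in% "2020-08-31")]
rows_date <- test.table[, which(Dates %in% as.Date("2020-08-31"))]

rows_chr                        # 1 3
identical(rows_chr, rows_date)  # TRUE
```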
Upvotes: 0