Reputation: 1875
I am trying to select rows whose date is earlier than a given date. It doesn't seem to be working: I am getting all the date values back, not just those before the cutoff.
Here's the df structure:
str(sawdf)
'data.frame': 83597 obs. of 10 variables:
$ actiondate : Date, format: "2016-05-08" "2016-05-08" "2016-05-09" ...
And here's some sample data:
head(sawdf)
actiondate
2016-05-14
2016-05-15
2016-05-16
2016-05-17
2016-05-18
And here is my sql:
sqldf("select distinct actiondate from sawdf where actiondate < '2016-05-18'")
And here's some of the results:
...
6 2016-05-13
7 2016-05-14
8 2016-05-15
9 2016-05-16
10 2016-05-17
11 2016-05-18
12 2016-05-19
As you can see, dates on and after 2016-05-18 are being selected.
I've tried several approaches but am getting the same results.
Thanks
Upvotes: 0
Views: 1201
Reputation: 269491
1) sqlite Assuming you are using the default SQLite backend, SQLite does not have a date type, so the dates are transferred to SQLite as the number of days since the UNIX Epoch. That is, on the SQLite side actiondate is a column of numbers. (If x were a "Date" class R variable then as.numeric(x) gives the number(s) transferred to SQLite.) We need to compare these numbers to an appropriate number, not to a character string. This would work as it also converts the comparison date in the same way (i.e. it replaces $date0 with 16939, which is the number of days since the UNIX Epoch represented by that date):
library(sqldf)
date0 <- as.Date("2016-05-18")
fn$sqldf("select distinct actiondate from sawdf where actiondate < $date0")
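To make the number concrete, here is a small base R sketch of the conversion described above. It does not use sqldf; the sprintf call is only an illustration of what the query text looks like once the date has been replaced by its day count, not a description of how fn$ works internally.

```r
date0 <- as.Date("2016-05-18")

# Number of days since 1970-01-01; this is the value SQLite sees for a Date
days <- as.numeric(date0)
days

# Round trip back to a Date to confirm the interpretation
as.Date(days, origin = "1970-01-01")

# Illustration only: the query text with the day count in place of the date
sql <- sprintf("select distinct actiondate from sawdf where actiondate < %d",
               as.integer(days))
sql
```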
There is more information on date processing in sqldf with SQLite on the sqldf home page on GitHub: https://github.com/ggrothendieck/sqldf
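As a side note on why the query in the question returns every row: in SQLite's comparison rules, any INTEGER or REAL value sorts before any TEXT value, so a day count compared with a string that is not a number is always "less than" it. A minimal sketch using literals only (the column names num_lt_text and big_num_lt_text are made up for this example):

```r
library(sqldf)

# Literals only: a number on the left, a text value on the right.
# SQLite orders numeric values before text values, so both comparisons
# are true regardless of how large the number is.
res <- sqldf("select 16939 < '2016-05-18' as num_lt_text,
                     99999 < '2016-05-18' as big_num_lt_text")
res
```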
1a) This would also work since all dates get transferred in the same way:
library(sqldf)
Date0 <- data.frame(date0 = as.Date("2016-05-18"))
sqldf("select distinct actiondate from sawdf where actiondate < (select date0 from Date0)")
1b) Although it is a bit messy, rather than converting the comparison date to a number, one could convert the actiondate column to character using a built-in SQLite function:
sqldf("select distinct actiondate from sawdf
where strftime('%Y-%m-%d', actiondate * 3600 * 24, 'unixepoch') < '2016-05-18'")
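The arithmetic in 1b can be checked on the R side. This base R sketch mirrors the SQL expression (day count times 3600 * 24 gives seconds since the epoch, which is then formatted as a date string in UTC) and shows that ISO-formatted date strings compare in calendar order. It is an illustration of the idea, not part of the sqldf call.

```r
# Day count that SQLite receives for the Date 2016-05-18
days <- as.numeric(as.Date("2016-05-18"))

# Same arithmetic as in the SQL: seconds since the UNIX Epoch
secs <- days * 3600 * 24

# Format as yyyy-mm-dd in UTC, analogous to strftime(..., 'unixepoch')
iso <- format(as.POSIXct(secs, origin = "1970-01-01", tz = "UTC"), "%Y-%m-%d")
iso

# yyyy-mm-dd strings sort in calendar order, so comparing them as text is safe
earlier <- "2016-05-17" < "2016-05-18"
same    <- "2016-05-18" < "2016-05-18"
c(earlier, same)
```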
2) H2 Alternatively, use the H2 backend, which does have a date type. In that case the code in the question does work. Install RH2 (which includes H2) and make sure you have Java installed on your machine. Then:
library(RH2)
library(sqldf)
sqldf("select distinct actiondate from sawdf where actiondate < '2016-05-18'")
Note: The input we assumed, in reproducible form, is:
Lines <- "actiondate
2016-05-14
2016-05-15
2016-05-16
2016-05-17
2016-05-18"
sawdf <- read.csv(text = Lines)
sawdf$actiondate <- as.Date(sawdf$actiondate)
Upvotes: 1
Reputation: 156
I can't comment yet, but @Gregor has a great solution. If you are set on using SQL, though, you can first convert the dates to character strings (since SQLite doesn't have a date type):
library(sqldf)
sawdf <- data.frame(actiondate = as.Date(c("2016-05-14", "2016-05-15", "2016-05-30")))
# store the dates as yyyy-mm-dd strings so SQLite compares them as text
sawdf$actiondate <- as.character(sawdf$actiondate)
str(sawdf)
sqldf("select actiondate
from sawdf where substr(actiondate,1,4)||substr(actiondate,6,2)||substr(actiondate,9,2) < '20160520'")
actiondate
1 2016-05-14
2 2016-05-15
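Once the column holds yyyy-mm-dd strings, the substr concatenation is probably not needed, since such strings already compare in calendar order as text. A possible simplification, assuming the same three sample dates as above:

```r
library(sqldf)

# Sample data with the dates already stored as yyyy-mm-dd character strings
sawdf <- data.frame(
  actiondate = as.character(as.Date(c("2016-05-14", "2016-05-15", "2016-05-30"))),
  stringsAsFactors = FALSE
)

# Text-to-text comparison; both sides use the same yyyy-mm-dd layout
res <- sqldf("select actiondate from sawdf where actiondate < '2016-05-20'")
res
```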
Upvotes: 1