JohnGagliano
JohnGagliano

Reputation: 54

sqldf and POSIXct

Having issues wit sqldf and & POSIXct. Will appreciate any help

Sys.setenv(TZ = "America/Los_Angeles")
user <- read.csv("user_ori.csv",stringsAsFactors=FALSE)
user$created_on <- as.POSIXct(user$created_on,format="%Y-%m-%d %H:%M:%S",tz="America/Los_Angeles")

head(user$created_on)
[1] "2017-02-15 19:39:10 PST" "2016-05-11 22:44:58 PDT"
[3] "2017-02-06 19:25:48 PST" "2016-09-22 22:25:50 PDT"
[5] "2016-07-11 10:36:58 PDT" "2015-02-23 20:42:12 PST"

Not sure why I have a mix of PST and PDT, but moving on...

user.mindate <- sqldf("select min(u.created_on) as min_date
                from user u")
> user.mindate
    min_date
1 1380749081

> as.Date(user.mindate$min_date,tz="America/Los_Angeles")
[1] "3782332-08-02"

Any suggestions on how to convert min_date back to Date?

Thank you, John

Upvotes: 2

Views: 983

Answers (2)

r2evans
r2evans

Reputation: 160417

SQLite stores datetime-like objects as numeric, typically epoch seconds. This is unavoidable.

If you read ?as.POSIXct, you'll see

## S3 method for class 'numeric'
as.POSIXlt(x, tz = "", origin, ...)

and later

origin: a date-time object, or something which can be coerced by as.POSIXct(tz = "GMT") to such an object.

as.POSIXct(1380749081, origin="1970-01-01")
# [1] "2013-10-02 14:24:41 PDT"

Upvotes: 0

G. Grothendieck
G. Grothendieck

Reputation: 269491

The problem is that SQLite has no datetime type. R stores POSIXct objects as the number of seconds since 1970-01-01 (the Epoch) and when such an object is sent to SQLite since there is no corresponding type it is just sent as the raw number of seconds, i.e. a number. When it is sent back to R after processing it is still just a number.

There are a few ways to handle this:

1) column name Give the output column name that is to be POSIXct the same name as it had originally. sqldf has a heuristic in which it looks at the input and output column names and if any of them match then it assumes that the output should be converted to the same class that the input column of that name has.

sqldf("select min(u.created_on) as created_on from user u")
##            created_on
## 1 2017-02-15 19:39:10

2) method argument Explicitly tell sqldf what class to assign using the method argument:

sqldf("select min(u.created_on) as min_date__POISXt from user u", method = "POSIXct")
##              min_date
## 1 2017-02-15 19:39:10

3) H2 Rather than using the default SQLite backend use the H2 backend. H2 does support datetime classes so it works as expected. If you load the RH2 driver package for H2 then sqldf will notice it and use that database instead of SQLite. (Also the "sqldf.driver" option and the drv= argument to sqldf can be used to specify the backend -- see ?sqldf and the information on the sqldf github home page.)

library(RH2)

sqldf("select min(u.created_on) as min_date from user u")
##              min_date
## 1 2017-02-15 19:39:10

Upvotes: 3

Related Questions