user2654270
user2654270

Reputation: 39

Weird conversion problems with POSIXct

I need some help understanding and working around some odd behavior associated with as.POSIXct.

I'm working with three years of light data, collected at 15 minute intervals, from three different meters. In order to match this data up with another dataset, I'm trying to create interpolated light measurements at 5 minutes intervals using zoo.

Weird problems occur when I convert my dates and times with as.POSIXct. The conversion works correctly with most of the data set, but consistently sets the time back one hour for three specific time periods: 03/13/11 2:00:00 - 2:45:00, 03/11/12 2:00:00 - 2:45:00, and 03/10/13 2:00:00 - 2:45:00. I've posted the results from the R console below. Can anyone explain this behavior and suggest a solution?

    > Ldata <- read.csv("9813306_LI_AIRTEM_HENGILL2_SUM.csv")

    > head(Ldata)

      X.     date     time Temp_C Light_lux
      1  1 10/29/10 15:30:00  2.195    3444.5
      2  2 10/29/10 15:45:00  1.330    3100.0
      3  3 10/29/10 16:00:00  1.330    3100.0
      4  4 10/29/10 16:15:00  1.221    2927.8
      5  5 10/29/10 16:30:00  1.221    2152.8
      6  6 10/29/10 16:45:00  1.112    1463.9

    > str(Ldata)
      'data.frame': 88417 obs. of  5 variables:
       $ X.       : int  1 2 3 4 5 6 7 8 9 10 ...
       $ date     : Factor w/ 923 levels "1/1/11","1/1/13",..: 105 105 105 105 105 105 105 105 105 105 ...
       $ time     : Factor w/ 98 levels "0:00:00","0:15:00",..: 33 34 35 36 37 38 39 40 41 42 ...
       $ Temp_C   : num  2.19 1.33 1.33 1.22 1.22 ...
       $ Light_lux: num  3444 3100 3100 2928 2153 ...

This is the section of data that gives me trouble

      > Ldata[c(12908:12911, 44194:44197, 79136:79139),]
              X.    date    time Temp_C Light_lux
      12908 12908 3/13/11 2:00:00 -5.496         0
      12909 12909 3/13/11 2:15:00 -5.249         0
      12910 12910 3/13/11 2:30:00 -5.995         0
      12911 12911 3/13/11 2:45:00 -6.246         0
      44194 44197 3/11/12 2:00:00  0.674         0
      44195 44198 3/11/12 2:15:00  0.563         0
      44196 44199 3/11/12 2:30:00  0.453         0
      44197 44200 3/11/12 2:45:00  0.343         0
      79136 79139 3/10/13 2:00:00 -2.494         0
      79137 79140 3/10/13 2:15:00 -2.610         0
      79138 79141 3/10/13 2:30:00 -2.377         0
      79139 79142 3/10/13 2:45:00 -2.610         0

> P_datetime <- as.POSIXct(paste(Ldata$date, Ldata$time), format = "%m/%d/%y %H:%M:%S")
> Ldata_m <- cbind(Ldata, P_datetime)

Notice how P_datetime is 1 hour earlier than the time column.

    > Ldata_m[duplicated(Ldata_m$P_datetime),]
     X.    date    time Temp_C Light_lux          P_datetime
    12908 12908 3/13/11 2:00:00 -5.496         0 2011-03-13 01:00:00
    12909 12909 3/13/11 2:15:00 -5.249         0 2011-03-13 01:15:00
    12910 12910 3/13/11 2:30:00 -5.995         0 2011-03-13 01:30:00
    12911 12911 3/13/11 2:45:00 -6.246         0 2011-03-13 01:45:00
    44194 44197 3/11/12 2:00:00  0.674         0 2012-03-11 01:00:00
    44195 44198 3/11/12 2:15:00  0.563         0 2012-03-11 01:15:00
    44196 44199 3/11/12 2:30:00  0.453         0 2012-03-11 01:30:00
    44197 44200 3/11/12 2:45:00  0.343         0 2012-03-11 01:45:00
    79136 79139 3/10/13 2:00:00 -2.494         0 2013-03-10 01:00:00
    79137 79140 3/10/13 2:15:00 -2.610         0 2013-03-10 01:15:00
    79138 79141 3/10/13 2:30:00 -2.377         0 2013-03-10 01:30:00
    79139 79142 3/10/13 2:45:00 -2.610         0 2013-03-10 01:45:00

Upvotes: 2

Views: 221

Answers (1)

Joshua Ulrich
Joshua Ulrich

Reputation: 176718

This is caused by daylight saving time. You need to set the tz= argument in your as.POSIXct call to match the timezone the data were recorded in. For example, there's no issue if you set tz="UTC":

> as.POSIXct(paste(Ldata$date, Ldata$time), format="%m/%d/%y %H:%M:%S", tz="UTC")
#  [1] "2011-03-13 02:00:00 UTC" "2011-03-13 02:15:00 UTC"
#  [3] "2011-03-13 02:30:00 UTC" "2011-03-13 02:45:00 UTC"
#  [5] "2012-03-11 02:00:00 UTC" "2012-03-11 02:15:00 UTC"
#  [7] "2012-03-11 02:30:00 UTC" "2012-03-11 02:45:00 UTC"
#  [9] "2013-03-10 02:00:00 UTC" "2013-03-10 02:15:00 UTC"
# [11] "2013-03-10 02:30:00 UTC" "2013-03-10 02:45:00 UTC"

Upvotes: 2

Related Questions