Reputation: 39
I need some help understanding and working around some odd behavior associated with as.POSIXct
.
I'm working with three years of light data, collected at 15 minute intervals, from three different meters. In order to match this data up with another dataset, I'm trying to create interpolated light measurements at 5 minutes intervals using zoo.
Weird problems occur when I convert my dates and times with as.POSIXct
. The conversion works correctly with most of the data set, but consistently sets the time back one hour for three specific time periods: 03/13/11 2:00:00 - 2:45:00, 03/11/12 2:00:00 - 2:45:00, and 03/10/13 2:00:00 - 2:45:00. I've posted the results from the R console below. Can anyone explain this behavior and suggest a solution?
> Ldata <- read.csv("9813306_LI_AIRTEM_HENGILL2_SUM.csv")
> head(Ldata)
X. date time Temp_C Light_lux
1 1 10/29/10 15:30:00 2.195 3444.5
2 2 10/29/10 15:45:00 1.330 3100.0
3 3 10/29/10 16:00:00 1.330 3100.0
4 4 10/29/10 16:15:00 1.221 2927.8
5 5 10/29/10 16:30:00 1.221 2152.8
6 6 10/29/10 16:45:00 1.112 1463.9
> str(Ldata)
'data.frame': 88417 obs. of 5 variables:
$ X. : int 1 2 3 4 5 6 7 8 9 10 ...
$ date : Factor w/ 923 levels "1/1/11","1/1/13",..: 105 105 105 105 105 105 105 105 105 105 ...
$ time : Factor w/ 98 levels "0:00:00","0:15:00",..: 33 34 35 36 37 38 39 40 41 42 ...
$ Temp_C : num 2.19 1.33 1.33 1.22 1.22 ...
$ Light_lux: num 3444 3100 3100 2928 2153 ...
This is the section of data that gives me trouble
> Ldata[c(12908:12911, 44194:44197, 79136:79139),]
X. date time Temp_C Light_lux
12908 12908 3/13/11 2:00:00 -5.496 0
12909 12909 3/13/11 2:15:00 -5.249 0
12910 12910 3/13/11 2:30:00 -5.995 0
12911 12911 3/13/11 2:45:00 -6.246 0
44194 44197 3/11/12 2:00:00 0.674 0
44195 44198 3/11/12 2:15:00 0.563 0
44196 44199 3/11/12 2:30:00 0.453 0
44197 44200 3/11/12 2:45:00 0.343 0
79136 79139 3/10/13 2:00:00 -2.494 0
79137 79140 3/10/13 2:15:00 -2.610 0
79138 79141 3/10/13 2:30:00 -2.377 0
79139 79142 3/10/13 2:45:00 -2.610 0
> P_datetime <- as.POSIXct(paste(Ldata$date, Ldata$time), format = "%m/%d/%y %H:%M:%S")
> Ldata_m <- cbind(Ldata, P_datetime)
Notice how P_datetime
is 1 hour earlier than the time column.
> Ldata_m[duplicated(Ldata_m$P_datetime),]
X. date time Temp_C Light_lux P_datetime
12908 12908 3/13/11 2:00:00 -5.496 0 2011-03-13 01:00:00
12909 12909 3/13/11 2:15:00 -5.249 0 2011-03-13 01:15:00
12910 12910 3/13/11 2:30:00 -5.995 0 2011-03-13 01:30:00
12911 12911 3/13/11 2:45:00 -6.246 0 2011-03-13 01:45:00
44194 44197 3/11/12 2:00:00 0.674 0 2012-03-11 01:00:00
44195 44198 3/11/12 2:15:00 0.563 0 2012-03-11 01:15:00
44196 44199 3/11/12 2:30:00 0.453 0 2012-03-11 01:30:00
44197 44200 3/11/12 2:45:00 0.343 0 2012-03-11 01:45:00
79136 79139 3/10/13 2:00:00 -2.494 0 2013-03-10 01:00:00
79137 79140 3/10/13 2:15:00 -2.610 0 2013-03-10 01:15:00
79138 79141 3/10/13 2:30:00 -2.377 0 2013-03-10 01:30:00
79139 79142 3/10/13 2:45:00 -2.610 0 2013-03-10 01:45:00
Upvotes: 2
Views: 221
Reputation: 176718
This is caused by daylight saving time. You need to set the tz=
argument in your as.POSIXct
call to match the timezone the data were recorded in. For example, there's no issue if you set tz="UTC"
:
> as.POSIXct(paste(Ldata$date, Ldata$time), format="%m/%d/%y %H:%M:%S", tz="UTC")
# [1] "2011-03-13 02:00:00 UTC" "2011-03-13 02:15:00 UTC"
# [3] "2011-03-13 02:30:00 UTC" "2011-03-13 02:45:00 UTC"
# [5] "2012-03-11 02:00:00 UTC" "2012-03-11 02:15:00 UTC"
# [7] "2012-03-11 02:30:00 UTC" "2012-03-11 02:45:00 UTC"
# [9] "2013-03-10 02:00:00 UTC" "2013-03-10 02:15:00 UTC"
# [11] "2013-03-10 02:30:00 UTC" "2013-03-10 02:45:00 UTC"
Upvotes: 2