Reputation: 1247
I am a new user of R. I have a time series with four years of data (say, observations from different stations, a-f) at 12-hour intervals. I added the first column using
t<-seq(from=as.POSIXct("2009-9-25 18:00", tz="cet"),by="12 hours", length.out=NROW(obs))
obsf<-cbind(t,obs)
where 'obs' is the observation matrix. Below are the first four rows of the result. (I don't know why the 't' column appears as numeric and not as time stamps.)
t a b c d e f
[1,] 1253894400 108.6912 107.7886 107.1125 106.7521 106.7440 107.0581
[2,] 1253937600 109.1711 108.8854 108.6159 108.4135 108.2789 108.1683
[3,] 1253980800 104.1059 103.2223 102.5102 102.0592 101.9324 102.1317
[4,] 1254024000 104.7609 104.5823 104.3817 104.2230 104.1266 104.0673
I want to divide the data into yearly and monthly subsets for some analyses. I think there are many ways to do it, but I don't know which one is most convenient in this case. Can anybody help? I don't want to use any package; I would rather try basic R functions, as that would help me understand R better.
Upvotes: 0
Views: 605
Reputation: 41
Since you mention you are a new user of R and you're working with time series, I strongly recommend the free online book "A Little Book of R for Time Series", which briefly covers reading, plotting, and modelling time series data.
http://a-little-book-of-r-for-time-series.readthedocs.org/en/latest/
Upvotes: 2
Reputation: 3597
cbind and rbind return a matrix when given plain vectors, and a matrix can hold only one type, so all of their inputs are coerced to a common type, e.g.:
cbind(character=letters[1:5],numeric=seq(1:5))
Output:
character numeric
[1,] "a" "1"
[2,] "b" "2"
[3,] "c" "3"
[4,] "d" "4"
[5,] "e" "5"
Here the numeric inputs were converted to character, to match the class of column 1.
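A quick way to see the coercion for yourself is to check the type of the result. This is only a small sketch; the object names m and d are made up for illustration.

```r
# cbind on plain vectors builds a matrix, and a matrix has a single type
m <- cbind(character = letters[1:5], numeric = 1:5)
typeof(m)                    # "character": the numbers became strings
is.numeric(m[, "numeric"])   # FALSE

# a data.frame stores each column separately, so each keeps its own type
d <- data.frame(character = letters[1:5], numeric = 1:5,
                stringsAsFactors = FALSE)
sapply(d, class)
```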
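The same coercion happens to your time stamps: cbind drops the POSIXct class and keeps only the underlying number of seconds since 1970-01-01 UTC, whereas data.frame keeps the class of each column. Compare: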
Using cbind:
cbind(date=seq(from=as.POSIXct("2009-9-25 18:00", tz="cet"),by="12 hours", length.out=5),Variable=seq(1:5))
Output:
date Variable
[1,] 1253894400 1
[2,] 1253937600 2
[3,] 1253980800 3
[4,] 1254024000 4
[5,] 1254067200 5
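If you already have a matrix like this one, the time stamps are not lost: the numbers are seconds since 1970-01-01 UTC and can be converted back. The following is a minimal sketch, assuming the time zone is known; the names m, recovered and fixed are made up for illustration.

```r
# Rebuild the matrix from the example above (tz written as "CET")
stamps <- seq(from = as.POSIXct("2009-09-25 18:00", tz = "CET"),
              by = "12 hours", length.out = 5)
m <- cbind(date = stamps, Variable = 1:5)

# Convert the numeric column back to POSIXct
recovered <- as.POSIXct(m[, "date"], origin = "1970-01-01", tz = "CET")

# Store the result in a data.frame so the class is kept this time
fixed <- data.frame(date = recovered, Variable = m[, "Variable"])
str(fixed)
```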
Using data.frame:
data.frame(date=seq(from=as.POSIXct("2009-9-25 18:00", tz="cet"),by="12 hours", length.out=5),Variable=seq(1:5))
Output:
date Variable
1 2009-09-25 18:00:00 1
2 2009-09-26 06:00:00 2
3 2009-09-26 18:00:00 3
4 2009-09-27 06:00:00 4
5 2009-09-27 18:00:00 5
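Once the time stamps are kept as POSIXct in a data frame, base R alone should be enough to cut the data into yearly and monthly pieces, which is what the question asks for. The sketch below is one possible approach, not the only one: it builds grouping keys with format() and passes them to split(). The names df, by_year and by_month are made up for illustration, and the random values only stand in for the real observations.

```r
# Stand-in data: 200 half-daily time stamps and two made-up station columns
set.seed(1)
n  <- 200
df <- data.frame(
  t = seq(from = as.POSIXct("2009-09-25 18:00", tz = "CET"),
          by = "12 hours", length.out = n),
  a = rnorm(n, mean = 105),
  b = rnorm(n, mean = 105)
)

# Grouping keys derived from the time stamps
year_key  <- format(df$t, "%Y")      # e.g. "2009"
month_key <- format(df$t, "%Y-%m")   # e.g. "2009-09"

# split() returns a named list with one data frame per key
by_year  <- split(df, year_key)
by_month <- split(df, month_key)

names(by_month)
head(by_month[["2009-10"]])

# Per-month means without splitting first
monthly_mean <- aggregate(df[c("a", "b")], by = list(month = month_key),
                          FUN = mean)
monthly_mean
```

Each list element is an ordinary data frame, so by_year[["2009"]] or by_month[["2009-10"]] can be passed to any further analysis.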
You can use a time series package such as xts to subset by time frame.
Conversion from data.frame to xts: the time index goes into order.by and the rest of the data is passed as the first argument.
library(xts)  # provides xts() and endpoints()
test.df<-data.frame(date=seq(from=as.POSIXct("2009-9-25 18:00", tz="cet"),by="12 hours", length.out=200),Variable=seq(1:200))
test.xts<-xts(test.df[,-1],order.by=test.df[,1])
Subset: endpoints returns the row positions of the last observation in each period selected with on= (for example "days", "months" or "years"), so indexing with it keeps one row per period:
test.xts[endpoints(test.xts,on="years",k=1),]
[,1]
2009-12-31 17:00:00 195
2010-01-03 05:00:00 200
test.xts[endpoints(test.xts,on="months",k=1),]
[,1]
2009-09-30 18:00:00 11
2009-10-31 17:00:00 73
2009-11-30 17:00:00 133
2009-12-31 17:00:00 195
2010-01-03 05:00:00 200
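Since the question asks for base R, the same "last observation in each period" positions can also be found without xts. This is a rough sketch rather than a drop-in replacement for endpoints; the names dates, last_in_month and last_in_year are made up for illustration.

```r
# Same 200 time stamps as in the xts example (tz written as "CET")
dates <- seq(from = as.POSIXct("2009-09-25 18:00", tz = "CET"),
             by = "12 hours", length.out = 200)
test.df <- data.frame(date = dates, Variable = 1:200)

# One key per period; the last row of a period is the last time its key occurs
month_key <- format(dates, "%Y-%m")
year_key  <- format(dates, "%Y")

last_in_month <- which(!duplicated(month_key, fromLast = TRUE))
last_in_year  <- which(!duplicated(year_key,  fromLast = TRUE))

last_in_month              # 11 73 133 195 200, as in the xts output above
test.df[last_in_year, ]    # rows 195 and 200
```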
Upvotes: 1