rm167

Reputation: 1247

R: Dividing a time series into monthly, annual, and seasonal subsets

I am a new user of R. I have a time series with four years of data (observations from different stations, a-f) at 12-hour intervals. I added the first column using

 t<-seq(from=as.POSIXct("2009-9-25 18:00", tz="cet"),by="12 hours", length.out=NROW(obs))
 obsf<-cbind(t,obs)

where 'obs' is the observation matrix. The first four rows of the result are shown below. (I don't know why the 't' column appears as numbers and not as time stamps.)

              t        a        b         c       d        e        f              
[1,] 1253894400 108.6912 107.7886 107.1125 106.7521 106.7440 107.0581  
[2,] 1253937600 109.1711 108.8854 108.6159 108.4135 108.2789 108.1683  
[3,] 1253980800 104.1059 103.2223 102.5102 102.0592 101.9324 102.1317  
[4,] 1254024000 104.7609 104.5823 104.3817 104.2230 104.1266 104.0673  

I want to divide the data into yearly and monthly subsets for some analyses. I think there are many ways to do it, but I don't know which one is most convenient in this case. Can anybody help? I don't want to use any packages; I would like to try with base R functions, as that would help me understand R better.

Upvotes: 0

Views: 605

Answers (2)

spatton

Reputation: 41

Since you mention you are a new user of R and you're working with time series, I strongly recommend the free online book "A Little Book of R for Time Series", which briefly covers reading, plotting, and modelling time series data.

http://a-little-book-of-r-for-time-series.readthedocs.org/en/latest/

Upvotes: 2

Silence Dogood

Reputation: 3597

cbind and rbind return a matrix when given vectors or matrices, and a matrix can hold only one type, so all inputs are coerced to a common type. For example:

cbind(character=letters[1:5],numeric=1:5)

Output:

     character numeric
[1,] "a"       "1"    
[2,] "b"       "2"    
[3,] "c"       "3"    
[4,] "d"       "4"    
[5,] "e"       "5" 

Here the numeric inputs were converted to character so that every element of the matrix has the same type as column 1.

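The same thing happens with date-times: cbind drops the POSIXct class and keeps only the underlying number of seconds since 1970-01-01 UTC, whereas data.frame keeps the class of each column: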

Using cbind:

cbind(date=seq(from=as.POSIXct("2009-9-25 18:00", tz="cet"),by="12 hours", length.out=5),Variable=seq(1:5))

Output:

           date Variable
[1,] 1253894400        1
[2,] 1253937600        2
[3,] 1253980800        3
[4,] 1254024000        4
[5,] 1254067200        5

Using data.frame:

data.frame(date=seq(from=as.POSIXct("2009-9-25 18:00", tz="cet"),by="12 hours", length.out=5),Variable=seq(1:5))

Output:

                 date Variable
1 2009-09-25 18:00:00        1
2 2009-09-26 06:00:00        2
3 2009-09-26 18:00:00        3
4 2009-09-27 06:00:00        4
5 2009-09-27 18:00:00        5
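
Applied to the question's data, a minimal sketch could look like the following. The values in 'obs' are made-up stand-ins (the real observations are not shown in full), and 't_back' is a name introduced only for illustration.

```r
# Stand-in for the question's 'obs' matrix (hypothetical values)
set.seed(1)
obs <- matrix(rnorm(8 * 6, mean = 105), ncol = 6,
              dimnames = list(NULL, letters[1:6]))

t <- seq(from = as.POSIXct("2009-09-25 18:00", tz = "CET"),
         by = "12 hours", length.out = nrow(obs))

# data.frame() keeps the POSIXct class of 't'; cbind() would not
obsf <- data.frame(t = t, obs)
class(obsf$t)   # "POSIXct" "POSIXt"

# If only the numeric matrix from cbind() is available, the time stamps
# can be recovered from the seconds since 1970-01-01 UTC
m <- cbind(t, obs)
t_back <- as.POSIXct(m[, "t"], origin = "1970-01-01", tz = "CET")
identical(as.numeric(t_back), as.numeric(t))
```

The last line compares the recovered time stamps with the originals on their underlying numeric values.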

You can also use a time series package such as xts to subset by time period:

Conversion from data.frame to xts

The time index goes into order.by, and the rest of the data is passed as the first argument.

library(xts)
test.df<-data.frame(date=seq(from=as.POSIXct("2009-9-25 18:00", tz="cet"),by="12 hours", length.out=200),Variable=1:200)
test.xts<-xts(test.df[,-1],order.by=test.df[,1])

Subset

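endpoints returns the row indexes of the last observation in each period, where the period is chosen with on= (for example "days", "months" or "years"). Indexing the xts object with them returns those rows: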

test.xts[endpoints(test.xts,on="years",k=1),]
                    [,1]
2009-12-31 17:00:00  195
2010-01-03 05:00:00  200

test.xts[endpoints(test.xts,on="months",k=1),]
                    [,1]
2009-09-30 18:00:00   11
2009-10-31 17:00:00   73
2009-11-30 17:00:00  133
2009-12-31 17:00:00  195
2010-01-03 05:00:00  200
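
Since the question asks for base R only, the same grouping can be sketched without xts: format() derives a year or month label from each time stamp, and split() breaks the data frame apart by that label. This is one possible approach, not the only one. The season lookup assumes meteorological seasons (DJF, MAM, JJA, SON), and names such as by_year and by_month are introduced here for illustration. The data frame is rebuilt so the block runs by itself.

```r
# Rebuild the example data frame (same shape as test.df above)
test.df <- data.frame(
  date = seq(from = as.POSIXct("2009-09-25 18:00", tz = "CET"),
             by = "12 hours", length.out = 200),
  Variable = 1:200
)

# Year and year-month labels derived from the time stamps
yr <- format(test.df$date, "%Y")
ym <- format(test.df$date, "%Y-%m")

# Lists of data frames, one element per year / per month
by_year  <- split(test.df, yr)
by_month <- split(test.df, ym)

# Seasons: map the month number to a season label (assumed
# meteorological seasons; adjust the lookup for other definitions)
mon    <- as.integer(format(test.df$date, "%m"))
season <- c("DJF", "DJF", "MAM", "MAM", "MAM", "JJA",
            "JJA", "JJA", "SON", "SON", "SON", "DJF")[mon]
by_season <- split(test.df, season)

# Example analysis: monthly mean of the variable
monthly_mean <- tapply(test.df$Variable, ym, mean)
```

Each element is an ordinary data frame, for example by_month[["2009-10"]], so further analyses can be run per period with lapply or sapply.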

Upvotes: 1
