Reputation: 399
I am loading time-series data into R for analysis. I am trying to lag one of the variables in order to difference the series. Unfortunately, the values of the differenced variable all equal 0, because R did not actually lag the weight variable. I know I am supposed to use as.ts(data$date) to specify that the "date" variable is a time series, but every time I do so it converts the "date" variable to numeric values. On top of that, I thought I had already specified that the "date" column was a date variable when I initially loaded the data. How can I specify the data.frame as a time series?
data=read.csv("filelocation",header=T,colClasses=c("Date","numeric"))
date weight
2010-10-04 52495
2010-10-01 53000
2010-09-30 52916
2010-09-29 52785
2010-09-28 53348
2010-09-27 52885
2010-09-24 52174
2010-09-23 51461
2010-09-22 51286
2010-09-21 50968
2010-09-20 49250
data=data[order(data$date),]
diffweight1=weight-lag(weight,1)
Upvotes: 3
Views: 1677
Reputation: 269371
Try this:
library(zoo)
z <- read.zoo("filelocation", header = TRUE, sep = ",")
diff(z)
Upvotes: 4
Reputation: 121568
When you manipulate time series, it is better to use the zoo or xts packages. Many time series operations, such as lag and diff, become very simple.
Here is an example using the xts package (I prefer this one):
library(xts)
# Read your data
dat <- read.table(text = 'date weight
2010-10-04 52495
2010-10-01 53000
2010-09-30 52916
2010-09-29 52785
2010-09-28 53348
2010-09-27 52885
2010-09-24 52174
2010-09-23 51461
2010-09-22 51286
2010-09-21 50968
2010-09-20 49250',header=TRUE)
# Construct the xts object, indexed by date
dat.xts <- xts(dat$weight, order.by = as.POSIXct(dat$date))
# Add two new columns: the lag-1 series (ll) and the first difference (dd)
merge(dat.xts, ll = lag(dat.xts), dd = diff(dat.xts))
dat.xts ll dd
2010-09-20 49250 NA NA
2010-09-21 50968 49250 1718
2010-09-22 51286 50968 318
2010-09-23 51461 51286 175
2010-09-24 52174 51461 713
2010-09-27 52885 52174 711
2010-09-28 53348 52885 463
2010-09-29 52785 53348 -563
2010-09-30 52916 52785 131
2010-10-01 53000 52916 84
2010-10-04 52495 53000 -505
Upvotes: 3
Reputation: 13363
Time-series objects are designed to track data sampled at equally spaced points in time. You have an uneven sampling interval, but ts(data) seems to do what you're looking for.
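For context on the zeros in the question, here is a minimal base R sketch (the vector `w` is made up for illustration). As far as I can tell, stats::lag() does not move any values; it only shifts the time index, so subtracting a lagged plain vector from itself gives zeros. With ts objects the subtraction is aligned on time, so a lag of -1 gives the usual first difference.

```r
# Illustrative values only, not the data from the question
w <- c(20, 40, 70, 110)

# On a plain numeric vector, lag() only attaches a shifted time index;
# the values stay in the same positions, so the difference is all zeros
zeros <- w - stats::lag(w, 1)
print(as.numeric(zeros))

# With ts objects, arithmetic aligns the two series on their time index,
# so x[t] - x[t-1] is obtained with a lag of -1
w_ts <- ts(w)
d_ts <- w_ts - stats::lag(w_ts, -1)
print(as.numeric(d_ts))

# diff() gives the same first difference directly
print(diff(w))
```

In practice diff() is the simpler route; the lag() version is shown only to explain the behaviour described in the question.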
Upvotes: 1
Reputation: 126
I think what you need is the difference between adjacent rows of the weight column. You can try:
weight <- c(20,40,70,110)
diff(weight)
[1] 20 30 40
since 40 - 20 = 20, 70 - 40 = 30, and so on. Similarly, try difftime on the dates in case you need the time gaps as well.
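As a rough sketch of the same idea on dated data, assuming a data frame `dat` built from three rows of the question's data (the name `dat` and the choice of rows are mine, for illustration only):

```r
# Three rows taken from the question's data, deliberately out of order
dat <- data.frame(
  date   = as.Date(c("2010-09-28", "2010-09-24", "2010-09-27")),
  weight = c(53348, 52174, 52885)
)

# Sort by date first so that adjacent rows are adjacent in time
dat <- dat[order(dat$date), ]

# Difference between consecutive weights
weight_diff <- diff(dat$weight)
print(weight_diff)

# Gap between consecutive dates, in days (the sampling is uneven)
day_gap <- difftime(dat$date[-1], dat$date[-nrow(dat)], units = "days")
print(day_gap)
```

Here the weight differences are 711 and 463, and the gaps are 3 days and 1 day, which shows the uneven spacing across the weekend.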
Upvotes: 1