boshek

Reputation: 4406

Method to compare previous day to current day values

I am looking for a better way to compare a value from a day (day X) to the previous day (day X-1). Here I am using the airquality dataset. Suppose I am interested in comparing the wind from one day to the wind from the previous day. Right now I am using merge() to bring together two dataframes - one current day dataframe and one from the previous day. I am also just subtracting 1 from the Day column to get the PrevDay column:

airquality$PrevDay=airquality$Day-1

airquality.comp <- merge(
  airquality[,c("Wind","Day")],
  airquality[,c("Temp","PrevDay")], 
  by.x=c("Day"),by.y=c("PrevDay"))

My issue here is that I'd need to create another dataframe if I wanted to look back 2 days or if I wanted to switch Wind and Temp and look at them the other way. This just seems clunky. Can anyone recommend a better way of doing this?

Upvotes: 1

Views: 284

Answers (4)

Mark S

Reputation: 613

If you are specifically interested in autocorrelation or cross-correlation, you might also consider mutual information, which works for non-Gaussian data as well. Both the infotheo and entropy packages for R have built-in functions for it.

Upvotes: 1

Joe R

Reputation: 51

It depends on what question you are trying to answer, but I would look into autocorrelation (the correlation of a time series with its own lagged values). The acf() function compares a time series to itself and will help you see which lags are significantly correlated.

Or, if you want to compare two different metrics (such as Wind and Temp), try the ccf() function: it takes two vectors and computes their cross-correlation at a range of lags. For example:

ccf(airquality$Wind,airquality$Temp)
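
If you want the numbers rather than the plot, something like the following should work. It is only a sketch: the names aq, cc, cc_table and lag0 are made up for illustration, and lag.max = 3 is an arbitrary choice.

```r
# Cross-correlation of Wind and Temp at lags -3..3, returned as values instead of a plot.
aq <- datasets::airquality
cc <- ccf(aq$Wind, aq$Temp, lag.max = 3, plot = FALSE)

# One row per lag; per ?ccf, lag k estimates the correlation of Wind[t + k] with Temp[t].
cc_table <- data.frame(lag = as.vector(cc$lag), correlation = as.vector(cc$acf))
print(cc_table)

# The lag-0 entry is the ordinary Pearson correlation of the two columns.
lag0 <- cc_table$correlation[cc_table$lag == 0]
```

Negative lags pair each day's Wind with a later day's Temp, positive lags with an earlier day's Temp, so check the sign convention before reading the table.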

Upvotes: 1

Pierre L

Reputation: 28441

IMO data.table may be harder to get used to compared to dplyr, but it will save your tail later when you need robust analysis:

library(data.table)
setDT(airquality)[, shift(Wind, n=2L, type="lag") < Wind]

In base R, you can prepend an NA and drop the last value to build the previous-day vector, then compare:

with(airquality, c(NA,head(Wind,-1)) < Wind)
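
The same base R idea can be wrapped in a small function so that looking back 2 days, or swapping Wind and Temp, does not need another merge. This is a rough sketch: lag_by and the new column names are hypothetical, and it assumes the rows are in date order with no missing days (which appears to hold for airquality).

```r
# Hypothetical helper: shift a vector back by k positions, padding the front with NA.
lag_by <- function(x, k = 1L) {
  stopifnot(k >= 0L, k <= length(x))
  # Indexing with seq_len() avoids head(x, -k), which returns nothing when k is 0.
  c(rep(NA, k), x[seq_len(length(x) - k)])
}

aq <- datasets::airquality
aq$WindPrev1 <- lag_by(aq$Wind, 1L)    # Wind one day earlier
aq$TempPrev2 <- lag_by(aq$Temp, 2L)    # Temp two days earlier
aq$WindUp    <- aq$Wind > aq$WindPrev1 # TRUE where Wind rose since the previous day

head(aq[, c("Month", "Day", "Wind", "WindPrev1", "Temp", "TempPrev2", "WindUp")])
```

Because this works on row position rather than on the Day column, the last day of one month lines up with the first day of the next, which a merge on Day alone would not do.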

Upvotes: 3

DatamineR

Reputation: 9618

What kind of comparison do you need?

For example, to check whether each value is greater than the previous one, you could use:

library(dplyr)
with(airquality, lag(Wind) < Wind)

Or with two lags:

with(airquality, lag(Wind, 2) < Wind)
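
If you would rather keep the lagged values next to the originals, the same lag() calls can go inside mutate(). A minimal sketch, with made-up column names (WindPrev1, TempPrev2, WindUp, TempUp2) and assuming there are no missing days:

```r
library(dplyr)

aq <- datasets::airquality %>%
  arrange(Month, Day) %>%        # lag() follows row order, so sort by date first
  mutate(
    WindPrev1 = lag(Wind, 1),    # Wind one day earlier
    TempPrev2 = lag(Temp, 2),    # Temp two days earlier
    WindUp    = Wind > WindPrev1,
    TempUp2   = Temp > TempPrev2
  )

head(aq)
```

Looking back a different number of days, or at a different column, is then just another line in mutate().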

Upvotes: 1
