Reputation: 3003
I have a table of all of the trades on a market exchange for one security over a period of time.
Epoch Date Time Price
1.452033e+12 2016-01-05 14:37:38 0.00220556
1.452033e+12 2016-01-05 14:37:31 0.00220888
1.452033e+12 2016-01-05 14:37:15 0.00220888
I want to see if there is any connection between price and time of day. My plan is to take mean price for each day and subtract it from each price for that day to get the difference from the mean. Then, for each hour interval, calculate the mean difference. Then, for each of the 24 hourly periods of the day, calculate the mean hourly mean difference.
So far I figured out how to use tapply() to get the mean price for each day. I could use for
loops for everything else, but I want to learn the tricks for doing it more concisely in R.
Upvotes: 1
Views: 120
Reputation: 10483
Using the dplyr package you can do something like this:
library(dplyr)
df %>% group_by(Date) %>% mutate(normalizedPrice = Price - mean(Price, na.rm = TRUE))
To do the same on an hourly basis, you can add a new column for 'hour of day' and perform a very similar operation:
df$hourOfDay <- format(df$Time, '%H') # You will need time in POSIXct format
df %>% group_by(hourOfDay) %>% mutate(normalizedHourOfDayPrice = Price - mean(Price, na.rm = TRUE))
Note, if you leave time in the current format, you can also use substr()
to extract hourOfDay.
You can also group by day and hourOfDay like this:
df %>% group_by(Date, hourOfDay) %>% mutate(normalizedDailyHourOfDayPrice = Price - mean(Price, na.rm = TRUE))
And so on....does this work?
Upvotes: 3