Reputation: 25
Using data(FANG), say I know that there is a smoothed relationship between volume and opening price. Also I know the length of the most predictive rolling mean varies by stock. For some it is short, day 2 days. For others 10. I’d like to create multiple rolling means of lengths between 2 and 10 days for each stock.
So far I tried the tibbletime package and got a start so that I can calculate the multiple rolling means for one.
library(tibbletime)
library(tidyverse)
data("FANG")
FB <- FANG %>% filter(symbol == “FB”)
meanstep <- seq(2, 10, 1)
col_names <- map_chr(meanstep, ~paste0("rollmean_", .x))
rollers <- map(meanstep, ~rollify(mean, window = .x)) %>% set_names(nm = col_names)
FB_multiroll<- bind_cols(FB, invoke_map(rollers, x = FB$volume))
However, I can’t seem to figure out how to make this work when grouping by multiple stocks.
I tried adding:
FANG_with_multiroll<- FANG %>% group_by(symbol) %>% bind_cols(FANG, invoke_map(rollers, x =FANG$volume)
But that didn’t work. It created the rolling means, but not by group. It instead just takes the entire dataframe regardless 'symbol'. Any ideas would be appreciated. One I get it to work, I plan on finding the highest correlation or rsquared for each symbol. If you have ideas about better ways to do that too, I’m interested.
Upvotes: 0
Views: 309
Reputation: 42544
The OP has tagged this question with dplyr and purr but the question has not been honoured by an answer for 6 month now.
With version 1.12.0 (on CRAN 13 Jan 2019), the data.table package has gained a frollmean()
function which can be used to create multiple rolling means of different lengths by group.
data(FANG, package = "tibbletime")
library(data.table) # version 1.12.0 +
meanstep <- 2:10
FANG_with_multiroll <- as.data.table(FANG)[
, sprintf("rollmean_%02i", meanstep) := frollmean(volume, meanstep), by = symbol][]
FANG_with_multiroll
symbol date open high low close volume adjusted rollmean_02 rollmean_03 1: FB 2013-01-02 27.44 28.18 27.420 28.00 69846400 28.00 NA NA 2: FB 2013-01-03 27.88 28.47 27.590 27.77 63140600 27.77 66493500 NA 3: FB 2013-01-04 28.01 28.93 27.830 28.76 72715400 28.76 67928000 68567466.7 4: FB 2013-01-07 28.69 29.79 28.650 29.42 83781800 29.42 78248600 73212600.0 5: FB 2013-01-08 29.51 29.60 28.860 29.06 45871300 29.06 64826550 67456166.7 --- 4028: GOOG 2016-12-23 790.90 792.74 787.280 789.91 623400 789.91 796250 933733.3 4029: GOOG 2016-12-27 790.68 797.86 787.657 791.55 789100 791.55 706250 793866.7 4030: GOOG 2016-12-28 793.70 794.23 783.200 785.05 1132700 785.05 960900 848400.0 4031: GOOG 2016-12-29 783.33 785.93 778.920 782.79 742200 782.79 937450 888000.0 4032: GOOG 2016-12-30 782.75 782.78 770.410 771.82 1760200 771.82 1251200 1211700.0 rollmean_04 rollmean_05 rollmean_06 rollmean_07 rollmean_08 rollmean_09 rollmean_10 1: NA NA NA NA NA NA NA 2: NA NA NA NA NA NA NA 3: NA NA NA NA NA NA NA 4: 72371050 NA NA NA NA NA NA 5: 66377275 67071100 NA NA NA NA NA --- 4028: 931575 990440 1230083.3 1286314 1333588 1420944 1488560 4029: 897575 903080 956883.3 1167086 1224163 1273089 1357760 4030: 878575 944600 941350.0 982000 1162788 1214000 1259050 4031: 821850 851300 910866.7 912900 952025 1116056 1166820 4032: 1106050 1009520 1002783.3 1032200 1018813 1041822 1180470
In order to demonstrate that this is working for each group we can print the first few lines of each group (and also only the first 10 column):
FANG_with_multiroll[, head(.SD, 3), .SDcols = 1:10, by = symbol]
symbol symbol date open high low close volume adjusted rollmean_02 rollmean_03 1: FB FB 2013-01-02 27.4400 28.1800 27.4200 28.0000 69846400 28.00000 NA NA 2: FB FB 2013-01-03 27.8800 28.4700 27.5900 27.7700 63140600 27.77000 66493500 NA 3: FB FB 2013-01-04 28.0100 28.9300 27.8300 28.7600 72715400 28.76000 67928000 68567467 4: AMZN AMZN 2013-01-02 256.0800 258.1000 253.2600 257.3100 3271000 257.31000 NA NA 5: AMZN AMZN 2013-01-03 257.2700 260.8800 256.3700 258.4800 2750900 258.48001 3010950 NA 6: AMZN AMZN 2013-01-04 257.5800 259.8000 256.6500 259.1500 1874200 259.14999 2312550 2632033 7: NFLX NFLX 2013-01-02 95.2100 95.8100 90.6900 92.0100 19431300 13.14429 NA NA 8: NFLX NFLX 2013-01-03 91.9700 97.9200 91.5300 96.5900 27912500 13.79857 23671900 NA 9: NFLX NFLX 2013-01-04 96.5400 97.7100 95.5400 95.9800 17761100 13.71143 22836800 21701633 10: GOOG GOOG 2013-01-02 719.4212 727.0013 716.5512 723.2512 5101500 361.26435 NA NA 11: GOOG GOOG 2013-01-03 724.9313 731.9312 720.7212 723.6713 4653700 361.47415 4877600 NA 12: GOOG GOOG 2013-01-04 729.3412 741.4713 727.6812 737.9713 5547600 368.61701 5100650 5100933
Upvotes: 2