Bob SomeAle
Bob SomeAle

Reputation: 25

R multiple rolling mean windows for data groups

Using data(FANG), say I know that there is a smoothed relationship between volume and opening price. Also I know the length of the most predictive rolling mean varies by stock. For some it is short, day 2 days. For others 10. I’d like to create multiple rolling means of lengths between 2 and 10 days for each stock.

So far I tried the tibbletime package and got a start so that I can calculate the multiple rolling means for one.

library(tibbletime)
library(tidyverse)

data("FANG")

FB <- FANG %>% filter(symbol == “FB”)

meanstep <- seq(2, 10, 1)

col_names <- map_chr(meanstep, ~paste0("rollmean_", .x))

rollers <- map(meanstep, ~rollify(mean, window = .x)) %>% set_names(nm = col_names)

FB_multiroll<- bind_cols(FB, invoke_map(rollers, x = FB$volume))

However, I can’t seem to figure out how to make this work when grouping by multiple stocks.

I tried adding:

FANG_with_multiroll<- FANG %>% group_by(symbol) %>% bind_cols(FANG, invoke_map(rollers, x =FANG$volume)

But that didn’t work. It created the rolling means, but not by group. It instead just takes the entire dataframe regardless 'symbol'. Any ideas would be appreciated. One I get it to work, I plan on finding the highest correlation or rsquared for each symbol. If you have ideas about better ways to do that too, I’m interested.

Upvotes: 0

Views: 309

Answers (1)

Uwe
Uwe

Reputation: 42544

The OP has tagged this question with and but the question has not been honoured by an answer for 6 month now.

With version 1.12.0 (on CRAN 13 Jan 2019), the package has gained a frollmean() function which can be used to create multiple rolling means of different lengths by group.

data(FANG, package = "tibbletime")
library(data.table)   # version 1.12.0 +
meanstep <- 2:10
FANG_with_multiroll <- as.data.table(FANG)[
  , sprintf("rollmean_%02i", meanstep) := frollmean(volume, meanstep), by = symbol][]
FANG_with_multiroll
      symbol       date   open   high     low  close   volume adjusted rollmean_02 rollmean_03
   1:     FB 2013-01-02  27.44  28.18  27.420  28.00 69846400    28.00          NA          NA
   2:     FB 2013-01-03  27.88  28.47  27.590  27.77 63140600    27.77    66493500          NA
   3:     FB 2013-01-04  28.01  28.93  27.830  28.76 72715400    28.76    67928000  68567466.7
   4:     FB 2013-01-07  28.69  29.79  28.650  29.42 83781800    29.42    78248600  73212600.0
   5:     FB 2013-01-08  29.51  29.60  28.860  29.06 45871300    29.06    64826550  67456166.7
  ---                                                                                         
4028:   GOOG 2016-12-23 790.90 792.74 787.280 789.91   623400   789.91      796250    933733.3
4029:   GOOG 2016-12-27 790.68 797.86 787.657 791.55   789100   791.55      706250    793866.7
4030:   GOOG 2016-12-28 793.70 794.23 783.200 785.05  1132700   785.05      960900    848400.0
4031:   GOOG 2016-12-29 783.33 785.93 778.920 782.79   742200   782.79      937450    888000.0
4032:   GOOG 2016-12-30 782.75 782.78 770.410 771.82  1760200   771.82     1251200   1211700.0
      rollmean_04 rollmean_05 rollmean_06 rollmean_07 rollmean_08 rollmean_09 rollmean_10
   1:          NA          NA          NA          NA          NA          NA          NA
   2:          NA          NA          NA          NA          NA          NA          NA
   3:          NA          NA          NA          NA          NA          NA          NA
   4:    72371050          NA          NA          NA          NA          NA          NA
   5:    66377275    67071100          NA          NA          NA          NA          NA
  ---                                                                                    
4028:      931575      990440   1230083.3     1286314     1333588     1420944     1488560
4029:      897575      903080    956883.3     1167086     1224163     1273089     1357760
4030:      878575      944600    941350.0      982000     1162788     1214000     1259050
4031:      821850      851300    910866.7      912900      952025     1116056     1166820
4032:     1106050     1009520   1002783.3     1032200     1018813     1041822     1180470

In order to demonstrate that this is working for each group we can print the first few lines of each group (and also only the first 10 column):

FANG_with_multiroll[, head(.SD, 3), .SDcols = 1:10, by = symbol]
    symbol symbol       date     open     high      low    close   volume  adjusted rollmean_02 rollmean_03
 1:     FB     FB 2013-01-02  27.4400  28.1800  27.4200  28.0000 69846400  28.00000          NA          NA
 2:     FB     FB 2013-01-03  27.8800  28.4700  27.5900  27.7700 63140600  27.77000    66493500          NA
 3:     FB     FB 2013-01-04  28.0100  28.9300  27.8300  28.7600 72715400  28.76000    67928000    68567467
 4:   AMZN   AMZN 2013-01-02 256.0800 258.1000 253.2600 257.3100  3271000 257.31000          NA          NA
 5:   AMZN   AMZN 2013-01-03 257.2700 260.8800 256.3700 258.4800  2750900 258.48001     3010950          NA
 6:   AMZN   AMZN 2013-01-04 257.5800 259.8000 256.6500 259.1500  1874200 259.14999     2312550     2632033
 7:   NFLX   NFLX 2013-01-02  95.2100  95.8100  90.6900  92.0100 19431300  13.14429          NA          NA
 8:   NFLX   NFLX 2013-01-03  91.9700  97.9200  91.5300  96.5900 27912500  13.79857    23671900          NA
 9:   NFLX   NFLX 2013-01-04  96.5400  97.7100  95.5400  95.9800 17761100  13.71143    22836800    21701633
10:   GOOG   GOOG 2013-01-02 719.4212 727.0013 716.5512 723.2512  5101500 361.26435          NA          NA
11:   GOOG   GOOG 2013-01-03 724.9313 731.9312 720.7212 723.6713  4653700 361.47415     4877600          NA
12:   GOOG   GOOG 2013-01-04 729.3412 741.4713 727.6812 737.9713  5547600 368.61701     5100650     5100933

Upvotes: 2

Related Questions