Reputation: 499
I have trying to find an efficient way to do the code below:
library(zoo)
MaPrice <- function(x,N) {
Mavg <- rollapply(x, N, mean)
colnames(Mavg) <- "MaPrice"
Mavg
}
Price.MA.1Hr <- MaPrice(out, 12)
Price.MA.2Hr <- MaPrice(out, 24)
Price.MA.4Hr <- MaPrice(out, 48)
Price.MA.6Hr <- MaPrice(out, 72)
The solution I came up with is the following:
MaPrice <- function(x,N) {
MA <- matrix( ,nrow = nrow(x), ncol = length(N))
for (i in 1:length(N)) {
MA[,i]<- rollapply(x, N[i], mean)
}
MA
}
N <- c(1,2,4,6,8,12)
Price.MA <- MaPrice(Price, N)
Price is a vector (ncol = 1)
This still provides the answer I am looking for, but I am looking to see if there is an alternate maybe an efficient way. Any help is greatly appreciated.
Note: Already looked at the question "rollapply multiple time with diffrent arguments" on SO. Didnt understand the process.
Upvotes: 0
Views: 863
Reputation: 13591
N <- c(1,2,4,6,8,12)
set.seed(1)
Price <- data.frame(x=runif(20)*10)
# x
# 1 2.6550866
# 2 3.7212390
# 3 5.7285336
# 4 9.0820779
# etc
NOTE Price
can also be a vector and the solution works
Rewrite your function to return a data.frame with the N
value used
MaPrice <- function(x,N) {
Mavg <- data.frame(N = N, avg = rollapply(x, N, mean))
Mavg
}
You can use purrr::map_df
to iterate through N
library(purrr)
Price.MA <- map_df(N, ~MaPrice(Price,.x))
N x
1 1 2.6550866
2 1 3.7212390
3 1 5.7285336
4 1 9.0820779
5 1 2.0168193
# etc
Since you might be interested in performance
Make Price
a vector of 25,000 elements
N <- c(1,2,4,6,8,12)
set.seed(1)
Price <- runif(25000)*10
# parallel solution
library(parallel)
library(zoo)
PoGibas <- function(Price, N) {
res <- mclapply(N, function(i)
data.frame(i, rollapply(Price, i, mean)))
# Final result
do.call("rbind", res)
}
# map_df solution
library(purrr)
MaPrice <- function(x,N) {
Mavg <- data.frame(N = N, avg = rollapply(x, N, mean))
Mavg
}
CP <- function(Price, N) {
Price.MA <- map_df(N, ~MaPrice(Price,.x))
}
# mapply solution
out <- tbl_df(Price)
CArendt <- function() {
mapply(function(x, n) {
rollapply(x, n, mean, fill = NA, align = "right")
}, list(out), list(1, 2, 4, 6, 8, 12)) %>% tbl_df()
}
# lapply zoo solution
library(zoo)
library(dplyr)
GG <- function(v, w) {
z <- zoo(v)
zz <- do.call("merge", lapply(setNames(w, w), rollmeanr, x = z))
}
Using microbechmark
to compare solutions
library(microbenchmark)
microbenchmark(CP(Price,N), PoGibas(Price,N), CArendt())
expr min lq mean median uq max
CP(Price, N) 298.7038 308.9860 345.8867 334.0053 377.5278 468.1461
PoGibas(Price, N) 306.3882 319.5721 358.8717 372.9655 388.6214 488.5565
CArendt() 2589.2316 2647.2216 2762.0759 2682.7357 2733.5398 8746.8235
GG(Price, N) 785.3042 853.5904 876.4554 869.0996 895.1906 1010.1746
neval
100
100
100
100
The mean time of solutions is 353, 371, 876, and >2,000 ms
Upvotes: 2
Reputation: 270448
Assuming that the input vector is v
this gives a zoo object zz
whose ith column was formed using w[i]
. as.data.frame(zz)
or coredata(zz)
could be used to produce a data.frame or matrix respectively if needed. setNames(w, w)
could be reduced to just w
if column names are not important.
# inputs
v <- 1:100 # data
w <- c(12, 24, 48, 72)
z <- zoo(v)
zz <- do.call("merge", lapply(setNames(w, w), rollmeanr, x = z))
or if a list of vectors is sufficient then:
lapply(setNames(w, w), rollmean, x = v)
Upvotes: 3
Reputation: 1757
So in the future, providing example data can make things easier for those who are trying to help. Also, it can speed things up to include packages and library
statements and such so that they can run your code verbatim. (See the reprex
package for useful tools in asking a good question).
I am a fan of using the apply
family and list-based processing, so I would lean towards the following, along with dplyr
. Getting used to the apply
family can make this sort of task straightforward. Essentially, mapply
loops over lists, applying the ith element to the ith call of the function (and recycling where needed).
library(zoo)
library(dplyr)
out <- tbl_df(randu[, 1])
## example with one
out %>% mutate(test = rollapply(., 12, mean, fill = NA))
#> # A tibble: 400 x 2
#> value test
#> <dbl> <dbl>
#> 1 0.000031 NA
#> 2 0.044495 NA
#> 3 0.822440 NA
#> 4 0.322291 NA
#> 5 0.393595 NA
#> 6 0.309097 0.4633195
#> 7 0.826368 0.5074730
#> 8 0.729424 0.5794351
#> 9 0.317649 0.5804980
#> 10 0.599793 0.5593651
#> # ... with 390 more rows
## example with multiple, using mapply - basically just applying rollapply...
mapply(function(x, n) {
rollapply(x, n, mean, fill = NA, align = "right")
}, list(out), list(1, 2, 4, 6, 8, 12)) %>% tbl_df()
#> # A tibble: 400 x 6
#> V1 V2 V3 V4 V5 V6
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0.000031 NA NA NA NA NA
#> 2 0.044495 0.0222630 NA NA NA NA
#> 3 0.822440 0.4334675 NA NA NA NA
#> 4 0.322291 0.5723655 0.2973143 NA NA NA
#> 5 0.393595 0.3579430 0.3957053 NA NA NA
#> 6 0.309097 0.3513460 0.4618558 0.3153248 NA NA
#> 7 0.826368 0.5677325 0.4628377 0.4530477 NA NA
#> 8 0.729424 0.7778960 0.5646210 0.5672025 0.4309676 NA
#> 9 0.317649 0.5235365 0.5456345 0.4830707 0.4706699 NA
#> 10 0.599793 0.4587210 0.6183085 0.5293210 0.5400821 NA
#> # ... with 390 more rows
## with lapply - probably more appropriate
lapply(list(1, 2, 4, 6, 8, 12)
, FUN = function(x, n) {
return(rollapply(x, n, mean, fill = NA, align = "right"))
}, x = out) %>% setNames(., paste0("v", 1:6)) %>% do.call(bind_cols, .)
#> # A tibble: 400 x 6
#> v1 v2 v3 v4 v5 v6
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0.000031 NA NA NA NA NA
#> 2 0.044495 0.0222630 NA NA NA NA
#> 3 0.822440 0.4334675 NA NA NA NA
#> 4 0.322291 0.5723655 0.2973143 NA NA NA
#> 5 0.393595 0.3579430 0.3957053 NA NA NA
#> 6 0.309097 0.3513460 0.4618558 0.3153248 NA NA
#> 7 0.826368 0.5677325 0.4628377 0.4530477 NA NA
#> 8 0.729424 0.7778960 0.5646210 0.5672025 0.4309676 NA
#> 9 0.317649 0.5235365 0.5456345 0.4830707 0.4706699 NA
#> 10 0.599793 0.4587210 0.6183085 0.5293210 0.5400821 NA
#> # ... with 390 more rows
One last note - I definitely recommend avoiding .
in your variable names, as the period is used for S3 class dispatch (and it has been painful to remove the periods out of all of my code written before I knew that handy tidbit). Further reading on style
Upvotes: 2
Reputation: 28379
As you wanted alternative efficient way, here's solution using parallel
. I'm applying rollapply
on N
vector (no reason to loop), but instead of usual apply we're running those things in parallel.
# Packages
library(parallel)
library(zoo)
# Input
N <- 1:4
Price <- 1:10
# Main computation
res <- mclapply(N, function(i)
data.frame(i, rollapply(Price, i, mean)))
# Final result
do.call("rbind", res)
Upvotes: 2