Reputation: 501
I have the following working code:
x <- c(10.7, 13.0, 11.4, 11.5, 12.5, 14.1, 14.8, 14.1, 12.6, 16.0, 11.7, 10.6,
10.0, 11.4, 7.9, 9.5, 8.0, 11.8, 10.5, 11.2, 9.2, 10.1, 10.4, 10.5)
cusum <- function(x) {
s <- NA
mn <- mean(x)
for (i in seq_along(x)) {
if (i == 1)
s[i] <- 0 + x[i] - mn
else
s[i] <- s[i - 1] + x[i] - mn
}
s
}
cusum(x)
I wish to vectorize my code for performance, but I do not know how because:
s[i - 1]
to compute s[i]
) in lapply
and al.Reduce
(i.e. s
and x
)How can I vectorize my function in base R? I work in a restrictive environment where I only have access to base R.
Upvotes: 1
Views: 140
Reputation: 160417
@Cettt's answer is the most direct, and if that's what is needed, stick with it. However, if your point is that you don't know how to vectorize an arbitrary function, then ...
When your calculations rely on a value and the previous value, that's one thing, and vectorizing can be done. However, it looks as if your calculations rely on a previous calculation, in which case it's a different thing altogether. Your function appears to be the latter, so here's an attempt:
func <- function(x) {
mn <- mean(x)
Reduce(function(a, b) {
a + b - mn
}, x, init = 0, accumulate = TRUE)[-1]
}
func(x)
# [1] -6.96e-01 9.08e-01 9.12e-01 1.02e+00 2.12e+00 4.82e+00 8.23e+00
# [8] 1.09e+01 1.21e+01 1.67e+01 1.70e+01 1.62e+01 1.49e+01 1.49e+01
# [15] 1.14e+01 9.47e+00 6.07e+00 6.47e+00 5.58e+00 5.38e+00 3.19e+00
# [22] 1.89e+00 8.96e-01 -1.60e-14
Walk-through:
Reduce
first calls func(x[1], x[2])
, let's call that result res[1]
. It then calls func(res[1], x[2])
(and stores in res[2]
). Then func(res[2], x[3])
, etc.
Since you have a different first-condition (if (i == 1)
), then we use init=0
to seed that initial state. We could also have done something like:
func <- function(x) {
mn <- mean(x)
Reduce(function(a, b) {
if (is.null(a)) {
0 + b - mn
} else {
a + b - mn
}
}, x, init = NULL, accumulate = TRUE)[-1]
}
allowing for a more-complex initial condition.
Since we defined init=
, we also need to drop the first returned value in this case, ergo the [-1]
.
Reduce
defaults to only returning the last calculated value (-1.60e-14
here), so we add accumulate=TRUE
to capture all of the calculated values.Upvotes: 2
Reputation: 11981
You don't need loops for this particular iteration: just use cumsum
on x - mean(x)
:
cumsum(x - mean(x))
[1] -6.958333e-01 9.083333e-01 9.125000e-01 1.016667e+00 2.120833e+00 4.825000e+00 8.229167e+00 1.093333e+01 1.213750e+01
[10] 1.674167e+01 1.704583e+01 1.625000e+01 1.485417e+01 1.485833e+01 1.136250e+01 9.466667e+00 6.070833e+00 6.475000e+00
[19] 5.579167e+00 5.383333e+00 3.187500e+00 1.891667e+00 8.958333e-01 -1.598721e-14
Upvotes: 2