Reputation: 4930
In R, I have two vectors:
a <- c(1, 2, 3, 4)
b <- c(NA, 6, 7, 8)
How do I find the element-wise mean of the two vectors, ignoring NA values, without a loop? That is, I want to get the vector
(1, 4, 5, 6)
I know the function mean() and the argument na.rm = 1, but I don't know how to put them together. To be clear, in reality I have thousands of vectors with NA appearing at various places, so any dimension-dependent solution wouldn't work. Thanks.
Upvotes: 39
Views: 22598
Reputation: 52049
Another option is collapse::fmean, which defaults to column-wise means on matrices and na.rm = TRUE:
library(collapse)
fmean(rbind(a, b))
#[1] 1 4 5 6
Benchmark
Vectors a and b (length = 4):
microbenchmark::microbenchmark(
  collapse = collapse::fmean(rbind(a, b)),
  rowMeans = rowMeans(cbind(a, b), na.rm = TRUE),
  colMeans = colMeans(rbind(a, b), na.rm = TRUE),
  purrr = purrr::map2_dbl(a, b, ~ mean(c(.x, .y), na.rm = TRUE)),
  apply = apply(rbind(a, b), 2, mean, na.rm = TRUE)
)
# Unit: microseconds
# expr min lq mean median uq max neval
# collapse 6.501 7.9020 10.72705 9.7010 10.8010 56.101 100
# rowMeans 4.601 6.0505 9.21504 7.8010 9.4515 28.102 100
# colMeans 4.700 5.7010 7.76410 6.8515 8.2015 27.301 100
# purrr 94.101 104.4505 140.36694 108.8010 121.9510 2120.901 100
# apply 50.301 55.1010 65.37305 59.9005 65.6510 156.700 100
Large vectors (size 1e6):
a = sample(1e6)
b = sample(1e6)
# Unit: milliseconds
# expr min lq mean median uq max neval
# collapse 8.384401 9.621752 13.02568 10.160101 18.83060 34.2746 100
# rowMeans 18.504201 21.513251 27.88083 23.509051 31.28925 94.2124 100
# colMeans 8.117601 9.344551 12.69392 9.897702 12.50430 54.1703 100
Upvotes: 0
Reputation: 6165
A tidyverse solution using purrr:
library(purrr)
a <- c(1, 2, 3, 4)
b <- c(NA, 6, 7, 8)
# expected:
c(1, 4, 5, 6)
#> [1] 1 4 5 6
# actual:
map2_dbl(a, b, ~ mean(c(.x, .y), na.rm = TRUE))
#> [1] 1 4 5 6
And for any number of vectors:
pmap_dbl(list(a, b, a, b), compose(partial(mean, na.rm = TRUE), c))
#> [1] 1 4 5 6
Upvotes: 2
Reputation: 11764
How about:
rowMeans(cbind(a, b), na.rm=TRUE)
or
colMeans(rbind(a, b), na.rm=TRUE)
Upvotes: 45
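Since the question mentions thousands of vectors, here is a minimal sketch of how the same idea might scale. It assumes the vectors all have the same length and are collected in a list; the list name vecs and the third vector are made up for illustration and are not from the question.

```r
# Hypothetical input: a list holding any number of equal-length numeric vectors.
vecs <- list(
  a = c(1, 2, 3, 4),
  b = c(NA, 6, 7, 8),
  c = c(NA, NA, 5, 9)  # extra vector, added only for illustration
)

# Bind the vectors as columns, so each row holds one position across all vectors.
m <- do.call(cbind, vecs)

# Mean of each row, skipping the NA values found in that row.
elementwise_mean <- rowMeans(m, na.rm = TRUE)
print(elementwise_mean)
#> [1] 1 4 5 7
```

One thing to watch for: if every vector is NA at some position, rowMeans(..., na.rm = TRUE) returns NaN there rather than NA.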
Reputation: 18782
I'm not exactly sure what you are asking for, but does
apply(rbind(a,b),2,mean,na.rm = TRUE)
do what you want?
Upvotes: 5
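For what it's worth, a quick check (a sketch using only the vectors from the question) suggests the apply() form gives the same result as colMeans(). The apply() form is slower, but it lets you swap in another summary function; median is used below purely as an illustration.

```r
a <- c(1, 2, 3, 4)
b <- c(NA, 6, 7, 8)

# Each column of rbind(a, b) holds one position of both vectors.
m <- rbind(a, b)

# apply() calls mean() on every column (MARGIN = 2), forwarding na.rm = TRUE.
via_apply <- apply(m, 2, mean, na.rm = TRUE)
via_colmeans <- colMeans(m, na.rm = TRUE)

# Both approaches should agree.
print(all.equal(via_apply, via_colmeans))

# Illustration only: the same pattern with a different summary function.
via_median <- apply(m, 2, median, na.rm = TRUE)
```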