Reputation: 821
I have a matrix with 5 columns and 4 rows. I also have a vector with 3 elements. I want to subtract the values in the vector from columns 3, 4 and 5 respectively, in every row of the matrix.
b <- matrix(rep(1:20), nrow=4, ncol=5)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 9 13 17
[2,] 2 6 10 14 18
[3,] 3 7 11 15 19
[4,] 4 8 12 16 20
c <- c(5,6,7)
to get
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 4 7 10
[2,] 2 6 5 8 11
[3,] 3 7 6 9 12
[4,] 4 8 7 10 13
Upvotes: 44
Views: 68256
Reputation: 102900
We can use outer to create a masking matrix msk and then subtract it from b, e.g.,
> x <- c(5, 6, 7)
> (msk <- outer(rep(1, nrow(b)), replace(rep(0, ncol(b)), c(3, 4, 5), x)))
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 5 6 7
[2,] 0 0 5 6 7
[3,] 0 0 5 6 7
[4,] 0 0 5 6 7
> (b <- b - msk)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 4 7 10
[2,] 2 6 5 8 11
[3,] 3 7 6 9 12
[4,] 4 8 7 10 13
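The same masking idea can be wrapped in a small helper. This is only a sketch: `subtract_from_cols` is a hypothetical name (not a function from base R or any package), and it assumes `v` has one value per targeted column.

```r
# Hypothetical helper (not part of base R): subtract v[k] from column cols[k] of m.
subtract_from_cols <- function(m, cols, v) {
  stopifnot(length(cols) == length(v))
  # One row of the mask: zeros everywhere except the targeted columns.
  mask_row <- replace(rep(0, ncol(m)), cols, v)
  # outer() with a vector of ones repeats mask_row on every row of m.
  m - outer(rep(1, nrow(m)), mask_row)
}

b <- matrix(1:20, nrow = 4, ncol = 5)
res <- subtract_from_cols(b, 3:5, c(5, 6, 7))
```

Columns 1 and 2 are left as they were because the mask holds zeros there.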
Upvotes: 0
Reputation: 829
For performance, the %r-% operator from the {collapse} package appears to be the fastest option. See the following benchmark:
# the original question:
b <- matrix(rep(1:20), nrow=4, ncol=5)
c <- c(5,6,7)
box::use(collapse[`%r-%`], rray[`%b-%`])
bench::mark(
  collapse = b[, 3:5] %r-% c,
  transpose = t(t(b[, 3:5]) - c),
  sweep = sweep(b[, 3:5], 2, c),
  rray = b[, 3:5] %b-% matrix(c, nrow = 1),
  check = FALSE
)
#> # A tibble: 4 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 collapse 5.2µs 5.8µs 145873. 0B 14.6
#> 2 transpose 7.9µs 8.7µs 97164. 0B 9.72
#> 3 sweep 26.6µs 29.4µs 28480. 50.6KB 19.9
#> 4 rray 85.5µs 91.7µs 9262. 184.7KB 14.9
# further testing for larger data
data <- matrix(rnorm(100*1000), nrow = 100)
bench::press(
  ncol = c(10, 100, 1000), {
    b <- data[, 1:ncol]
    c <- rnorm(ncol)
    bench::mark(
      collapse = b %r-% c,
      transpose = t(t(b) - c),
      sweep = sweep(b, 2, c),
      rray = b %b-% matrix(c, nrow = 1),
      check = FALSE
    )
  }
)
#> Running with:
#> ncol
#> 1 10
#> 2 100
#> 3 1000
#> # A tibble: 12 × 7
#> expression ncol min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <dbl> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 collapse 10 5.3µs 7.3µs 84959. 7.86KB 8.50
#> 2 transpose 10 11.6µs 15.2µs 49990. 15.72KB 20.0
#> 3 sweep 10 29.9µs 33µs 24897. 15.72KB 17.4
#> 4 rray 10 56.6µs 66.1µs 13142. 9.94KB 12.7
#> 5 collapse 100 12.6µs 49.1µs 17810. 78.17KB 29.5
#> 6 transpose 100 49µs 119.3µs 7639. 156.34KB 26.1
#> 7 sweep 100 72.9µs 153.8µs 5882. 156.34KB 18.2
#> 8 rray 100 92.3µs 129.2µs 6690. 79KB 13.2
#> 9 collapse 1000 78µs 441.8µs 2023. 781.3KB 36.1
#> 10 transpose 1000 689.2µs 1.11ms 824. 1.53MB 28.9
#> 11 sweep 1000 459.8µs 1.18ms 748. 1.53MB 29.9
#> 12 rray 1000 402.2µs 758.4µs 1212. 789.16KB 25.9
Created on 2023-05-22 with reprex v2.0.2
Although {rray}'s %b-% can be faster than sweep for larger matrices, {collapse}'s %r-% outperforms all the other methods in these benchmarks.
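Because the benchmark switches off bench::mark's comparison of results, it may be worth confirming separately that the approaches agree. The following is a minimal base-R sketch covering only the transpose and sweep variants; the {collapse} and {rray} operators are left out so the snippet has no package dependencies, and the names `m`, `v`, `by_transpose` and `by_sweep` are illustrative.

```r
set.seed(1)
m <- matrix(rnorm(100 * 10), nrow = 100)
v <- rnorm(10)

# Subtract v[j] from column j in two different ways.
by_transpose <- t(t(m) - v)
by_sweep <- sweep(m, 2, v)

# Compare the two results up to floating-point tolerance.
all.equal(by_transpose, by_sweep)
```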
Upvotes: 2
Reputation: 93938
This is exactly what sweep was made for:
b <- matrix(rep(1:20), nrow=4, ncol=5)
x <- c(5,6,7)
b[,3:5] <- sweep(b[,3:5], 2, x)
b
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 5 4 7 10
#[2,] 2 6 5 8 11
#[3,] 3 7 6 9 12
#[4,] 4 8 7 10 13
...or even without subsetting or reassignment (starting from the original b):
sweep(b, 2, c(0,0,x))
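As a quick check, the two sweep calls can be compared directly. sweep subtracts by default (its FUN argument defaults to "-"), and with MARGIN = 2 each element of the statistics vector is paired with one column, so padding with zeros leaves columns 1 and 2 untouched. A minimal sketch, with `via_subset` and `via_padding` as illustrative names:

```r
b <- matrix(1:20, nrow = 4, ncol = 5)
x <- c(5, 6, 7)

# Variant 1: sweep only the affected columns and assign back.
via_subset <- b
via_subset[, 3:5] <- sweep(b[, 3:5], 2, x)

# Variant 2: sweep the whole matrix, subtracting 0 from columns 1 and 2.
via_padding <- sweep(b, 2, c(0, 0, x))

# Both variants should hold the same values.
all.equal(via_subset, via_padding)
```

Note that both variants start from the unmodified matrix; running the second on a matrix that was already updated would subtract twice.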
Upvotes: 80
Reputation: 9536
This can be done with the rray package in a very satisfying way, using its NumPy-like broadcasting subtraction operator %b-%:
#install.packages("rray")
library(rray)
b <- matrix(rep(1:20), nrow=4, ncol=5)
x <- c(5, 6, 7)
b[, 3:5] <- b[, 3:5] %b-% matrix(x, 1)
b
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 5 4 7 10
#> [2,] 2 6 5 8 11
#> [3,] 3 7 6 9 12
#> [4,] 4 8 7 10 13
For large matrices this is even faster than sweep:
#install.packages("bench")
res <- bench::press(
  size = c(10, 1000, 10000),
  frac_selected = c(0.1, 0.5, 1),
  {
    B <- matrix(sample(size*size), nrow=size, ncol=size)
    B2 <- B
    x <- sample(size, size=ceiling(size*frac_selected))
    idx <- sample(size, size=ceiling(size*frac_selected))
    bench::mark(
      rray = {B2[, idx] <- B[, idx, drop = FALSE] %b-% matrix(x, nrow = 1); B2},
      sweep = {B2[, idx] <- sweep(B[, idx, drop = FALSE], MARGIN = 2, x); B2}
    )
  }
)
plot(res)
Upvotes: 0
Reputation: 206606
Perhaps not that elegant, but
b <- matrix(rep(1:20), nrow=4, ncol=5)
x <- c(5,6,7)
b[,3:5] <- t(t(b[,3:5])-x)
should do the trick. We subset the matrix to change only the part we need, and we use t() (transpose) to flip the matrix so simple vector recycling will take care of subtracting from the correct row.
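To see why the transpose is needed, it may help to compare against the naive subtraction. R stores matrices column by column and recycles the shorter vector in that order, so subtracting x directly walks down each column instead of across each row. A small sketch (the names `m`, `naive`, `wanted` and `fixed` are illustrative):

```r
b <- matrix(1:20, nrow = 4, ncol = 5)
x <- c(5, 6, 7)
m <- b[, 3:5]

# Naive subtraction: x is recycled down the columns (column-major order),
# so x[1], x[2], x[3] are applied to successive rows, not successive columns.
naive <- m - x

# The result the question asks for, written out by hand.
wanted <- matrix(c(4, 5, 6, 7,
                   7, 8, 9, 10,
                   10, 11, 12, 13), nrow = 4)

# After transposing, the columns of m become rows, so recycling down the
# columns of t(m) lines x up with the original columns. Transpose back after.
fixed <- t(t(m) - x)

isTRUE(all.equal(naive, wanted))  # FALSE: recycling went the wrong way
isTRUE(all.equal(fixed, wanted))  # TRUE
```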
If you want to avoid the transpose, you could do something like
b[,3:5] <- b[,3:5]-x[col(b[,3:5])]
as well. Here we subset twice, and we use the second subset with col() to pick the matching element of x for each cell, because both matrices are indexed in the same order.
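To make the indexing step concrete, here is a short sketch of what col() contributes; `m`, `idx`, `expanded` and `res` are illustrative names.

```r
b <- matrix(1:20, nrow = 4, ncol = 5)
x <- c(5, 6, 7)
m <- b[, 3:5]

# col(m) has the same shape as m and holds each cell's column number.
idx <- col(m)

# Indexing x with it yields one value per cell, in column-major order:
# four copies of x[1], then four of x[2], then four of x[3].
expanded <- x[idx]

# Same length and same order as the cells of m, so plain subtraction lines up.
res <- m - expanded
```

Since `expanded` already has one entry per cell, no recycling is involved and no transpose is required.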
I think my favorite from the question that @thelatemail linked was
b[,3:5] <- sweep(b[,3:5], 2, x, `-`)
Upvotes: 12
Reputation: 89
Another way, with apply:
b[,3:5] <- t(apply(b[,3:5], 1, function(x) x-c))
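The t() around apply is there because apply with MARGIN = 1 collects each row's result as a column of its output. A minimal sketch of that behaviour, using illustrative names (`m`, `raw`, `fixed`) and `x` for the vector:

```r
b <- matrix(1:20, nrow = 4, ncol = 5)
x <- c(5, 6, 7)
m <- b[, 3:5]              # 4 rows, 3 columns

# Each call returns a length-3 vector; apply() binds those as columns.
raw <- apply(m, 1, function(row) row - x)
dim(raw)                   # 3 rows, 4 columns: transposed relative to m

# Transposing restores the original orientation.
fixed <- t(raw)
```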
Upvotes: 4
Reputation: 21067
A simple solution:
b <- matrix(rep(1:20), nrow=4, ncol=5)
c <- c(5,6,7)
# seq_len() is safe even if the matrix has zero rows
for (i in seq_len(nrow(b))) {
  b[i, 3:5] <- b[i, 3:5] - c
}
Upvotes: 2