Reputation: 821

Subtract a constant vector from each row in a matrix in R

I have a matrix with 5 columns and 4 rows. I also have a vector of length 3. I want to subtract the values in the vector from columns 3, 4, and 5 respectively, in every row of the matrix.

b <- matrix(rep(1:20), nrow=4, ncol=5)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20

c <- c(5,6,7)

to get

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    4    7   10
[2,]    2    6    5    8   11
[3,]    3    7    6    9   12
[4,]    4    8    7   10   13

Upvotes: 44

Answers (7)

ThomasIsCoding

Reputation: 102900

We can use outer to create a masking matrix msk and then subtract it from b, e.g.,

> x <- c(5, 6, 7)

> (msk <- outer(rep(1, nrow(b)), replace(rep(0, ncol(b)), c(3, 4, 5), x)))
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    5    6    7
[2,]    0    0    5    6    7
[3,]    0    0    5    6    7
[4,]    0    0    5    6    7

> (b <- b - msk)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    4    7   10
[2,]    2    6    5    8   11
[3,]    3    7    6    9   12
[4,]    4    8    7   10   13

Upvotes: 0

Liang Zhang

Reputation: 829

If performance matters, the %r-% operator from the {collapse} package looks like the best choice. See the following benchmark:

# the original question:
b <- matrix(rep(1:20), nrow=4, ncol=5)
c <- c(5,6,7)
box::use(collapse[`%r-%`], rray[`%b-%`])
bench::mark(
  collapse = b[, 3:5] %r-% c,
  transpose = t(t(b[, 3:5]) - c),
  sweep = sweep(b[, 3:5], 2, c),
  rray = b[, 3:5] %b-% matrix(c, nrow = 1),
  check = FALSE
)
#> # A tibble: 4 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 collapse      5.2µs    5.8µs   145873.        0B    14.6 
#> 2 transpose     7.9µs    8.7µs    97164.        0B     9.72
#> 3 sweep        26.6µs   29.4µs    28480.    50.6KB    19.9 
#> 4 rray         85.5µs   91.7µs     9262.   184.7KB    14.9

# further testing for larger data
data <- matrix(rnorm(100*1000), nrow = 100)
bench::press(
  ncol = c(10, 100, 1000), {
    b <- data[, 1:ncol]
    c <- rnorm(ncol)
    bench::mark(
      collapse = b %r-% c,
      transpose = t(t(b) - c),
      sweep = sweep(b, 2, c),
      rray = b %b-% matrix(c, nrow = 1),
      check = FALSE
    )
  }
)
#> Running with:
#>    ncol
#> 1    10
#> 2   100
#> 3  1000
#> # A tibble: 12 × 7
#>    expression  ncol      min   median `itr/sec` mem_alloc `gc/sec`
#>    <bch:expr> <dbl> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#>  1 collapse      10    5.3µs    7.3µs    84959.    7.86KB     8.50
#>  2 transpose     10   11.6µs   15.2µs    49990.   15.72KB    20.0 
#>  3 sweep         10   29.9µs     33µs    24897.   15.72KB    17.4 
#>  4 rray          10   56.6µs   66.1µs    13142.    9.94KB    12.7 
#>  5 collapse     100   12.6µs   49.1µs    17810.   78.17KB    29.5 
#>  6 transpose    100     49µs  119.3µs     7639.  156.34KB    26.1 
#>  7 sweep        100   72.9µs  153.8µs     5882.  156.34KB    18.2 
#>  8 rray         100   92.3µs  129.2µs     6690.      79KB    13.2 
#>  9 collapse    1000     78µs  441.8µs     2023.   781.3KB    36.1 
#> 10 transpose   1000  689.2µs   1.11ms      824.    1.53MB    28.9 
#> 11 sweep       1000  459.8µs   1.18ms      748.    1.53MB    29.9 
#> 12 rray        1000  402.2µs  758.4µs     1212.  789.16KB    25.9

Created on 2023-05-22 with reprex v2.0.2

Although %b-% from {rray} is faster than sweep for the larger matrices, %r-% from {collapse} has the lowest median time at every size tested.
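
Because the benchmark runs with the result check switched off, bench::mark does not confirm that the expressions return the same values. A minimal base-R sketch of such a check for the two base approaches follows; the names `v`, `via_transpose` and `via_sweep` are made up for this illustration, and the two package operators are left out so that nothing beyond base R is assumed.

```r
set.seed(1)                                  # reproducible random data
b <- matrix(rnorm(100 * 10), nrow = 100)     # 100 x 10 test matrix
v <- rnorm(10)                               # one value per column

via_transpose <- t(t(b) - v)                 # recycle v along the transposed rows
via_sweep     <- sweep(b, 2, v)              # sweep v out of margin 2 (columns)

# TRUE when both approaches agree up to numerical tolerance
isTRUE(all.equal(via_transpose, via_sweep))
```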

Upvotes: 2

thelatemail

Reputation: 93938

This is exactly what sweep was made for:

b <- matrix(rep(1:20), nrow=4, ncol=5)
x <- c(5,6,7)

b[,3:5] <- sweep(b[,3:5], 2, x)
b

#     [,1] [,2] [,3] [,4] [,5]
#[1,]    1    5    4    7   10
#[2,]    2    6    5    8   11
#[3,]    3    7    6    9   12
#[4,]    4    8    7   10   13

...or even without subsetting or subset assignment, by padding x with zeros for the columns to leave unchanged:

sweep(b, 2, c(0,0,x))
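
As a quick check, the sketch below compares the two forms on a fresh copy of the matrix (the subset assignment above has already changed b, so running the padded version on that b would subtract x a second time). The names `b0`, `by_subset` and `by_padding` are hypothetical and only used here.

```r
b0 <- matrix(1:20, nrow = 4, ncol = 5)       # untouched copy of the question's matrix
x  <- c(5, 6, 7)

# Form 1: sweep only columns 3:5 and assign the result back
by_subset <- b0
by_subset[, 3:5] <- sweep(b0[, 3:5], 2, x)   # FUN defaults to "-"

# Form 2: sweep the whole matrix, subtracting 0 from columns 1 and 2
by_padding <- sweep(b0, 2, c(0, 0, x))

all(by_subset == by_padding)                 # both forms give the same values
```

The padded form returns a new matrix, so it still has to be assigned (for example `b <- sweep(b, 2, c(0, 0, x))`) if the result should replace b.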

Upvotes: 80

jan-glx

Reputation: 9536

This can be done neatly with the rray package, using its NumPy-like broadcasting subtraction operator %b-%:

#install.packages("rray")
library(rray)

b <- matrix(rep(1:20), nrow=4, ncol=5)
x <- c(5, 6, 7)

b[, 3:5] <- b[, 3:5] %b-% matrix(x, 1)
b
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    5    4    7   10
#> [2,]    2    6    5    8   11
#> [3,]    3    7    6    9   12
#> [4,]    4    8    7   10   13

For large matrices this is even faster than sweep:

#install.packages("bench")
res <- bench::press(
  size = c(10, 1000, 10000),
  frac_selected = c(0.1, 0.5, 1),
  {
  B <- matrix(sample(size*size), nrow=size, ncol=size)
  B2 <- B
  x <- sample(size, size=ceiling(size*frac_selected))
  idx <- sample(size, size=ceiling(size*frac_selected))

  bench::mark(rray = {B2[, idx] <- B[, idx, drop = FALSE] %b-% matrix(x, nrow = 1); B2}, 
              sweep = {B2[, idx] <- sweep(B[, idx, drop = FALSE], MARGIN = 2, x); B2}
  )
  }
)
plot(res)

Upvotes: 0

MrFlick

Reputation: 206606

Perhaps not that elegant, but

b <- matrix(rep(1:20), nrow=4, ncol=5)
x <- c(5,6,7)

b[,3:5] <- t(t(b[,3:5])-x)

should do the trick. We subset the matrix to change only the part we need, and we use t() (transpose) to flip it so that simple vector recycling subtracts each element of x from the correct column of the original matrix.
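
To illustrate why the transpose matters, here is a small sketch of how R recycles a vector against a matrix. Matrices are stored column by column, so a length-3 vector only lines up with the three columns once the 4 x 3 block has been flipped to 3 x 4. The names `m`, `wrong` and `right` are illustrative and not part of the code above.

```r
b <- matrix(1:20, nrow = 4, ncol = 5)
x <- c(5, 6, 7)
m <- b[, 3:5]              # the 4 x 3 block we want to change

# Direct subtraction recycles x down each column in storage order,
# so 5, 6, 7, 5, ... are applied to successive rows, not to columns.
wrong <- m - x

# After transposing, each column of t(m) is one original row of length 3,
# so x lines up with it exactly; transposing back restores the 4 x 3 shape.
right <- t(t(m) - x)

right[, 1]                 # column 3 of b minus 5
```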

If you want to avoid the transpose, you could do something like

b[,3:5] <- b[,3:5]-x[col(b[,3:5])]

as well. Here we subset twice: the second subset is passed to col() to get the column number of every cell, and that number picks the matching value of x. This works because both matrices are indexed in the same (column-major) order.
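
The intermediate values can be inspected with a short sketch like the one below. The names `m`, `idx` and `expanded` are introduced only for this illustration.

```r
b <- matrix(1:20, nrow = 4, ncol = 5)
x <- c(5, 6, 7)
m <- b[, 3:5]          # 4 x 3 block

idx <- col(m)          # 4 x 3 integer matrix; each cell holds its own column number
expanded <- x[idx]     # x repeated per column, in the same order m is stored

result <- m - expanded # elementwise subtraction; the result keeps the dimensions of m
result
```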

I think my favorite from the question that @thelatemail linked was

b[,3:5] <- sweep(b[,3:5], 2, x, `-`)

Upvotes: 12

psadosky

Reputation: 89

Another way, with apply:

b[,3:5] <- t(apply(b[,3:5], 1, function(x) x-c))
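
The outer t() is there because apply() over rows returns each row's result as a column. A minimal sketch of the intermediate shapes, using the hypothetical names `v`, `raw` and `fixed` (and `v` instead of `c` to avoid confusion with the c() function):

```r
b <- matrix(1:20, nrow = 4, ncol = 5)
v <- c(5, 6, 7)

# apply() with MARGIN = 1 calls the function once per row and binds
# the length-3 results together as columns, giving a 3 x 4 matrix.
raw <- apply(b[, 3:5], 1, function(row) row - v)
dim(raw)

# Transposing restores the original 4 x 3 orientation.
fixed <- t(raw)
dim(fixed)
```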

Upvotes: 4

Barranka

Reputation: 21067

A simple solution:

b <- matrix(rep(1:20), nrow=4, ncol=5)
c <- c(5,6,7)

for (i in seq_len(nrow(b))) {
  b[i,3:5] <- b[i,3:5] - c
}

Upvotes: 2
