How to apply the function to each row?

Question

I want to generate 4 new columns from an existing variable total by random sampling. The results for each row should meet the condition s1 + s2 + s3 + s4 == total. For example,

> tabulate(sample.int(4, 100, replace = TRUE))
[1] 22 21 27 30

The following code does not work: the function appears to use only the first value of total, and the four resulting counts are then recycled down every new column.

library(data.table)
DT <- data.table(total = c(100, 110, 90, 92))
DT[, c(paste0("s", 1:4)) := tabulate(sample.int(4, total, replace = TRUE))]

> DT
   total s1 s2 s3 s4
1:   100 31 31 31 31
2:   110 25 25 25 25
3:    90 22 22 22 22
4:    92 22 22 22 22

How can I get around this? I am clearly missing some basic understanding of how R vectors and lists work. Your help will be much appreciated.

eliot.mcintire · Accepted Answer

Edited following edited question:

data.table expects a list on the right-hand side of := when you assign to several columns at once. To have the expression evaluated separately for each row, group by row number with by:

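library(data.table)
DT <- data.table(total = c(100, 110, 90, 102, 92))
# by = one group per row, so total is a single value in each call;
# as.list() makes the four counts fill four columns instead of four rows.
DT[, c(paste0("s", 1:4)) := {
  as.list(tabulate(sample.int(4, total, replace = TRUE)))
  }, by = seq_len(nrow(DT))]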

This gives output like the following, which satisfies the OP's criterion (the exact counts differ from run to run because the sampling is random):

> DT
   total s1 s2 s3 s4
1:   100 27 28 28 17
2:   110 25 23 36 26
3:    90 26 19 26 19
4:   102 28 24 21 29
5:    92 17 27 22 26
> apply(DT[, 2:5], 1, sum)
[1] 100 110  90 102  92
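
One caveat with the approach above: tabulate() by default returns only as many bins as the largest value it sees, so for a small total where the value 4 is never drawn it can return fewer than four counts; passing nbins = 4 avoids that. As an alternative, here is a minimal sketch that draws the counts with rmultinom() instead of tabulating a sample. It is not part of the accepted answer; the seed and the helper names counts and cols are assumptions made for illustration only.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed so the run can be repeated
DT <- data.table(total = c(100, 110, 90, 102, 92))
cols <- paste0("s", 1:4)  # hypothetical helper holding the new column names

# rmultinom(1, n, prob) returns a 4 x 1 matrix of counts that sum to n.
# Equal probabilities correspond to sample.int(4, n, replace = TRUE).
# sapply() binds the results into a 4 x nrow(DT) matrix, so transpose it.
counts <- t(sapply(DT$total, function(n) rmultinom(1, size = n, prob = rep(1, 4))))

# as.data.table() turns the matrix into a list of four columns,
# which is the shape := needs for a multi-column assignment.
DT[, (cols) := as.data.table(counts)]

# Check the constraint s1 + s2 + s3 + s4 == total for every row.
stopifnot(all(DT[, rowSums(.SD), .SDcols = cols] == DT$total))
```

Because rmultinom() always returns one count per category, each row gets exactly four values even when total is small or zero.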
