Deviation from means in data table in R

Question

I have a big data table called "dt", and I want to produce a data table of the same dimensions which gives the deviation from the row mean of each entry in dt.

This code works but it seems very slow to me. I hope there's a way to do it faster? Maybe I'm building my table wrong so I'm not taking advantage of the by-reference assignment. Or maybe this is as good as it gets?

(I'm an R novice so any other tips are appreciated!)

Here is my code:

library(data.table)

r <- 100 # of rows
c <- 100 # of columns

# build a data table with random cols 
# (maybe not the best way to build, but this isn't important)
dt <- data.table(rnorm(r))
for (i in c(1:(c-1))) {
  dt <- cbind(dt,rnorm(r))
}
colnames(dt) <- as.character(c(1:c))

devs <- copy(dt) 
means <- rowMeans(dt)

for (i in c(1:nrow(devs))) {
    devs[i, colnames(devs) := abs(dt[i,] - means[[i]])]
}

IceCreamToucan · Accepted Answer

If you subtract a vector from a data.frame (or data.table), the vector is subtracted element-wise from every column (assuming they're all numeric). A vector with one entry per row, such as rowMeans(dt), therefore removes each row's mean from every entry in that row. Numeric functions like abs also work on all-numeric data.frames. So, you can compute devs with:

devs <- abs(dt - rowMeans(dt))
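
To see why this gives row-wise deviations, here is a minimal sketch on a tiny table. The table `small` and its values are made up for illustration; they are not from the question.

```r
library(data.table)

# Hypothetical 3-row, 2-column table (illustration only)
small <- data.table(a = c(1, 2, 3), b = c(3, 6, 9))

# One mean per row: 2, 4, 6
m <- rowMeans(small)

# The length-3 vector lines up with the rows of each column,
# so every entry has its own row's mean subtracted
vec <- abs(small - m)

vec$a  # 1 2 3
vec$b  # 1 2 3
```

This works because the vector has exactly one entry per row. A vector of a different length would be recycled along the columns and would generally not give the intended result.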

You don't need a loop to create dt either. You can use replicate, which evaluates its second argument the number of times given by its first argument and arranges the results in a matrix, one column per evaluation (unless simplify = FALSE is given as an argument):

# c columns, each an independent draw of r values
dt <- as.data.table(replicate(c, rnorm(r)))
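
A small sketch of the shape replicate produces, using different row and column counts so the orientation is visible. The names `n_rows`, `n_cols`, and `dt_small` are placeholders chosen for this example.

```r
library(data.table)

n_rows <- 5  # hypothetical sizes, chosen to differ from each other
n_cols <- 3

# replicate() evaluates rnorm(n_rows) n_cols times;
# each evaluation becomes one column of the resulting matrix
m <- replicate(n_cols, rnorm(n_rows))
dt_small <- as.data.table(m)

dim(dt_small)    # 5 rows, 3 columns
names(dt_small)  # default names: "V1" "V2" "V3"

# Optional: rename by reference to match the question's "1", "2", ... names
setnames(dt_small, as.character(seq_len(n_cols)))
```

Note that the first argument of replicate controls the number of columns, and the length passed to rnorm controls the number of rows.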
