Reputation: 463
I have a big data table called "dt", and I want to produce a data table of the same dimensions which gives the deviation from the row mean of each entry in dt.
This code works but it seems very slow to me. I hope there's a way to do it faster? Maybe I'm building my table wrong so I'm not taking advantage of the by-reference assignment. Or maybe this is as good as it gets?
(I'm an R novice so any other tips are appreciated!)
Here is my code:
library(data.table)
r <- 100 # of rows
c <- 100 # of columns
# build a data table with random cols
# (maybe not the best way to build, but this isn't important)
dt <- data.table(rnorm(r))
for (i in c(1:(c-1))) {
  dt <- cbind(dt,rnorm(r))
}
colnames(dt) <- as.character(c(1:c))
devs <- copy(dt)
means <- rowMeans(dt)
for (i in c(1:nrow(devs))) {
  devs[i, colnames(devs) := abs(dt[i,] - means[[i]])]
}
Upvotes: 2
Views: 82
Reputation: 333
Not sure if this is what you are looking for, but the sweep function lets you apply an operation that combines a matrix with a vector (like your row means).
table <- matrix(rnorm(r*c), nrow=r, ncol=c) # generate random matrix
means <- apply(table, 1, mean) # compute row means
devs <- abs(sweep(table, 1, means, "-")) # compute by row the deviation from the row mean
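To make it easier to see what sweep does here, this is a minimal sketch on a tiny hand-checkable matrix. The names m, row_means and devs_small are only for illustration and are not from the code above; rowMeans is used in place of apply(..., 1, mean) and gives the same values.

```r
# Small deterministic matrix so the result can be checked by hand
m <- matrix(c(1, 2, 3,
              4, 6, 8), nrow = 2, byrow = TRUE)

# Row means: c(2, 6); same values as apply(m, 1, mean)
row_means <- rowMeans(m)

# MARGIN = 1 means "by row": row_means[i] is subtracted from every entry of row i
devs_small <- abs(sweep(m, 1, row_means, "-"))
print(devs_small)
```

Row 1 becomes 1, 0, 1 and row 2 becomes 2, 0, 2, which are the absolute deviations from each row's mean.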
Upvotes: 0
Reputation: 28695
If you subtract a vector from a data.frame (or data.table), that vector will be subtracted from every column of the data.frame (assuming they're all numeric). Numeric functions like abs also work on all-numeric data.frames. So, you can compute devs with
devs <- abs(dt - rowMeans(dt))
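As a rough illustration of why this works, here is a small sketch using a plain data.frame (the names df and devs_df and the column names are made up for the example). It relies on the vector having one element per row, so that element i lines up with row i in every column.

```r
# Tiny all-numeric data.frame; the values are chosen so the means are easy to see
df <- data.frame(x = c(1, 4), y = c(2, 6), z = c(3, 8))

# rowMeans(df) is c(2, 6): one value per row.
# Subtracting it applies element i to row i of every column.
devs_df <- abs(df - rowMeans(df))
print(devs_df)
```

The result has the same dimensions as df, with rows 1, 0, 1 and 2, 0, 2. If the vector's length did not match the number of rows, the alignment would not correspond to row means, so this shortcut is best kept to vectors produced by something like rowMeans on the same table.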
You don't need a loop to create dt either; you can use replicate, which evaluates its second argument the number of times specified by its first argument and arranges the results in a matrix (unless simplify = FALSE is given as an argument):
dt <- as.data.table(replicate(c, rnorm(r))) # c columns, each holding r random values
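To check which argument controls which dimension, a quick sketch with different row and column counts may help. The names n_rows, n_cols, m and l are only for illustration; wrapping m in as.data.table() as above would give a data.table of the same shape.

```r
n_rows <- 5
n_cols <- 3

# Each call to rnorm(n_rows) produces one column; replicate binds n_cols of them
m <- replicate(n_cols, rnorm(n_rows))
print(dim(m))    # rows first, then columns

# With simplify = FALSE the columns are kept as a list instead of a matrix
l <- replicate(n_cols, rnorm(n_rows), simplify = FALSE)
print(length(l))
```

So the first argument to replicate sets the number of columns, and the length of each generated vector sets the number of rows.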
Upvotes: 2