Raffael

Reputation: 20045

How to subtract a vector holding the column means of a data frame (df) from each row of df?

Sorry for the confusing title ... here is what I want to do with a possible solution:

> df <- data.frame(a=c(1,2,3),b=c(4,5,6))
> v <- colMeans(df)

> df
  a b
1 1 4
2 2 5
3 3 6

> v
a b 
2 5 

> t(t(df)-v)
      a  b
[1,] -1 -1
[2,]  0  0
[3,]  1  1

But the data frame will have named columns and rows and be quite large, which is why I am not comfortable with this solution. I would like to know whether there is a programmatic approach that does not resort to loops and does not need the clumsy double transposition (ideally one that fits neatly into a single line).

Upvotes: 2

Views: 4488

Answers (3)

Mariano

Reputation: 21

Building on the answer from Hong Ooi, you can get a data.frame back directly by wrapping the call to scale():

df <- data.frame(scale(df, center=TRUE, scale=FALSE))
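For illustration, here is a small sketch of what happens in between. The row names r1..r3 are made up for the example, and as.data.frame() is used in place of data.frame() for the conversion step; scale() itself returns a matrix and stores the subtracted means in the "scaled:center" attribute.

```r
# Toy data with hypothetical row names, to check that they survive
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6),
                 row.names = c("r1", "r2", "r3"))

# scale() returns a numeric matrix, not a data frame
m <- scale(df, center = TRUE, scale = FALSE)

# The column means that were subtracted are kept as an attribute
centers <- attr(m, "scaled:center")

# Convert back to a data frame; row and column names carry over
centered <- as.data.frame(m)
print(centered)
```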

Upvotes: 0

HenriV

Reputation: 578

Another option:

sweep(df, 2, v)
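A slightly more explicit sketch of the same call, using the df and v from the question. The argument names are spelled out here only for readability; MARGIN = 2 means "per column" and FUN = "-" is already the default.

```r
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
v <- colMeans(df)

# Subtract v[j] from every entry of column j
centered <- sweep(df, MARGIN = 2, STATS = v, FUN = "-")

# The result is still a data frame with the original column names
print(centered)
print(colMeans(centered))
```

Since a data frame comes back, no transposition or conversion from a matrix is needed afterwards.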

Upvotes: 2

Hong Ooi

Reputation: 57686

You want to mean-correct all columns in your data frame?

df <- scale(df, center=TRUE, scale=FALSE)

If there are columns that aren't numeric (factors or character vectors), then you'll have to test for them and scale only the numeric ones:

numeric <- sapply(df, is.numeric)
df[numeric] <- scale(df[numeric], center=TRUE, scale=FALSE)

Note that scale() converts your df into a matrix as part of the scaling. If you don't want the conversion to happen, you could also do:

df[] <- lapply(df, function(x) x - mean(x))
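The two ideas above can also be combined. The following is only a sketch on made-up data (the column g and the row names r1..r3 are hypothetical): it centers the numeric columns with lapply() and leaves everything else alone, so df stays a data frame throughout.

```r
# Toy data: two numeric columns plus one non-numeric column
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), g = c("x", "y", "z"),
                 row.names = c("r1", "r2", "r3"))

# Logical vector marking the numeric columns
is_num <- vapply(df, is.numeric, logical(1))

# Center only those columns; the others are left untouched
df[is_num] <- lapply(df[is_num], function(x) x - mean(x))

print(df)
```

If the data can contain missing values, mean(x, na.rm = TRUE) may be the better choice, since mean(x) returns NA as soon as a column contains an NA.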

Upvotes: 8
