user3576287

Reputation: 1032

How to do median centering for my dataset in R?

I would like to know how to do median centering in R for my dataset. I have tried two approaches, but I am not sure whether either of them is correct.

Here is my code:

A simple dataset:

set.seed(100)
df = (matrix(rnorm(20), 5, 4))
df   
         [,1]       [,2]        [,3]        [,4]
[1,] -0.50219235  0.3186301  0.08988614 -0.02931671
[2,]  0.13153117 -0.5817907  0.09627446 -0.38885425
[3,] -0.07891709  0.7145327 -0.20163395  0.51085626
[4,]  0.88678481 -0.8252594  0.73984050 -0.91381419
[5,]  0.11697127 -0.3598621  0.12337950  2.31029682

Using the scale function (which I've read about in some forums):

scale(df,center = T)

            [,1]       [,2]       [,3]       [,4]
[1,] -1.21733894  0.7238418 -0.2307246 -0.2640103
[2,]  0.04109693 -0.6766529 -0.2122225 -0.5541570
[3,] -0.37680714  1.3396200 -1.0750402  0.1719093
[4,]  1.54086497 -1.0553388  1.6517068 -0.9777997
[5,]  0.01218417 -0.3314701 -0.1337194  1.6240577

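By subtracting the row median from each entry in the data:

# rowMedians() is not in base R; assuming here that it comes from the matrixStats package
library(matrixStats)
df - rowMedians(df)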

            [,1]       [,2]         [,3]        [,4]
[1,] -0.532477068  0.2883454  0.059601427 -0.05960143
[2,]  0.277821059 -0.4355008  0.242564354 -0.24256435
[3,] -0.294886674  0.4985631 -0.417603536  0.29488667
[4,]  0.929494272 -0.7825500  0.782549963 -0.87110472
[5,] -0.003204115 -0.4800375  0.003204115  2.19012144

But these two results are not the same, which confuses me: I don't know whether I used the right function or not.

I would appreciate any help with this problem, or any further suggestions.

Best,

Upvotes: 1

Views: 10066

Answers (3)

user2110417

Reputation:

You can create a function and try as follows:

median_center <- function(x) {
    apply(x, 2, function(y) y - median(y))
}

# apply it
median_center(df)
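
As a quick sanity check, the sketch below rebuilds the question's matrix and confirms that every column median is zero after centering. Note that this function centers column-wise, whereas the question's second attempt centers row-wise, so the two results are expected to differ. The variable names centered and row_centered are only illustrative.

```r
# Column-wise median centering, as defined in this answer
median_center <- function(x) {
    apply(x, 2, function(y) y - median(y))
}

# Same data as in the question
set.seed(100)
df <- matrix(rnorm(20), 5, 4)

centered <- median_center(df)

# Each column of the centered matrix should now have median 0
apply(centered, 2, median)

# Row-wise centering (the question's second attempt) is a different operation
row_centered <- df - apply(df, 1, median)
isTRUE(all.equal(centered, row_centered))
```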

Upvotes: 0

Jon

Reputation: 70

This can be achieved using the center argument of the base scale function. For a matrix, center must supply one value per column, e.g. scale(x, center = apply(x, 2, median), scale = FALSE) (or scale = TRUE if you also want to scale your data).
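
A minimal sketch of this approach on the question's matrix follows. It assumes column-wise centering is what is wanted, and passes one median per column because scale requires center to have length ncol(x) when x is a matrix. It also shows why the question's scale(df, center = T) output looks different: with the defaults, scale subtracts column means and divides by column standard deviations. The names col_med, default_scaled and manual are only illustrative.

```r
set.seed(100)
df <- matrix(rnorm(20), 5, 4)

# One median per column, used as the centering vector
col_med <- apply(df, 2, median)
centered <- scale(df, center = col_med, scale = FALSE)

# Column medians of the result should all be 0
apply(centered, 2, median)

# What scale() does by default: subtract column means, divide by column sd
default_scaled <- scale(df)
manual <- sweep(sweep(df, 2, colMeans(df)), 2, apply(df, 2, sd), "/")
all.equal(default_scaled, manual, check.attributes = FALSE)
```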

Upvotes: 0

yrjo

Reputation: 103

Row medians can be obtained by:

rowmed <- apply(df,1,median)

Then you can simply subtract the row medians from the rows:

df - rowmed
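
The plain subtraction works because df is stored column by column and rowmed has one value per row, so R's recycling lines each median up with its own row. A slightly more explicit alternative, sketched here with base R's sweep, states the margin directly; the names by_recycling and by_sweep are only illustrative.

```r
set.seed(100)
df <- matrix(rnorm(20), 5, 4)

rowmed <- apply(df, 1, median)

# Subtraction relying on recycling down the columns
by_recycling <- df - rowmed

# The same operation with the margin (1 = rows) stated explicitly
by_sweep <- sweep(df, 1, rowmed, FUN = "-")

# Both approaches should agree
all.equal(by_recycling, by_sweep)

# Row medians of the result should be 0, up to floating-point error
apply(by_sweep, 1, median)
```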

Upvotes: 1
