user1828605

Reputation: 1735

How to sort a column of a data frame based on the index of the original data in R?

This may be very simple to achieve, but it has me stumped.

I have a data frame:

   chip1 chip2
P1  1.57  2.13
P2  2.04  1.92
P3  1.90  2.11
P4  1.48  2.24

The next step for quantile normalization is to sort each column independently and then compute the row-wise mean, like this:

   chip1 chip2     M
P1  1.48  1.92 1.700
P2  1.57  2.11 1.840
P3  1.90  2.13 2.015
P4  2.04  2.24 2.140

Then the final normalized data is:

   chip1 chip2
P1 1.840 2.015
P2 2.140 1.700
P3 2.015 1.840
P4 1.700 2.140

The normalized data is generated from the M column of the previous data frame, reordered to match the original order of chip1 and chip2 in the first data frame: each value is replaced by the M entry at that value's rank within its column. How can I reorder the M column using the index from the original columns? I'm a little lost.

Thank you.

Upvotes: 2

Views: 88

Answers (2)

Randy Lai

Reputation: 3174

Based on your description, I hope this is what you mean.

> X = cbind(rnorm(5, 1), rnorm(5,0))
> X
          [,1]         [,2]
[1,] 2.2629543 -1.539950042
[2,] 0.6737666 -0.928567035
[3,] 2.3297993 -0.294720447
[4,] 2.2724293 -0.005767173
[5,] 1.4146414  2.404653389
> Y = apply(X,2,sort)
> cbind(Y, rowSums(Y))
          [,1]         [,2]       [,3]
[1,] 0.6737666 -1.539950042 -0.8661834
[2,] 1.4146414 -0.928567035  0.4860744
[3,] 2.2629543 -0.294720447  1.9682338
[4,] 2.2724293 -0.005767173  2.2666621
[5,] 2.3297993  2.404653389  4.7344527
> M = rowMeans(Y)   # row-wise means of the sorted columns
> apply(X, 2, function(x) M[rank(x)])   # put each mean at the original position of the value with that rank

Upvotes: 1

BrodieG

Reputation: 52637

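# Sort each column independently, then average across each row (df is the data frame from the question)
nrm <- rowMeans(sapply(df, sort))
# Replace every value with the row mean at that value's rank within its column
sapply(df, function(x) nrm[rank(x)])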

produces:

     chip1 chip2
[1,] 1.840 2.015
[2,] 2.140 1.700
[3,] 2.015 1.840
[4,] 1.700 2.140
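
For a version that can be pasted and run as-is, here is a minimal sketch of the same rank-based approach using the data from the question. The helper name `quantile_normalize` is made up for illustration and does not come from any package. It also assumes that ties should be broken by position (`ties.method = "first"`), so that ranks stay whole numbers and remain valid indices. That is a simplification, and other quantile normalization implementations may treat ties differently.

```r
# Data from the question, rebuilt as a data frame with row names P1..P4.
df <- data.frame(
  chip1 = c(1.57, 2.04, 1.90, 1.48),
  chip2 = c(2.13, 1.92, 2.11, 2.24),
  row.names = c("P1", "P2", "P3", "P4")
)

# quantile_normalize is a hypothetical helper name, used only for this sketch.
quantile_normalize <- function(d) {
  # Sort each column independently, then average across each row.
  m <- rowMeans(sapply(d, sort))
  # Replace each value with the mean at its rank within its column.
  # ties.method = "first" keeps ranks integer (assumption: ties are
  # broken by position), so they can be used directly as indices.
  out <- sapply(d, function(x) m[rank(x, ties.method = "first")])
  rownames(out) <- rownames(d)  # restore the P1..P4 labels
  out
}

normalized <- quantile_normalize(df)
print(normalized)
```

The result is a numeric matrix rather than a data frame; wrapping the call in `as.data.frame()` converts it back if needed.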

Upvotes: 2
