Reputation: 109
I would like to rearrange dataframe1
by the order of the dataframe2
column.
Dataframe1:
N02_M N05_F N06_M N07_F N08_F N09_M N02_M N026_F N03_M
2237895 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225
586 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364
2255280 0.995 0.995 0.995 0.995 0.995 0.995 0.995 0.995 0.995
7294280 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478
7499 0.803 0.803 0.803 0.803 0.803 0.803 0.803 0.803 0.803
35209 0.94 0.94 0.94 0.94 0.94 0.94 0.94 0.94 0.94
Here is data frame 2:
Dataframe2:
586
2237895
7499
35209
2255280
7294280
Result:
N02_M N05_F N06_M N07_F N08_F N09_M N02_M N026_F N03_M
586 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364
2237895 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225
7499 0.803 0.803 0.803 0.803 0.803 0.803 0.803 0.803 0.803
35209 0.94 0.94 0.94 0.94 0.94 0.94 0.94 0.94 0.94
2255280 0.995 0.995 0.995 0.995 0.995 0.995 0.995 0.995 0.995
7294280 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478
I tried the command awk
'FNR==NR {x2[$1] = $0; next} $1 in x2 {print x2[$1]}' df2 df1
but it doesn't work, it doesn't change the order of the df2
Upvotes: 0
Views: 166
Reputation: 203645
$ cat tst.awk
NR==FNR {
if (NR==1) {
print
}
else {
map[$1] = $0
}
next
}
{ print map[$1] }
$ awk -f tst.awk dataframe1 dataframe2
N02_M N05_F N06_M N07_F N08_F N09_M N02_M N026_F N03_M
586 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364
2237895 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225 0.6225
7499 0.803 0.803 0.803 0.803 0.803 0.803 0.803 0.803 0.803
35209 0.94 0.94 0.94 0.94 0.94 0.94 0.94 0.94 0.94
2255280 0.995 0.995 0.995 0.995 0.995 0.995 0.995 0.995 0.995
7294280 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478 0.8478
Upvotes: 1
Reputation: 6727
Let us prepare the data.
Dataframe1 <- cbind(a=rnorm(6), b=rnorm(6))
rownames(Dataframe1) <- round(rnorm(6)*1000000)
Dataframe2 <- cbind(sample(as.integer(rownames(Dataframe1))))
Dataframe1:
a b
255728 -0.2063562 -0.804514251
215067 0.2756317 -0.865951064
2079729 1.1376277 0.001123908
-898713 0.1846505 -0.908527352
-717953 1.7424731 0.462088593
-1028794 0.4162211 1.548455286
Dataframe2:
[,1]
[1,] -1028794
[2,] -898713
[3,] -717953
[4,] 215067
[5,] 2079729
[6,] 25572
And now... Hocus Pocus!
Dataframe1[order(rownames(Dataframe1))[rank(Dataframe2)],]
a b
-1028794 0.4162211 1.548455286
2079729 1.1376277 0.001123908
215067 0.2756317 -0.865951064
255728 -0.2063562 -0.804514251
-898713 0.1846505 -0.908527352
-717953 1.7424731 0.462088593
The idea for such sorting is to use order(x)[rank(y)]
.
Upvotes: 0
Reputation: 7153
Here's a solution with dplyr
library(dplyr)
df1$id <- rownames(df1) # create common column "id" based on rownames
names(df2) <- "id" #change column name of df2 to common name "id"
df2$id <- as.character(df2$id) #ensure that the structure for df1$id and df2$id are in similar structure
left_join(df2, df1)
id N02_M N05_F N06_M N07_F N08_F N09_M N02_M.1 N026_F N03_M
1 34387 NA NA NA NA NA NA NA NA NA
2 202783 NA NA NA NA NA NA NA NA NA
3 887320 NA NA NA NA NA NA NA NA NA
4 35444 NA NA NA NA NA NA NA NA NA
5 88035 NA NA NA NA NA NA NA NA NA
6 239862 NA NA NA NA NA NA NA NA NA
7 9266359 NA NA NA NA NA NA NA NA NA
8 586 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364 0.8364
9 375535 NA NA NA NA NA NA NA NA NA
10 3794260 NA NA NA NA NA NA NA NA NA
11 6687758 NA NA NA NA NA NA NA NA NA
12 6983267 NA NA NA NA NA NA NA NA NA
13 704346 NA NA NA NA NA NA NA NA NA
14 35209 0.9400 0.9400 0.9400 0.9400 0.9400 0.9400 0.9400 0.9400 0.9400
15 795668 NA NA NA NA NA NA NA NA NA
By the way, df1 == Dataframe1 and df2 == Dataframe2
Upvotes: 0