Reputation: 177
There are several posts about returning the column name of the largest value in each row of a data frame (for example: For each row return the column name of the largest value).
However, my problem is a bit more complicated: what code should I use in R to return the column names of the two (or three, or even ten) largest values in each row? To make this clearer, here is some example code:
DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))
This produces:
  V1 V2 V3
1  2  7  9
2  8  3  6
3  1  5  4
I want to get the column names of the two largest values in each row, so in this case the result should look like:
1 V3 V2
2 V1 V3
3 V2 V3
Thanks very much for your help in advance! :)
Upvotes: 1
Views: 104
Reputation: 887223
Using pmap
library(purrr)
pmap(DF, ~ {tmp <- c(...); head(names(tmp)[order(-tmp)], 2)})
-output
[[1]]
[1] "V3" "V2"
[[2]]
[1] "V1" "V3"
[[3]]
[1] "V2" "V3"
Or with dapply from collapse:
library(collapse)
slt(dapply(DF, MARGIN = 1, FUN = function(x) colnames(DF)[order(-x)]), 1:2)
V1 V2
1 V3 V2
2 V1 V3
3 V2 V3
Upvotes: 1
Reputation: 4140
DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))
DF
#> V1 V2 V3
#> 1 2 7 9
#> 2 8 3 6
#> 3 1 5 4
# order() gives the column positions sorted by value, so a tie still yields exactly one index per row
largest <- colnames(DF)[apply(DF, 1, FUN = function(x) order(x, decreasing = TRUE)[1])]
secondlargest <- colnames(DF)[apply(DF, 1, FUN = function(x) order(x, decreasing = TRUE)[2])]
cbind(largest, secondlargest)
#> largest secondlargest
#> [1,] "V3" "V2"
#> [2,] "V1" "V3"
#> [3,] "V2" "V3"
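Since the question also asks about the largest three or ten, the same idea could be wrapped in one function that takes the number of columns as an argument. This is only a base R sketch: top_n_names is a hypothetical helper name, not something from the posts above, and it assumes all columns are numeric and that ties should be broken by column position.

```r
# Hypothetical helper (not from the original answers): for each row, return the
# names of the n columns with the largest values, in decreasing order.
top_n_names <- function(df, n = 2) {
  stopifnot(n >= 1, n <= ncol(df))
  # apply() passes each row as a named numeric vector; order() ranks its values
  res <- apply(df, 1, function(x) names(x)[order(x, decreasing = TRUE)[seq_len(n)]])
  # For n > 1, apply() returns an n x nrow matrix, so transpose to one row per input row
  if (n == 1) matrix(res, ncol = 1) else t(res)
}

DF <- data.frame(V1 = c(2, 8, 1), V2 = c(7, 3, 5), V3 = c(9, 6, 4))
top2 <- top_n_names(DF, 2)
top2
```

Calling top_n_names(DF, 3) would return all three column names per row, sorted from largest to smallest value.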
Upvotes: 0