Orion

Reputation: 1104

Getting index of column of minimum value in a data.frame row

DF = structure(list(a = c(1L, 2L, 5L), b = c(2L, 3L, 3L), c = c(3L, 1L, 2L)), .Names = c("a", "b", "c"), row.names = c(NA, -3L), class = "data.frame")

a b c 
1 2 3 
2 3 1 
5 3 2

How do I add three columns giving the name (or index) of the column that holds each row's minimum, middle, and maximum value, as follows?

a b c min middle max
1 2 3   a      b   c
2 3 1   c      a   b
5 3 2   c      b   a

Upvotes: 1

Views: 473

Answers (2)

akrun

Reputation: 886938

Since the OP mentioned data.table, here is one way to do it with that package. Convert the data.frame to a data.table with setDT(DF) and group by the row sequence (1:nrow(DF)). Within each group (that is, each row), unlist .SD, take the order() of the values, and use that ordering to index the column names. The result is converted to a list and assigned with := to the three new columns.

 library(data.table)
 setDT(DF)[, c('min', 'middle', 'max') :=
    as.list(names(DF)[order(unlist(.SD))]) ,1:nrow(DF)][]
 #   a b c min middle max
 #1: 1 2 3   a      b   c
 #2: 2 3 1   c      a   b
 #3: 5 3 2   c      b   a
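To see why indexing the column names by order() yields the min/middle/max names, it may help to trace a single row in base R. This is only an illustrative sketch; the variable names row2 and ord are made up here and do not appear in the answer above.

```r
# Same data as in the question, rebuilt with data.frame() for brevity
DF <- data.frame(a = c(1L, 2L, 5L), b = c(2L, 3L, 3L), c = c(3L, 1L, 2L))

# Second row as a named vector: a = 2, b = 3, c = 1
row2 <- unlist(DF[2, ])

# order() returns the positions of the values from smallest to largest
ord <- order(row2)

# Indexing the column names by those positions gives min, middle, max
names(DF)[ord]
```

For row 2 the smallest value sits in column c, the next in a, and the largest in b, which matches the second row of the expected output.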

Upvotes: 3

josliber

Reputation: 44299

One approach would be to loop through the rows with apply, returning the column names in the indicated order:

cbind(DF, t(apply(DF, 1, function(x) setNames(names(DF)[order(x)],
                                              c("min", "middle", "max")))))
#   a b c min middle max
# 1 1 2 3   a      b   c
# 2 2 3 1   c      a   b
# 3 5 3 2   c      b   a

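This solution assumes you have exactly three columns (so the middle is the second largest). If that is not the case, you could generalize to any odd number of columns with the following modification. With an even number of columns there is no single middle value: (length(x)+1)/2 is then fractional, R truncates it when indexing, and the lower of the two middle columns is returned.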

cbind(DF, t(apply(DF, 1, function(x) {
  ord <- order(x)
  setNames(names(DF)[c(ord[1], ord[(length(x)+1)/2], tail(ord, 1))],
           c("min", "middle", "max"))
})))
#   a b c min middle max
# 1 1 2 3   a      b   c
# 2 2 3 1   c      a   b
# 3 5 3 2   c      b   a
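As a rough sketch of how the generalized version behaves with an even number of columns, the example below uses a hypothetical four-column data frame (DF4, with values made up for illustration) and makes the choice of the lower-middle position explicit with ceiling(). Picking the lower middle is an assumption; the question does not say what "middle" should mean in that case.

```r
# Hypothetical four-column input, invented for illustration
DF4 <- data.frame(a = c(1L, 8L), b = c(2L, 3L), c = c(3L, 1L), d = c(4L, 5L))

res <- t(apply(DF4, 1, function(x) {
  ord <- order(x)                  # column positions, smallest value first
  mid <- ceiling(length(x) / 2)    # lower-middle position when length is even
  setNames(names(DF4)[c(ord[1], ord[mid], ord[length(ord)])],
           c("min", "middle", "max"))
}))

cbind(DF4, res)
```

For odd column counts ceiling(length(x) / 2) gives the same position as (length(x)+1)/2, so the three-column result is unchanged. Ties within a row are resolved by order(), which keeps the first of the tied columns.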

Upvotes: 4
