cyrusjan

Reputation: 647

How to find the smallest number of several variables and return the column name

I have some data like this

       A    B    C    D
[1,]  23   23   12   34
[2,]  12   13   11   9 
[3,]  70   80   67   76
[4,]  43   23   25   40

I would like to generate two new variables:

  1. the smallest number of each row

  2. the column name of the smallest value

i.e. the result should be like:

      A     B    C     D   min   minCol
1    23    23   12    34    12        C
2    12    13   11     9     9        D
3    70    80   67    76    67        C
4    43    23   25    40    23        B

I tried the script below and did get the "min" variable. However, when I tried to generate "minCol", some of the values came out as "min" instead of A, B, C or D.

data <- transform(data, min=apply(data[,c(1:4)],1, min,na.rm = TRUE))

After running this, I got the data frame below, which is fine.

      A     B    C     D   min   
1    23    23   12    34    12       
2    12    13   11     9     9        
3    70    80   67    76    67        
4    43    23   25    40    23        

Then I ran the script below

data <- data.frame(data, minCol= apply(data, 1, function(row)
        {colnames(data)[[which.min(row)]]}))

and got something like this

      A     B    C     D   min   minCol
1    23    23   12    34    12        C
2    12    13   11     9     9        D
3    70    80   67    76    67        C
4    43    23   25    40    23        min

Can anyone help?

Upvotes: 1

Views: 2082

Answers (1)

David Arenburg

Reputation: 92300

One simple approach would be the following (assuming your data is called df)

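# For each row, return the minimum and the name of the column holding it.
# Note: c() combines a number with a name, so both new columns are stored as character.
df[c("min", "minCol")] <- t(apply(df, 1, function(x) c(min(x), names(x[which.min(x)]))))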
df
#    A  B  C  D min minCol
# 1 23 23 12 34  12      C
# 2 12 13 11  9   9      D
# 3 70 80 67 76  67      C
# 4 43 23 25 40  23      B
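
If min needs to stay numeric, one possible variant is to compute the two columns separately and to look only at the original four columns, so that a helper column such as min can never be picked as the column name. This is a minimal sketch rather than part of the original answer: cols is just an illustrative helper name, and it assumes no row consists entirely of NA.

```r
df <- data.frame(A = c(23L, 12L, 70L, 43L), B = c(23L, 13L, 80L, 23L),
                 C = c(12L, 11L, 67L, 25L), D = c(34L, 9L, 76L, 40L))

cols <- c("A", "B", "C", "D")   # hypothetical helper: the columns to compare

# Row-wise minimum over the selected columns only (stays numeric)
df$min <- apply(df[cols], 1, min, na.rm = TRUE)

# which.min() gives the position of the first minimum in each row;
# indexing into cols turns that position into a column name
df$minCol <- cols[apply(df[cols], 1, which.min)]
```

Because both apply() calls are restricted to df[cols], adding min first does not affect the minCol step.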

Another approach (offered by @akrun) would be a combination of pmin, do.call and max.col:

val <- do.call(pmin, c(df, na.rm = TRUE)) 
transform(df, min=val, minCol = names(df)[max.col(df == val, 'first')])
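
To see how the pieces fit together, the same idea can be split into steps. This is only a sketch for illustration; is_min and res are made-up names, and the data frame is rebuilt here so the snippet runs on its own.

```r
df <- data.frame(A = c(23L, 12L, 70L, 43L), B = c(23L, 13L, 80L, 23L),
                 C = c(12L, 11L, 67L, 25L), D = c(34L, 9L, 76L, 40L))

# do.call() passes every column of df to pmin() as a separate argument,
# so pmin() returns the element-wise (i.e. row-wise) minimum
val <- do.call(pmin, c(df, na.rm = TRUE))

# Logical matrix: TRUE wherever a cell equals its row's minimum
is_min <- df == val

# max.col() finds the position of the largest value in each row (here a TRUE);
# "first" resolves ties in favour of the leftmost column
res <- transform(df, min = val, minCol = names(df)[max.col(is_min, "first")])
```

Since val is computed before any column is added, the comparison only ever involves A to D, and min keeps its numeric type.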

Data

df <- structure(list(A = c(23L, 12L, 70L, 43L), B = c(23L, 13L, 80L, 
      23L), C = c(12L, 11L, 67L, 25L), D = c(34L, 9L, 76L, 40L)), .Names = c("A", 
      "B", "C", "D"), class = "data.frame", row.names = c("1", "2", 
      "3", "4"))

Upvotes: 1
