New York Crosser

Reputation: 65

Select the row-wise minimum and return its column name using R

df <- data.frame(PATIENT_ID=c(1,2,3,4,5,6,7),
             A=c(2,4,6,7,8,9,2),
             B=c(3,2,3,6,6,4,3),
             C=c(1,2,3,4,6,3,2))

I want to create a variable named 'type' whose value is the name of the column (A, B or C) that holds the minimum value in that row. In case of ties: if A=B=C, fill A; if B=C, the expected output below gives B (rows 2, 3 and 5).

So the output should be:

df <- data.frame(PATIENT_ID=c(1,2,3,4,5,6,7),
             A=c(2,4,6,7,8,9,2),
             B=c(3,2,3,6,6,4,3),
             C=c(1,2,3,4,6,3,2),
             type=c("C","B","B","C","B","C","A"))

Upvotes: 0

Views: 66

Answers (1)

Ronak Shah

Reputation: 388807

We can use max.col, which returns the column number of the maximum value in each row. Since we want the minimum here, we negate the values first. In case of a tie, the first minimum is returned; this is set via ties.method = 'first' (the default, 'random', picks one of the tied columns at random).

names(df)[-1][max.col(-df[-1], ties.method = 'first')]
#[1] "C" "B" "B" "C" "B" "C" "A"

The -1 index drops the first column, i.e. PATIENT_ID.
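
To see why the negation and ties.method matter, here is a small sketch on the question's data. The names m and idx are only illustrative and are not part of the original answer.

```r
df <- data.frame(PATIENT_ID = c(1, 2, 3, 4, 5, 6, 7),
                 A = c(2, 4, 6, 7, 8, 9, 2),
                 B = c(3, 2, 3, 6, 6, 4, 3),
                 C = c(1, 2, 3, 4, 6, 3, 2))

# Negating flips the ordering, so the row-wise maximum of -x
# sits in the same column as the row-wise minimum of x.
m <- -as.matrix(df[-1])

# Column index of the smallest value per row; ties go to the leftmost column.
idx <- max.col(m, ties.method = "first")
idx
#[1] 3 2 2 3 2 3 1

# Rows 2, 3 and 5 tie on B and C, row 7 ties on A and C;
# 'first' resolves them to B and A respectively.
names(df)[-1][idx]
#[1] "C" "B" "B" "C" "B" "C" "A"
```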

We can also use apply with which.min, which returns the index of the first minimum in each row:

names(df[-1])[apply(df[-1], 1, which.min)]

To use only specific columns, select them by name:

cols <- c('A', 'B', 'C')
cols[max.col(-df[cols], ties.method = 'first')]

Or

cols[apply(df[cols], 1, which.min)]
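
To get the 'type' column the question asks for, the result can be assigned back to the data frame. A minimal sketch, assuming the columns of interest are exactly A, B and C as in the question:

```r
df <- data.frame(PATIENT_ID = c(1, 2, 3, 4, 5, 6, 7),
                 A = c(2, 4, 6, 7, 8, 9, 2),
                 B = c(3, 2, 3, 6, 6, 4, 3),
                 C = c(1, 2, 3, 4, 6, 3, 2))

cols <- c("A", "B", "C")

# Name of the column holding the row-wise minimum; ties go to the leftmost column.
df$type <- cols[max.col(-df[cols], ties.method = "first")]

# The apply/which.min variant, computed on the same columns for comparison.
type_apply <- cols[apply(df[cols], 1, which.min)]

df$type
#[1] "C" "B" "B" "C" "B" "C" "A"
```

On this data both variants agree. max.col works on the whole matrix at once, while apply calls which.min once per row, so max.col is usually the faster choice on large data frames.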

Upvotes: 4
