Reputation: 43
I've spent a good deal of time looking into this subject, but have not been able to find much. I would like a new column of data titled "Max Region", that gives the name of the column for which the maximum value occurs in each row.
df <- data.frame(Head=c(9, 6, 2, NA), Thorax=c(9, 2, NA, NA), Abdomen=c(NA, NA, 5, 5), Neck=c(4, 3, 5, 2))
#   Head Thorax Abdomen Neck
#      9      9      NA    4
#      6      2      NA    3
#      2     NA       5    5
#     NA     NA       5    2
So far, I've used:
df$MaxRegion <- names(df)[apply(df, 1, which.max)]
However, in the case of a tie, I would like either both column names returned (i.e. HeadThorax or AbdomenNeck) or just NA. Is this possible with which.max? I've also looked into max.col, but it doesn't seem to offer this either. Thank you so much!
Upvotes: 1
Views: 75
Reputation: 101099
Another base R option
df$MaxRegion <- mapply(
  subset,
  list(names(df)),
  asplit(df == do.call(pmax, c(df, na.rm = TRUE)), 1)
)
gives
> df
  Head Thorax Abdomen Neck     MaxRegion
1    9      9      NA    4  Head, Thorax
2    6      2      NA    3          Head
3    2     NA       5    5 Abdomen, Neck
4   NA     NA       5    2       Abdomen
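Because the tied rows return more than one name, the mapply() call above appears to store MaxRegion as a list column rather than a character column. If a plain character column is preferred, one possible sketch is below. It assumes the same df as in the question; row_max, is_max, regions and max_region are illustrative names, not part of the original answer.

```r
df <- data.frame(Head = c(9, 6, 2, NA), Thorax = c(9, 2, NA, NA),
                 Abdomen = c(NA, NA, 5, 5), Neck = c(4, 3, 5, 2))

# Row-wise maximum, ignoring NA
row_max <- do.call(pmax, c(df, na.rm = TRUE))

# Logical matrix: TRUE where a cell equals its row maximum (NA cells stay NA)
is_max <- as.matrix(df) == row_max

# Collapse the matching column names into one string per row;
# which() drops the NA entries so they are not treated as matches
regions <- names(df)
max_region <- apply(is_max, 1, function(hit) toString(regions[which(hit)]))

df$MaxRegion <- max_region
```

Here MaxRegion is an ordinary character vector, which tends to be easier to filter or export than a list column.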
Upvotes: 1
Reputation: 886948
Using the OP's code, if we need the column names of all tied maximum elements, use %in% (which returns FALSE where there are NAs) or == (wrapped in which() to drop the NAs it produces) against the row max, and paste the corresponding names together:
apply(df, 1, function(x) toString(names(x)[x %in% max(x, na.rm = TRUE)]))
#[1] "Head, Thorax" "Head" "Abdomen, Neck" "Abdomen"
NOTE: which.max returns only the first index of the max value.
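For the other option mentioned in the question, returning NA whenever there is a tie, a minimal sketch along the same lines might look like this. It assumes the df from the question; max_region_or_na is a hypothetical helper name introduced only for illustration.

```r
df <- data.frame(Head = c(9, 6, 2, NA), Thorax = c(9, 2, NA, NA),
                 Abdomen = c(NA, NA, 5, 5), Neck = c(4, 3, 5, 2))

# Hypothetical helper: name of the unique row maximum, or NA on a tie
max_region_or_na <- function(x) {
  hits <- names(x)[x %in% max(x, na.rm = TRUE)]
  if (length(hits) == 1) hits else NA_character_
}

result <- apply(df, 1, max_region_or_na)
# Values for this df: NA, "Head", NA, "Abdomen"
```

Rows 1 and 3 contain ties, so they come back as NA; a row consisting entirely of NA would also yield NA, though max() emits a warning in that case.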
Upvotes: 0