Reputation: 1
I can only find information on finding the max value for each row, but I need the max value across multiple rows and columns, along with the name of the column that contains it. E.g., if my dataset looks like:
data <- data.frame(Year = c(2001, 2002, 2003),
X = c(3, 2, 45),
Y = c(6, 20, 23),
Z = c(10, 4, 4))
I want my code to return "X" because 45 is the maximum.
Upvotes: 0
Views: 519
Reputation: 10996
A base R solution:
Assuming that you want to exclude the Year variable from this analysis:
dat <- data.frame(Year = c(2000, 2001, 2002),
X = c(1, 2, 45),
Y = c(3, 4, 5))
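# drop the Year column; drop = FALSE keeps a data frame even if only one column is left
dat_ex_year <- dat[, !names(dat) %in% c("Year"), drop = FALSE]
# locate every cell equal to the overall maximum and look up its column name
names(dat_ex_year)[which(dat_ex_year == max(dat_ex_year), arr.ind = TRUE)[,2]]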
which gives:
[1] "X"
EDIT: I slightly adjusted the code so that it returns all column names when the maximum value occurs in several columns, e.g. with:
dat <- data.frame(Year = c(2000, 2001, 2002),
X = c(1, 2, 45),
Y = c(3, 45, 5))
the code gives:
[1] "X" "Y"
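If you need this more than once, the same idea can be wrapped in a small function. This is only a sketch: the name max_col_names and its exclude argument are made up for illustration, and it assumes all remaining columns are numeric. It also adds na.rm = TRUE and unique(), so missing values are ignored and a column that holds the maximum in two rows is reported once.

```r
# Hypothetical helper; the name and arguments are illustrative only
max_col_names <- function(df, exclude = "Year") {
  # keep only the columns to search; drop = FALSE keeps a data frame
  # even if a single column remains
  vals <- df[, !names(df) %in% exclude, drop = FALSE]
  # row/column positions of every cell equal to the overall maximum
  pos <- which(vals == max(vals, na.rm = TRUE), arr.ind = TRUE)
  # column names, de-duplicated in case one column holds the maximum twice
  unique(names(vals)[pos[, "col"]])
}

dat <- data.frame(Year = c(2000, 2001, 2002),
                  X = c(1, 2, 45),
                  Y = c(3, 45, 5))
max_col_names(dat)
```

Passing a character vector to exclude (e.g. exclude = c("Year", "ID")) should skip several columns at once.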
Upvotes: 1
Reputation: 1868
I suppose one way to approach this is to turn your wide dataset into a long (tidy) table, filter for the max value, and then extract the name of the column it came from.
library(tidyverse)
df <- read.table(text = "Year X Y Z
2001 3 6 10
2002 2 20 4
2003 45 23 4", header = TRUE)
df %>%
pivot_longer(cols = c("X", "Y", "Z"), names_to = "column") %>%
filter(max(value) == value) %>%
pull(column)
# [1] "X"
If you have a large number of columns, you can pivot your data from wide to long without specifying all the column names (as I do in the pivot_longer(...) command above) by running this instead:
df %>%
pivot_longer(cols = setdiff(names(.), "Year"), names_to = "column") %>%
filter(max(value) == value) %>%
pull(column)
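A possible variation, in case you also want to know which year and value the maximum belongs to. This is a sketch that assumes dplyr 1.0.0 or later (for slice_max()) and uses tidyselect's negative selection, cols = -Year, in place of setdiff(). The variable name top is just for illustration.

```r
library(tidyr)
library(dplyr)

df <- read.table(text = "Year X Y Z
2001 3 6 10
2002 2 20 4
2003 45 23 4", header = TRUE)

top <- df %>%
  pivot_longer(cols = -Year, names_to = "column") %>%  # every column except Year
  slice_max(value, n = 1)  # keeps all tied rows by default (with_ties = TRUE)

top                # one row per cell holding the maximum: Year, column, value
pull(top, column)  # just the column name(s)
```

Because slice_max() keeps ties by default, a maximum that appears in several columns should give one row per column, much like the filter() version.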
Upvotes: 1