Reputation: 437
How can we create new columns holding the highest values in each row?
References:
https://rdrr.io/cran/dplyr/man/top_n.html
Selecting top N values within a group in a column using R
For example:
library(tidyverse)
iris %>% glimpse()
# my attempt
x = iris %>%
  select(-Species) %>%
  gather(measure, values)
# got stuck from here on: what should go on the right-hand sides?
# mutate(top_1 = ,
#        top_2 = ,
#        top_3 = )
# expected_output contains the same number of rows as the input:
# expected_output = iris %>% mutate(top_1 = <1st highest value in the row>,
#                                   top_2 = <2nd highest value in the row>,
#                                   top_3 = <3rd highest value in the row>)
# the first 3 rows of the expected output look like this:
iris[1:3,] %>%
mutate(top_1 = c(5.1,4.9,4.7), top_2 = c(3.5,3.0,3.2), top_3 = c(1.4,1.4,1.3))
Upvotes: 1
Views: 191
Reputation: 388817
We can use apply row-wise, sort each row's values in decreasing order, and take the top 3 values using head:
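df <- iris
# df[-5] drops the Species column; apply() returns a 3 x nrow matrix,
# so it is transposed before being assigned to the three new columns
df[paste0("top_", 1:3)] <- t(apply(df[-5], 1, function(x)
  head(sort(x, decreasing = TRUE), 3)))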
head(df)
#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species top_1 top_2 top_3
#1          5.1         3.5          1.4         0.2  setosa   5.1   3.5   1.4
#2          4.9         3.0          1.4         0.2  setosa   4.9   3.0   1.4
#3          4.7         3.2          1.3         0.2  setosa   4.7   3.2   1.3
#4          4.6         3.1          1.5         0.2  setosa   4.6   3.1   1.5
#5          5.0         3.6          1.4         0.2  setosa   5.0   3.6   1.4
#6          5.4         3.9          1.7         0.4  setosa   5.4   3.9   1.7
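If the number of top values needs to vary, the same apply/sort idea can be wrapped in a small function. This is only a sketch: add_top_n and its n argument are hypothetical names, not part of the answer above, and it assumes the values to rank are the data frame's numeric columns.

```r
# Hypothetical helper: append the n largest values of each row,
# computed over the numeric columns only.
add_top_n <- function(df, n = 3) {
  num <- df[vapply(df, is.numeric, logical(1))]
  stopifnot(n >= 1, n <= ncol(num))

  # For each row, sort in decreasing order and keep the first n values.
  top <- apply(num, 1, function(x) sort(x, decreasing = TRUE)[seq_len(n)])

  # apply() returns a plain vector when n == 1, so force an n x nrow matrix
  # before transposing to one row per observation.
  top <- as.data.frame(t(matrix(top, nrow = n)))
  names(top) <- paste0("top_", seq_len(n))

  cbind(df, top)
}

head(add_top_n(iris, 2))
```

Selecting the numeric columns by type avoids hard-coding the position of Species, which df[-5] relies on.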
A tidyverse alternative, which involves some reshaping:
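library(dplyr)
library(tidyr)
iris %>%
  mutate(row = row_number()) %>%
  select(-Species) %>%
  gather(key, value, -row) %>%
  group_by(row) %>%
  # sort within each row so that top_1 really is the largest value
  arrange(desc(value), .by_group = TRUE) %>%
  # keep exactly three values per row, even when values tie
  slice(1:3) %>%
  mutate(key = paste0("top_", 1:3)) %>%
  spread(key, value) %>%
  ungroup() %>%
  select(-row)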
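The reshaping approach can also be written with pivot_longer() and pivot_wider(), which supersede gather() and spread() in tidyr 1.0 and later. The following is a sketch under that version assumption; top_cols, result, measure and rank are names chosen here for illustration. Unlike the pipeline above, it binds the new columns back onto iris so the original columns are kept, as the question asks.

```r
library(dplyr)
library(tidyr)

top_cols <- iris %>%
  mutate(row = row_number()) %>%
  select(-Species) %>%
  pivot_longer(-row, names_to = "measure", values_to = "value") %>%
  group_by(row) %>%
  # largest value first within each original row
  arrange(desc(value), .by_group = TRUE) %>%
  slice(1:3) %>%
  mutate(rank = paste0("top_", row_number())) %>%
  ungroup() %>%
  select(row, rank, value) %>%
  pivot_wider(names_from = rank, values_from = value) %>%
  # make sure the rows line up with iris before binding
  arrange(row) %>%
  select(-row)

result <- bind_cols(iris, top_cols)
head(result, 3)
```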
Upvotes: 3