Reputation: 73
My data frame is set up like this:
ID Var1 Var2 Var3 ... Var50
The variables sum to 1 in every row. I've been trying to get the top 3 variables for each ID, in this layout:
ID 1st 2nd 3rd
Would grouping by ID and using top_n() work?
Upvotes: 1
Views: 840
Reputation: 6226
With data.table, you can reshape your data to long format and select the three largest values by group ("ID"):
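library(data.table)
# data.table's melt() expects a data.table; sort by ID, then by value, largest first
# (-value is used because desc() comes from dplyr, not data.table)
df_long <- melt(as.data.table(df1), id.vars = "ID")[order(ID, -value)]
# keep the first three rows, i.e. the three largest values, for each ID
df_long[, head(.SD, 3), by = "ID"]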
Upvotes: 1
Reputation: 886948
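If the top three are needed per row, we can use apply:
# sort each row in decreasing order and keep the three largest values
t(apply(df1[-1], 1, function(x) head(sort(x, decreasing = TRUE), 3)))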
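If the names of the top three variables are wanted rather than their values, a minimal base R sketch along the same lines could rank each row with order(). The toy df1 below is made up for illustration (four variables instead of fifty), and the 1st/2nd/3rd column labels simply follow the layout in the question.

```r
# Hypothetical toy data: Var1..Var4 sum to 1 in each row
df1 <- data.frame(
  ID   = c("a", "b"),
  Var1 = c(0.15, 0.4),
  Var2 = c(0.50, 0.3),
  Var3 = c(0.30, 0.1),
  Var4 = c(0.05, 0.2)
)

vars <- names(df1)[-1]

# For each row, order() gives the column positions from largest to smallest value;
# keep the names of the first three
top3 <- t(apply(df1[-1], 1, function(x) vars[order(x, decreasing = TRUE)[1:3]]))
colnames(top3) <- c("1st", "2nd", "3rd")

# check.names = FALSE keeps the non-syntactic column names as they are
result <- data.frame(ID = df1$ID, top3, check.names = FALSE)
result
```

Ties are broken by column position here, so if two variables share a value the earlier column is listed first.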
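Or with pmap, returning a list column with 3 named values per row:
library(purrr)
library(dplyr)
df1 %>%
  # c(...) collects the row's values; sort them in decreasing order and keep three
  mutate(top3 = select(., -ID) %>% pmap(~ head(sort(c(...), decreasing = TRUE), 3)))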
If we want to use top_n, one option is to reshape to 'long' format:
library(tidyr)
df1 %>%
  pivot_longer(cols = -ID) %>%
  group_by(ID) %>%
  top_n(3, value)
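To get from the long result to the ID / 1st / 2nd / 3rd layout asked for in the question, one possible sketch is to rank the values within each ID and pivot back to wide. It uses slice_max(), which the dplyr documentation lists as the replacement for the superseded top_n(); the toy df1 and the rank column are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: Var1..Var4 sum to 1 in each row
df1 <- data.frame(
  ID   = c("a", "b"),
  Var1 = c(0.15, 0.4),
  Var2 = c(0.50, 0.3),
  Var3 = c(0.30, 0.1),
  Var4 = c(0.05, 0.2)
)

top3_wide <- df1 %>%
  pivot_longer(cols = -ID) %>%                   # columns: ID, name, value
  group_by(ID) %>%
  slice_max(value, n = 3, with_ties = FALSE) %>% # exactly three rows per ID
  # label each row by its within-ID rank, largest value first
  mutate(rank = c("1st", "2nd", "3rd")[row_number(desc(value))]) %>%
  ungroup() %>%
  select(ID, rank, name) %>%
  pivot_wider(names_from = rank, values_from = name) %>%
  select(ID, `1st`, `2nd`, `3rd`)

top3_wide
```

Setting with_ties = FALSE guarantees three rows per ID; with the default (and with top_n) tied values can return more than three.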
Upvotes: 3