Ajaff

Reputation: 73

Top 3 Values Per Row in R

My dataframe set up is:

ID Var1 Var2 Var3 ... Var50

The variables sum to 1 in every row. I've been trying to get the top 3 variables for each row, in this shape:

ID 1st 2nd 3rd

Would grouping by ID and using top_n() work?

Upvotes: 1

Views: 840

Answers (2)

linog

Reputation: 6226

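With data.table, you can reshape your data to long format and select the three largest values per group ("ID"):

library(data.table)
# Reshape to long format, then sort by ID and by descending value
df_long <- melt(as.data.table(df1), id.vars = "ID")[order(ID, -value)]
# Keep the first three rows (the three largest values) for each ID
df_long[, .SD[1:3], by = "ID"]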
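As a rough sketch of how the long-format result could be turned back into the ID / 1st / 2nd / 3rd layout from the question, the example below uses a small made-up table (the values and the names `dt`, `top3`, and `wide` are assumptions for illustration, not from the original post). It ranks the rows within each ID and reshapes with dcast().

```r
library(data.table)

# Hypothetical toy data: each row sums to 1
dt <- data.table(
  ID   = 1:3,
  Var1 = c(0.10, 0.40, 0.05),
  Var2 = c(0.30, 0.10, 0.15),
  Var3 = c(0.20, 0.05, 0.50),
  Var4 = c(0.35, 0.25, 0.20),
  Var5 = c(0.05, 0.20, 0.10)
)

# Long format with the variable names kept as character, not factor
long <- melt(dt, id.vars = "ID", variable.factor = FALSE)

# Sort by ID, then by descending value
setorder(long, ID, -value)

# head() returns at most three rows per ID and never pads with NA
top3 <- long[, head(.SD, 3), by = ID]

# Label the position within each ID, then spread the names into columns
top3[, rank := c("1st", "2nd", "3rd")[seq_len(.N)], by = ID]
wide <- dcast(top3, ID ~ rank, value.var = "variable")
wide
```

Using value.var = "value" instead would give the three largest values rather than the variable names.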

Upvotes: 1

akrun

Reputation: 886948

If it is by row, then we can use apply

t(apply(df1[-1], 1, function(x) head(sort(x, decreasing = TRUE), 3)))
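If the variable names are wanted rather than the values, one possible base R variation is to use order() on each row and index into the column names. This is only a sketch on made-up data; the values in `df1` and the names `vars`, `top3_names`, and `result` are assumptions for illustration.

```r
# Hypothetical toy data: each row sums to 1
df1 <- data.frame(
  ID   = 1:3,
  Var1 = c(0.10, 0.40, 0.05),
  Var2 = c(0.30, 0.10, 0.15),
  Var3 = c(0.20, 0.05, 0.50),
  Var4 = c(0.35, 0.25, 0.20),
  Var5 = c(0.05, 0.20, 0.10)
)

vars <- names(df1)[-1]

# For each row, take the positions of the three largest values
# and look up the matching column names
top3_names <- t(apply(df1[-1], 1, function(x) {
  vars[order(x, decreasing = TRUE)[1:3]]
}))
colnames(top3_names) <- c("1st", "2nd", "3rd")

# check.names = FALSE keeps the "1st"/"2nd"/"3rd" headers as they are
result <- data.frame(ID = df1$ID, top3_names, check.names = FALSE)
result
```

Ties are broken by column order here, because order() is stable.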

Or with pmap returning a list column with 3 values per row

library(purrr)
library(dplyr)
df1 %>%
    mutate(top3 = select(., -ID) %>% pmap(~ head(sort(c(...), decreasing = TRUE), 3)))

If we want to use top_n, one option is to reshape to 'long' format

library(tidyr)
df1 %>% 
    pivot_longer(cols = -ID) %>%
    group_by(ID) %>%
    top_n(3, value)
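Note that top_n() keeps ties and does not sort the rows within each group, and it has been superseded by slice_max() since dplyr 1.0.0. A hedged sketch of getting from the long format to the ID / 1st / 2nd / 3rd layout could look like this; the toy values and the name `top3_wide` are assumptions for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: each row sums to 1
df1 <- data.frame(
  ID   = 1:3,
  Var1 = c(0.10, 0.40, 0.05),
  Var2 = c(0.30, 0.10, 0.15),
  Var3 = c(0.20, 0.05, 0.50),
  Var4 = c(0.35, 0.25, 0.20),
  Var5 = c(0.05, 0.20, 0.10)
)

top3_wide <- df1 %>%
  pivot_longer(cols = -ID) %>%                      # columns: ID, name, value
  group_by(ID) %>%
  slice_max(value, n = 3, with_ties = FALSE) %>%    # exactly 3 rows, largest first
  mutate(rank = c("1st", "2nd", "3rd")[row_number()]) %>%
  ungroup() %>%
  select(ID, rank, name) %>%
  pivot_wider(names_from = rank, values_from = name)

top3_wide
```

Setting with_ties = FALSE guarantees three rows per ID; with the default (TRUE), tied values could return more.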

Upvotes: 3
