Janice Neumann

Reputation: 39

R function to display only the 20% highest values of a column

I'm new to R. I have a matrix of numeric values and want to display only the highest 20% of a specific column.

Any help is appreciated!

Upvotes: 1

Views: 704

Answers (2)

linog

Reputation: 6226

With a data.table object, you would do:

library(data.table)
df <- as.data.table(m1)
col <- colnames(df)[1]  # name of the column to filter on (here: the first one)
df[get(col) >= quantile(get(col), probs = 0.8)]

This is probably the fastest method if you have a large dataset.

Upvotes: 1

akrun

Reputation: 886948

We can use quantile to get the 80th-percentile cutoff, build a logical vector from it, and extract the matching elements of the column (assumed here to be the first column):

m1[,1][m1[,1] >= quantile(m1[,1], 0.8)]
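As a minimal, self-contained sketch of the same idea (the matrix m1 and its column names below are made-up example data, not taken from the question), the same cutoff can also be used to keep whole rows rather than only the column values:

```r
# Hypothetical example data: a 10 x 2 numeric matrix
m1 <- cbind(col1 = c(5, 3, 9, 1, 7, 2, 10, 4, 8, 6),
            col2 = 1:10)

# 80th percentile of the first column
cutoff <- quantile(m1[, 1], probs = 0.8)

# Values of column 1 at or above the cutoff
top_values <- m1[, 1][m1[, 1] >= cutoff]

# Whole rows whose column-1 value is at or above the cutoff;
# drop = FALSE keeps the result a matrix even if only one row matches
top_rows <- m1[m1[, 1] >= cutoff, , drop = FALSE]
```

Note that with tied values at the cutoff, this can return slightly more than 20% of the rows, since every value equal to the cutoff is kept.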

If it is a data.frame, we can use top_frac from dplyr (here col1 is the name of the column):

library(dplyr)
as.data.frame(m1) %>%
    top_frac(n = 0.2, wt = col1)

Or with slice_max, which supersedes top_frac as of dplyr 1.0.0:

as.data.frame(m1) %>%
    slice_max(col1, prop = 0.2)

Upvotes: 0
