damian

Reputation: 21

Subset dataframe based on statistical range of each column

I would like to subset a dataframe by selecting only the columns whose range exceeds a specific value. That is, I would like to evaluate max - min for each column individually and keep only the columns whose range is greater than a given threshold. For example, given the following simple dataframe, I would like to create a subset dataframe that contains only the columns with a range > 99 (columns b and c).

d <- data.frame(a=seq(0,10,1),b=seq(0,100,10),c=seq(0,200,20))

I have tried modifying the example here: Subset a dataframe based on a single condition applied to multiple columns, but have had no luck. I'm sure I'm missing something simple.

Upvotes: 0

Views: 415

Answers (1)

Didzis Elferts

Reputation: 98449

You can use sapply() to apply a function to each column of d that calculates the difference between the column's maximum and minimum, diff(range(x)), and compares it to 99. The result is a logical vector with one TRUE or FALSE per column, which you can then use to subset the columns.

d[, sapply(d, function(x) diff(range(x)) > 99)]
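As a rough sketch, the one-liner can be unpacked into separate steps to make each intermediate result visible. The names col_range, keep and d_sub are illustrative and not part of the original answer, and drop = FALSE is an addition that keeps the result a data frame when only one column passes the test.

```r
# Example data from the question
d <- data.frame(a = seq(0, 10, 1), b = seq(0, 100, 10), c = seq(0, 200, 20))

# Range (max - min) of each column, returned as a named numeric vector
col_range <- sapply(d, function(x) diff(range(x)))

# Logical vector marking the columns whose range exceeds the threshold
keep <- col_range > 99

# drop = FALSE (an addition, not in the original answer) prevents the
# result from collapsing to a plain vector if only one column qualifies
d_sub <- d[, keep, drop = FALSE]

names(d_sub)
# → "b" "c"
```

If the columns may contain missing values, range(x, na.rm = TRUE) would probably be needed, since range() returns NA when any value is NA.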

Upvotes: 2
