choij

Reputation: 257

How to filter, remove, or subset rows by the range of values across variables?

Is it possible to filter, remove, or subset rows based on the range (maximum minus minimum) of their values across several variables?

Here is dummy data:

a <- data.frame(c('b1', 'b2', 'b3'),
                c(0.2, 1.5, 0.5),
                c(0.4, 1.0, 0.3),
                c(0.5, 0.5, 0.1),
                c(-0.5, -2.5, -0.2),
                c(-0.3, -3.0, -0.4),
                c(-0.5, -1.7, -0.4),
                stringsAsFactors = FALSE)

colnames(a) <- c('id', 'var1', 'var2', 'var3', 'var4', 'var5', 'var6')
rownames(a) <- a$id

a_subset <- a[, 2:7]
a_subset

#    var1 var2 var3 var4 var5 var6
# b1  0.2  0.4  0.5 -0.5 -0.3 -0.5
# b2  1.5  1.0  0.5 -2.5 -3.0 -1.7
# b3  0.5  0.3  0.1 -0.2 -0.4 -0.4


#'[Here, in row b1, the values run from -0.5 to 0.5, so the total range (maximum minus minimum) is 1.0.]

#'[Expected output]

#'[For example, if we keep only the rows whose range across variables is at most 1, we get the result below: row b2 is dropped because its range (maximum minus minimum) is 4.5.]


#    var1 var2 var3 var4 var5 var6
# b1  0.2  0.4  0.5 -0.5 -0.3 -0.5
# b3  0.5  0.3  0.1 -0.2 -0.4 -0.4

So, is it possible to filter, subset, or remove rows based on a specific range across variables? Any approach would be helpful. Thank you.

Upvotes: 0

Views: 235

Answers (1)

Maël

Reputation: 51914

base R

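# Range (max - min) of each row; named row_range to avoid masking base::range
row_range <- apply(a_subset, 1, function(x) diff(range(x)))
# Keep the rows whose range is at most 1
a_subset[which(row_range <= 1), ]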

   var1 var2 var3 var4 var5 var6
b1  0.2  0.4  0.5 -0.5 -0.3 -0.5
b3  0.5  0.3  0.1 -0.2 -0.4 -0.4
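
As a possible variation on the base R approach above, here is a minimal sketch that computes the per-row range with `pmax()` and `pmin()` instead of `apply()`. It rebuilds the question's dummy data so that it runs on its own; the names `row_spread`, `max_spread`, and `result` are illustrative and not part of the original answer.

```r
# Same dummy data as in the question, rebuilt so the snippet is self-contained
a_subset <- data.frame(
  var1 = c(0.2, 1.5, 0.5),
  var2 = c(0.4, 1.0, 0.3),
  var3 = c(0.5, 0.5, 0.1),
  var4 = c(-0.5, -2.5, -0.2),
  var5 = c(-0.3, -3.0, -0.4),
  var6 = c(-0.5, -1.7, -0.4),
  row.names = c("b1", "b2", "b3")
)

# pmax()/pmin() return the element-wise maximum/minimum across the columns,
# i.e. one value per row, without looping over the rows
row_spread <- do.call(pmax, a_subset) - do.call(pmin, a_subset)

# Keep the rows whose spread is at most max_spread (illustrative threshold name);
# which() drops rows whose spread is NA
max_spread <- 1
result <- a_subset[which(row_spread <= max_spread), ]
result
```

This assumes all columns are numeric. On wide data frames with many rows it may be faster than `apply()`, which first converts the data frame to a matrix and then calls the function once per row.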

tidyverse

With the tidyverse, it is often easier to reshape the data to long (tidy) format first, filter by group, and then reshape back to wide:

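library(dplyr)   # %>%, group_by(), filter(), ungroup()
library(tidyr)   # pivot_longer(), pivot_wider()
library(tibble)  # rownames_to_column()

a_subset %>% 
  rownames_to_column() %>%                # keep the row names as a column
  pivot_longer(cols = -rowname) %>%       # one row per (rowname, variable) pair
  group_by(rowname) %>% 
  filter(diff(range(value)) <= 1) %>%     # keep groups whose range is at most 1
  ungroup() %>% 
  pivot_wider()                           # back to one row per original row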
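
If reshaping is not desired, the same filter can probably be written row by row with dplyr's `rowwise()` and `c_across()`. The sketch below is an alternative under that assumption, not the answer's original method. It rebuilds the dummy data so that it runs on its own, and the column name `id` and the object name `result` are illustrative.

```r
library(dplyr)   # %>%, rowwise(), c_across(), filter(), ungroup()
library(tibble)  # rownames_to_column(), column_to_rownames()

# Same dummy data as in the question, rebuilt so the snippet is self-contained
a_subset <- data.frame(
  var1 = c(0.2, 1.5, 0.5),
  var2 = c(0.4, 1.0, 0.3),
  var3 = c(0.5, 0.5, 0.1),
  var4 = c(-0.5, -2.5, -0.2),
  var5 = c(-0.3, -3.0, -0.4),
  var6 = c(-0.5, -1.7, -0.4),
  row.names = c("b1", "b2", "b3")
)

result <- a_subset %>%
  rownames_to_column("id") %>%    # tibbles drop row names, so store them first
  rowwise() %>%                   # evaluate the filter once per row
  filter(diff(range(c_across(var1:var6))) <= 1) %>%
  ungroup() %>%
  column_to_rownames("id")        # restore the original row names
result
```

Row-wise evaluation can be slow on large data, so this form is mainly useful when readability matters more than speed.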

Upvotes: 2
