Reputation: 257
Is it possible to filter, remove, or subset rows by a certain range of values across variables?
Here is dummy data:
a <- data.frame(c('b1', 'b2', 'b3'),
c(0.2, 1.5, 0.5),
c(0.4, 1.0, 0.3),
c(0.5, 0.5, 0.1),
c(-0.5, -2.5, -0.2),
c(-0.3, -3.0, -0.4),
c(-0.5, -1.7, -0.4),
stringsAsFactors = FALSE)
colnames(a) <- c('id', 'var1', 'var2', 'var3', 'var4', 'var5', 'var6')
rownames(a) <- a$id
a_subset <- a[, 2:7]
a_subset
# var1 var2 var3 var4 var5 var6
# b1 0.2 0.4 0.5 -0.5 -0.3 -0.5
# b2 1.5 1.0 0.5 -2.5 -3.0 -1.7
# b3 0.5 0.3 0.1 -0.2 -0.4 -0.4
# In row b1 the values run from -0.5 to 0.5, so the range (maximum minus minimum) is 1.0.
# Expected output:
# For example, if we keep only rows whose range across variables is at most 1, we get the result below. Row b2 is dropped because its range is 4.5 (from -3.0 to 1.5).
# var1 var2 var3 var4 var5 var6
# b1 0.2 0.4 0.5 -0.5 -0.3 -0.5
# b3 0.5 0.3 0.1 -0.2 -0.4 -0.4
So is it possible to filter, subset, or remove rows based on the range (maximum minus minimum) of their values across variables? Any approach would be helpful. Thank you.
Upvotes: 0
Views: 235
Reputation: 51914
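# Per-row range (max - min) across all columns; named row_range so it does not mask base::range()
row_range <- apply(a_subset, 1, function(x) diff(range(x)))
a_subset[which(row_range <= 1), ]
#    var1 var2 var3 var4 var5 var6
# b1  0.2  0.4  0.5 -0.5 -0.3 -0.5
# b3  0.5  0.3  0.1 -0.2 -0.4 -0.4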
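As a side note, the same row-wise check can be written without `apply()`, using base R's vectorized `pmax()` and `pmin()`. This is only a sketch, not part of the original answer: the helper name `keep_rows_within_range` and its `max_range` argument are made up for illustration, and it assumes every column is numeric with no missing values.

```r
# Rebuild the question's example data so this block runs on its own.
a_subset <- data.frame(
  var1 = c(0.2, 1.5, 0.5),
  var2 = c(0.4, 1.0, 0.3),
  var3 = c(0.5, 0.5, 0.1),
  var4 = c(-0.5, -2.5, -0.2),
  var5 = c(-0.3, -3.0, -0.4),
  var6 = c(-0.5, -1.7, -0.4),
  row.names = c("b1", "b2", "b3")
)

# Hypothetical helper (name and arguments are illustrative):
# keep rows whose max - min across all columns is at most `max_range`.
keep_rows_within_range <- function(df, max_range) {
  cols <- unname(as.list(df))                       # one numeric vector per column
  spread <- do.call(pmax, cols) - do.call(pmin, cols)  # row-wise max minus row-wise min
  df[spread <= max_range, , drop = FALSE]           # row names are preserved
}

result <- keep_rows_within_range(a_subset, 1)
result
```

With the example data this keeps rows b1 and b3; b2 is dropped because its range is 4.5.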
In tidyr, it is easier to work with tidy data:
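library(dplyr)   # %>%, group_by(), filter(), ungroup()
library(tidyr)   # pivot_longer(), pivot_wider()
library(tibble)  # rownames_to_column()

a_subset %>%
  rownames_to_column() %>%              # move row names into a "rowname" column
  pivot_longer(cols = -rowname) %>%     # long format: one row per (rowname, name, value)
  group_by(rowname) %>%
  filter(diff(range(value)) <= 1) %>%   # keep groups whose range is at most 1
  ungroup() %>%
  pivot_wider()                         # back to one row per rowname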
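If the reshape to long format and back feels heavy, a row-wise filter may also work. This is a sketch rather than the answer's method: it assumes dplyr 1.0.0 or later (for `rowwise()` and `c_across()`) plus tibble, and it rebuilds the example data so the block runs on its own. The `id` column name is just a placeholder for the row names.

```r
library(dplyr)
library(tibble)

# Rebuild the question's example data.
a_subset <- data.frame(
  var1 = c(0.2, 1.5, 0.5),
  var2 = c(0.4, 1.0, 0.3),
  var3 = c(0.5, 0.5, 0.1),
  var4 = c(-0.5, -2.5, -0.2),
  var5 = c(-0.3, -3.0, -0.4),
  var6 = c(-0.5, -1.7, -0.4),
  row.names = c("b1", "b2", "b3")
)

filtered <- a_subset %>%
  rownames_to_column("id") %>%                          # keep row names through the pipeline
  rowwise() %>%                                         # evaluate the filter one row at a time
  filter(diff(range(c_across(var1:var6))) <= 1) %>%     # max - min across var1..var6
  ungroup() %>%
  column_to_rownames("id")                              # restore row names

filtered
```

Here `c_across(var1:var6)` collects the six values of the current row into one vector, so `diff(range(...))` gives that row's maximum minus minimum.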
Upvotes: 2