Reputation: 123
I need to select random rows from different numeric intervals that I have defined. The following topic is closely related, but there the rows were selected from the levels of a factor:
selecting n random rows across all levels of a factor within a dataframe
Using the same sample example:
df <- data.frame(matrix(rnorm(80), nrow=40))
df$color <- rep(c("blue", "red", "yellow", "pink"), each=10)
How could I select 4 rows (or any other n) where -1 < X1 < 0 and 4 rows where 0 ≤ X1 < 2?
Upvotes: 0
Views: 49
Reputation: 886938
Try this
n <- 4
indx1 <- with(df, which(X1>-1 & X1 <0))
indx2 <- with(df, which(X1>=0 & X1 <2))
df[sample(indx1,n,replace=FALSE),]
df[sample(indx2,n,replace=FALSE),]
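Two things can go wrong with the snippet above: sample() raises an error when fewer than n rows match, and when its first argument is a single number k it samples from 1:k instead of returning k. A minimal sketch of a wrapper that guards against both follows; `sample_rows_where` is a hypothetical helper name, not something from the answer, and it assumes that returning fewer than n rows is acceptable when the interval is sparse.

```r
# Hypothetical helper (not part of the original answer): sample up to n rows
# of `data` for which the logical vector `keep` is TRUE.
sample_rows_where <- function(data, keep, n) {
  idx <- which(keep)            # row positions that satisfy the condition
  k <- min(n, length(idx))      # never ask for more rows than are available
  # Sample positions within idx, so a length-one idx is not expanded to 1:idx
  picked <- idx[sample.int(length(idx), k)]
  data[picked, , drop = FALSE]
}

set.seed(1)
df <- data.frame(matrix(rnorm(80), nrow = 40))
df$color <- rep(c("blue", "red", "yellow", "pink"), each = 10)

r1 <- sample_rows_where(df, df$X1 > -1 & df$X1 < 0, 4)   # -1 < X1 < 0
r2 <- sample_rows_where(df, df$X1 >= 0 & df$X1 < 2, 4)   # 0 <= X1 < 2
```

If an interval contains no rows at all, the helper returns a zero-row data frame rather than failing.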
If you need a sample of 'n' rows within each level of the grouping variable 'color', restricted to rows that meet the condition on 'X1':
library(data.table) # v1.9.5+
# between() includes both bounds by default; incbounds=FALSE gives -1 < X1 < 0
setDT(df)[between(X1, -1, 0, incbounds=FALSE),
          if (n > .N) .SD else .SD[sample(.N, n, replace=FALSE)],
          by = color]
The second interval can be handled the same way by replacing the filter with X1 >= 0 & X1 < 2.
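For the half-open interval 0 ≤ X1 < 2, a plain logical filter is probably the simplest option, since between() treats both bounds the same way. The following is a sketch under the same setup as the question; the result name `res2` is only illustrative, and groups with fewer than n matching rows are returned whole, mirroring the logic above.

```r
library(data.table)

set.seed(1)
df <- data.frame(matrix(rnorm(80), nrow = 40))
df$color <- rep(c("blue", "red", "yellow", "pink"), each = 10)
n <- 4

# Keep rows with 0 <= X1 < 2, then sample within each color.
# A color with fewer than n matching rows is returned in full.
res2 <- setDT(df)[X1 >= 0 & X1 < 2,
                  if (.N < n) .SD else .SD[sample(.N, n)],
                  by = color]
```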
Upvotes: 2
Reputation: 20463
With dplyr:
library(dplyr)
df %>%
filter(X1 > -1 & X1 < 0) %>%
sample_n(4)
df %>%
filter(X1 >= 0 & X1 < 2) %>%
sample_n(4)
You could abstract the number to select by doing something like this:
num_to_select <- 4
df %>%
filter(X1 > -1 & X1 < 0) %>%
sample_n(num_to_select)
Likewise, you could do the same with the lower and upper cutoffs:
num_to_select <- 4
lower_cutoff <- -1
upper_cutoff <- 0
df %>%
filter(X1 > lower_cutoff & X1 < upper_cutoff) %>%
sample_n(num_to_select)
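The same idea can be wrapped in a function so that each interval is a single call. This is only a sketch: `sample_interval` and its argument names are hypothetical, it assumes a half-open interval (lower ≤ X1 < upper), and it uses slice_sample(), which superseded sample_n() in dplyr 1.0.0 and returns fewer rows when fewer than the requested number match.

```r
library(dplyr)  # slice_sample() needs dplyr 1.0.0 or later

set.seed(1)
df <- data.frame(matrix(rnorm(80), nrow = 40))
df$color <- rep(c("blue", "red", "yellow", "pink"), each = 10)

# Hypothetical helper: sample up to `size` rows with lower <= X1 < upper.
sample_interval <- function(data, lower, upper, size) {
  data %>%
    filter(X1 >= lower, X1 < upper) %>%
    slice_sample(n = size)   # truncates if fewer than `size` rows match
}

upper_rows <- sample_interval(df, 0, 2, 4)
```

For the strictly open interval -1 < X1 < 0, the first comparison inside filter() would need to be `X1 > lower` instead.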
Upvotes: 0
Reputation: 5169
set.seed(1234)
df <- data.frame(matrix(rnorm(80), nrow=40))
df$color <- rep(c("blue", "red", "yellow", "pink"), each=10)
s1 = subset(df, X1 < 0 & X1 > -1)
s2 = subset(df, X1 < 2 & X1 >= 0)
r1 = s1[sample(nrow(s1), 4), ]
r2 = s2[sample(nrow(s2), 4), ]
> r1
           X1         X2  color
18 -0.9111954 -0.7733534    red
22 -0.4906859  2.5489911 yellow
17 -0.5110095  1.6478175    red
11 -0.4771927 -1.8060313    red
> r2
          X1           X2 color
2  0.2774292 -1.068642724  blue
15 0.9594941 -0.162309524   red
6  0.5060559 -0.968514318  blue
31 1.1022975  0.006892838  pink
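With more than two intervals, repeating the subset step becomes tedious. One possible generalisation, sketched here in base R, bins X1 with cut() and samples within each bin. The names `breaks`, `bin` and `samples` are illustrative, and with right = FALSE the first interval is [-1, 0) rather than the strictly open (-1, 0) from the question, which only matters if X1 is exactly -1.

```r
set.seed(1234)
df <- data.frame(matrix(rnorm(80), nrow = 40))
df$color <- rep(c("blue", "red", "yellow", "pink"), each = 10)

breaks <- c(-1, 0, 2)   # interval edges: [-1, 0) and [0, 2)
n <- 4

# right = FALSE makes each bin closed on the left and open on the right;
# values outside the breaks become NA
df$bin <- cut(df$X1, breaks = breaks, right = FALSE)

# split() drops rows whose bin is NA; sample at most n rows per bin
samples <- lapply(split(df, df$bin), function(d) {
  d[sample.int(nrow(d), min(n, nrow(d))), , drop = FALSE]
})
```

`samples` is a list with one data frame per interval, named after the bin labels produced by cut().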
Upvotes: 0