Reputation: 113
I'm super new to R and struggling with the following exercise:
"Choose a random species from Iris and pull without putting back 50 rows that are not that species".
Iris:
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
..
150 5.9 3.0 5.1 1.8 virginica
I've come up with this so far:
set.seed(1)
y <- sample(150, 1)
y
x <- iris[y,5]
x
Which results in:
> set.seed(1)
> y <- sample(150, 1)
> y
[1] 68
> x <- iris[y,5]
> x
[1] versicolor
Levels: setosa versicolor virginica
Now I know that I have to sample all of Iris and choose 50 that are not versicolor. How could I do that?
I've tried something like this:
z <- sample(iris, 50, replace = FALSE, iris.species != x)
z
If anyone could enlighten me on how to use the sample command I'd be thankful.
Thanks
Upvotes: 1
Views: 358
Reputation: 389135
You need to choose a random Species: unique(iris$Species) gives the unique values of Species, and sample picks 1 of them at random.
select_species <- sample(unique(iris$Species), 1)
Use subset to drop that species from the dataset:
result <- subset(iris, Species != select_species)
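To check that the filter did what was intended, something like the following sketch may help. It assumes the built-in iris data set (50 rows per species); the seed value is arbitrary and only there for reproducibility, and the droplevels call is an optional extra that is not part of the steps above.

```r
# Assumes the built-in iris data set; the seed value is arbitrary.
set.seed(1)
select_species <- sample(unique(iris$Species), 1)
result <- subset(iris, Species != select_species)

nrow(result)           # 100, since each of the three species has 50 rows
table(result$Species)  # the dropped species is still a factor level, with count 0

# Optional: remove the now-empty factor level.
result <- droplevels(result)
levels(result$Species) # only the two remaining species
```

subset keeps the factor levels of the original column, which is why the dropped species still shows up in table with a count of 0 until droplevels is applied.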
To choose 50 random rows from result, you may do:
sample_50 <- result[sample(nrow(result), 50), ]
Or, as @r2evans suggested:
sample_50 <- dplyr::slice_sample(result, n = 50)
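As for why the attempt in the question does not work: sample expects a vector, and a data frame passed to it is treated as a list of columns, so it would sample columns rather than rows. Its fourth argument is prob (sampling weights), not a filter condition. Sampling row positions and then indexing avoids both problems. Here is a minimal end-to-end sketch in base R, assuming the built-in iris data set and an arbitrary seed; the checks at the end are only illustrative.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# 1. Pick one species at random.
select_species <- sample(unique(iris$Species), 1)

# 2. Keep only the rows of the other two species (100 rows).
result <- subset(iris, Species != select_species)

# 3. Draw 50 distinct row positions; replace = FALSE is the default,
#    so no row can be picked twice.
sample_50 <- result[sample(nrow(result), 50), ]

# Sanity checks
nrow(sample_50)                           # 50
any(sample_50$Species == select_species)  # FALSE
anyDuplicated(rownames(sample_50))        # 0, i.e. no row drawn twice
```

Because sample(nrow(result), 50) draws from the positions 1 to 100 without replacement, this matches the "pull without putting back" wording of the exercise.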
Upvotes: 1