Reputation: 528
I have a data frame filled with factor columns, and I want to add a random factor-valued row. How do I do it?
> df = as.data.frame(list(a="YES", b="other", c="do_not_know"), stringsAsFactors = TRUE)
> levels(df$c) <- c("do_not_know", "yes", "no")
> df2 <- subset(df, subset=(a=="NO"))
> df2
[1] a b c
<0 rows> (or 0-length row.names)
> str(df2)
'data.frame': 0 obs. of 3 variables:
$ a : Factor w/ 1 level "YES":
$ b : Factor w/ 1 level "other":
$ c : Factor w/ 3 levels "do_not_know",..:
Now, I'd like random_row(df2) to produce either list("YES", "other", "do_not_know"), list("YES", "other", "yes"), or list("YES", "other", "no"), chosen at random.
(It's not always the same data frame either; I want a generic function. The only constraint is that all columns will always be factor-valued.)
Upvotes: 0
Views: 1058
Reputation: 60944
If by "random factor-valued row" you mean a new row that, for each factor column, holds a value drawn at random from the levels available in that column, then the code below should do it. For lack of a reproducible example, I can only provide some untested R code. It first extracts all the possible levels from the factor variables and then randomly draws one level per column to create a new random row. I use apply-style loops.
available_levels = lapply(df2, levels)
new_row = sapply(available_levels, sample, size = 1)
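Note that the snippet above returns a named character vector, not a row of factors. Below is a minimal sketch of how it could be wrapped into the generic function asked for in the question. The name random_row comes from the question; the body is my own assumption about the desired behaviour (a one-row data frame whose columns are factors with the same levels as the input), and it assumes every column is a factor with at least one level.

```r
# Sketch only: random_row() is a hypothetical helper, not an existing R function.
# Assumes every column of `df` is a factor with at least one level.
random_row <- function(df) {
  stopifnot(all(vapply(df, is.factor, logical(1))))
  picks <- lapply(df, function(col) {
    lv <- levels(col)
    # Draw one position in lv, then rebuild a factor that keeps all levels
    factor(lv[sample.int(length(lv), 1)], levels = lv)
  })
  # A list of factors becomes one column per element
  data.frame(picks, check.names = FALSE)
}

# Example data modelled on the question (stringsAsFactors is set explicitly
# because R >= 4.0 no longer converts strings to factors by default)
df <- data.frame(a = "YES", b = "other", c = "do_not_know",
                 stringsAsFactors = TRUE)
levels(df$c) <- c("do_not_know", "yes", "no")
df2 <- subset(df, subset = (a == "NO"))

set.seed(1)  # only to make the draw repeatable
df2 <- rbind(df2, random_row(df2))
str(df2)
```

Indexing with sample.int() rather than calling sample() on the levels directly is a defensive choice: sample(x, 1) treats a single number x >= 1 as the range 1:x, whereas sample.int() always draws a position. Factor levels are character strings, so the original call is fine here, but the indexed form stays safe if the helper is reused on other vectors.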
Upvotes: 2