Adding random factor-valued row to a data frame

Question

I have a data frame filled with factor columns, and I want to add a random factor-valued row. How do I do it?

> df = as.data.frame(list(a="YES", b="other", c="do_not_know"), stringsAsFactors=TRUE)
> levels(df$c) <- c("do_not_know", "yes", "no")
> df2 <- subset(df, subset=(a=="NO"))
> df2
[1] a b c
<0 rows> (or 0-length row.names)
> str(df2)
'data.frame':   0 obs. of  3 variables:
$ a          : Factor w/ 1 level "YES": 
$ b          : Factor w/ 1 level "other": 
$ c          : Factor w/ 3 levels "do_not_know",..:

Now, I'd like random_row(df2) to return one of list("YES", "other", "do_not_know"), list("YES", "other", "yes"), or list("YES", "other", "no"), chosen at random.

(It's not always the same data frame either; I want a generic function. The only constraint is that all columns will always be factor-valued.)

Paul Hiemstra · Accepted Answer

If by "random factor-valued row" you mean a new row that, for each factor column, holds a value drawn at random from that column's available levels, then the following should work. For lack of a reproducible example, I can only provide some untested R code. It first extracts the possible levels of every factor column and then randomly draws one level per column to create the new row. I use apply-style loops (lapply and sapply).

available_levels = lapply(df2, levels)
new_row = sapply(available_levels, sample, size = 1)
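To actually append the draw to the data frame, something like the sketch below might work. Note that sapply() as used above returns a named character vector rather than factors, so this version builds a one-row data frame whose columns are factors carrying the full level sets, and then calls rbind(). The name random_row comes from the question; the body of the helper and the example data (built with stringsAsFactors = TRUE, which is needed on R >= 4.0) are assumptions for illustration, not something tested against your real data.

```r
# Hypothetical helper (name taken from the question, body is an assumption):
# draw one random level per factor column and return a one-row data frame.
random_row <- function(df) {
  # The question guarantees factor columns; fail early if that is not the case.
  stopifnot(all(vapply(df, is.factor, logical(1))))
  picked <- lapply(df, function(col) {
    lv <- levels(col)
    # Pick a random position in the level set, then rebuild a factor
    # that keeps every level of the original column.
    factor(lv[sample.int(length(lv), 1)], levels = lv)
  })
  as.data.frame(picked)
}

# Example data mirroring the question.
df <- data.frame(a = "YES", b = "other", c = "do_not_know",
                 stringsAsFactors = TRUE)
levels(df$c) <- c("do_not_know", "yes", "no")
df2 <- subset(df, subset = (a == "NO"))   # zero rows, levels preserved

# Append one random row; repeat the call to append more.
df3 <- rbind(df2, random_row(df2))
str(df3)
```

Since the draw is random, column c can end up with any of its three levels, while a and b have only one level each and so always come out as "YES" and "other".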
