I'm preparing an old dataset for analysis and need to convert strings to factors with explicit levels. I used an (equally old) data dictionary to set the levels, but have just noticed that it is incomplete: some strings in some variables are not in the data dictionary.
I'd like to prevent strings from being silently dropped (converted to NA). Ideally, the conversion would stop with an error if a string is not in the level definition. Is that possible?
df <- data.frame(c1 = letters[1:3])
factor(df$c1, levels = letters[1:2])
# [1] a b <NA>
Happy to use dplyr, forcats, or something else.
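To make the desired behaviour concrete, here is a rough base R sketch of what I'm after. The name `strict_factor` is hypothetical (it is not from any package), and I'm assuming that existing NA values should pass through untouched. I'd still prefer an existing function if one does this.

```r
# Hypothetical helper: convert to a factor, but stop if any non-NA value
# is missing from the supplied levels instead of silently turning it into NA.
strict_factor <- function(x, levels) {
  # Values present in the data but absent from the level definition
  unknown <- setdiff(unique(x[!is.na(x)]), levels)
  if (length(unknown) > 0) {
    stop("Values not in levels: ", paste(unknown, collapse = ", "))
  }
  factor(x, levels = levels)
}

df <- data.frame(c1 = letters[1:3])

# All values covered by the levels: behaves like factor()
ok <- strict_factor(df$c1, levels = letters[1:3])

# "c" is not in the levels, so this call would stop with an error:
# strict_factor(df$c1, levels = letters[1:2])
```

With something like this, the conversion fails on the first variable whose data dictionary is incomplete, and the error message lists the offending strings.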