Reputation: 196
I have a data frame with a column that contains the levels "Excellent", "Very Good", "Good", "Fair", and "Poor". I would like to average these values, and work with them in other ways, by assigning the value 5 to "Excellent", 4 to "Very Good", and so on.
My various attempts are confounded by the fact that the default assignment of numerical values seems to take the levels in alphabetical order, so that "Excellent" is 1, "Fair" is 2, and so on.
Thanks for the help.
Upvotes: 0
Views: 246
Reputation: 373
Do you need it to be an ordered factor? If so, using factor() with explicit levels may be your best bet.
Sample data
column <- c("Excellent", "Very Good", "Good", "Fair", "Poor",
"Good", "Fair", "Poor")
col.f <- factor(column,
levels = c("Poor","Fair" , "Good" , "Very Good", "Excellent"),
labels = c("Poor","Fair" , "Good" , "Very Good", "Excellent"),
ordered = TRUE)
col.f
[1] Excellent Very Good Good Fair Poor Good Fair Poor
Levels: Poor < Fair < Good < Very Good < Excellent
Then you can call as.numeric(col.f) to get numeric values. Because the levels are listed from worst to best, "Poor" becomes 1 and "Excellent" becomes 5.
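To tie this back to the averaging in the question, here is a minimal sketch. It assumes the ratings sit in a data frame column; the names `ratings`, `grade`, `score`, and `quality_levels` are made up for illustration. It also shows the alphabetical default that the question ran into.

```r
# Hypothetical data frame; `ratings` and `grade` are illustrative names.
ratings <- data.frame(
  grade = c("Excellent", "Very Good", "Good", "Fair", "Poor",
            "Good", "Fair", "Poor"),
  stringsAsFactors = FALSE
)

# Without explicit levels, factor() sorts the levels alphabetically,
# so "Excellent" gets code 1 and "Very Good" gets code 5.
default_codes <- as.integer(factor(ratings$grade))

# Listing the levels from worst to best makes the codes match the scale.
quality_levels <- c("Poor", "Fair", "Good", "Very Good", "Excellent")
ratings$grade <- factor(ratings$grade, levels = quality_levels, ordered = TRUE)
ratings$score <- as.integer(ratings$grade)

# Average rating on the 1 to 5 scale.
mean(ratings$score)
```

Keeping the numeric codes in a separate column leaves the original labels available for tables and plots.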
Upvotes: 2
Reputation: 60944
I'd use a named vector as a lookup table:
options = c('Excellent' = 5, 'Very Good' = 4, 'Good' = 3, 'Fair' = 2, 'Poor' = 1)
df = data.frame(grade = sample(names(options), 100, replace = TRUE))
head(df)
grade
1 Very Good
2 Good
3 Excellent
4 Very Good
5 Fair
6 Good
df = within(df, {
# Look up by label, not by factor code: convert to character first
grade_numeric = unname(options[as.character(grade)])
})
head(df)
grade grade_numeric
1 Very Good 4
2 Good 3
3 Excellent 5
4 Very Good 4
5 Fair 2
6 Good 3
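One thing to watch with a named lookup vector: if the column is a factor (the default for strings in data.frame() before R 4.0), indexing with it uses the factor's integer codes rather than its labels, which silently picks the wrong entries. A minimal sketch of the difference, using the hypothetical names `score_lookup` and `grades`:

```r
# Hypothetical lookup table and sample data, for illustration only.
score_lookup <- c("Excellent" = 5, "Very Good" = 4, "Good" = 3,
                  "Fair" = 2, "Poor" = 1)
grades <- factor(c("Very Good", "Good", "Excellent",
                   "Very Good", "Fair", "Good"))

# Indexing by a factor uses its integer codes (alphabetical level order),
# so the positions do not line up with the lookup table.
by_code <- unname(score_lookup[grades])

# Converting to character first looks each value up by name.
by_name <- unname(score_lookup[as.character(grades)])

by_name
mean(by_name)
```

Labels that are missing from the lookup table come back as NA, so it may be worth checking anyNA(by_name) before averaging.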
Upvotes: 2