Reputation: 565
I want to assign factor levels, but I won't always know all of the values in advance, so I want to make sure that a few specific levels come first whenever they are present. For example, I want strawberry to be factor level 1 and kiwi to be factor level 2, with all the remaining values assigned alphabetically.
data <- data.frame(
  parameter = c(rep("apple", 3), rep("banana", 3), rep("strawberry", 3), rep("kiwi", 3)),
  date = c(rep(c(as.Date('2021-01-01'), as.Date('2021-01-02'), as.Date('2021-01-03')), 4)),
  value = c(0, 2, 3, 0, 0, 1, 2, 3, 4, 0, 0, 0)
)
If I were to order by parameter it would go strawberry, kiwi, apple, then banana. Unfortunately I won't always know what the other factors may be. Sometimes it may be apple and banana, or it could be apple, bananas, and pears. The possibilities are endless.
If you need extra context: a user will upload a csv to a shiny app with the 3 columns, but the parameters could be different for every user. If strawberry and kiwi are present in the parameters, they need to be assigned the first factor levels, and all other values should be assigned alphabetically after them.
Thanks in advance!
Upvotes: 1
Views: 424
Reputation: 887118
We can use setdiff to change the order of the levels of the factor
v1 <- c('strawberry', 'kiwi')
data$parameter <- with(data, droplevels(factor(parameter,
levels = c(v1, sort(setdiff(parameter, v1))))))
levels(data$parameter)
#[1] "strawberry" "kiwi" "apple" "banana"
NOTE: It may be better to wrap with droplevels (in case 'strawberry' or 'kiwi' is not present in the data).
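To make the droplevels point concrete, here is a small sketch with a made-up upload that has no kiwi. The data frame name data2 and its values are assumptions for illustration only; the level construction is the same as in the answer.

```r
# Hypothetical upload that contains strawberry but no kiwi
data2 <- data.frame(parameter = c("pear", "strawberry", "apple"))
v1 <- c("strawberry", "kiwi")

# Same level construction as above: v1 first, the rest alphabetically
lv <- c(v1, sort(setdiff(data2$parameter, v1)))

# Without droplevels, kiwi is kept as an empty (unused) level
with_unused <- levels(factor(data2$parameter, levels = lv))
with_unused
#[1] "strawberry" "kiwi"       "apple"      "pear"

# With droplevels, only the levels that actually occur remain
without_unused <- levels(droplevels(factor(data2$parameter, levels = lv)))
without_unused
#[1] "strawberry" "apple"      "pear"
```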
The factor() call above may look perplexing. The logic is:
- setdiff - returns the unique elements of the column without the values in 'v1'
- sort - sorts those elements (in default alphabetic order)
- c - concatenates the 'v1' elements at the start of the vector
- levels - specifies the unique sorted vector as the levels argument in factor
- factor - is applied to the original column
- droplevels - removes unused levels in case the elements in 'v1' are not present
Another option is fct_relevel from forcats (note that it gives a warning if a value in 'v1' is not among the levels)
library(forcats)
data$parameter <- fct_relevel(data$parameter, v1)
If we need to use tidyverse, just copy the code inside the with() call and specify it in mutate
library(dplyr)
data <- data %>%
  mutate(parameter = droplevels(factor(parameter,
    levels = c(v1, sort(setdiff(parameter, v1))))))
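Since the column comes from a csv uploaded to a shiny app, it may be convenient to wrap the logic in a function that can be called on whatever was uploaded. The following is only a sketch: the name relevel_first and its first argument are hypothetical, and it uses intersect instead of droplevels to leave out priority values that are absent, which should give the same levels.

```r
# Hypothetical helper: put the values in `first` up front (only when present),
# followed by every other value in alphabetical order.
relevel_first <- function(x, first = c("strawberry", "kiwi")) {
  x <- as.character(x)
  present <- intersect(first, x)   # keeps the order given in `first`
  rest <- sort(setdiff(x, first))  # remaining unique values, sorted
  factor(x, levels = c(present, rest))
}

# Upload without kiwi: strawberry still comes first
levels(relevel_first(c("pear", "apple", "strawberry", "banana")))
#[1] "strawberry" "apple"      "banana"     "pear"

# Upload with neither priority value: plain alphabetical order
levels(relevel_first(c("pear", "apple")))
#[1] "apple" "pear"
```

With the question's data this would be used as data$parameter <- relevel_first(data$parameter).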
Upvotes: 1