donnek

Reputation: 221

R tidyverse: fct_relevel "unknown levels"

I'm trying to use forcats::fct_relevel to specify the levels in a column, the way I've used it in ggplot, but it's giving an error about "unknown levels".

Here is a chart of the cheeses I have eaten per month:

library(tidyverse)

cheeses <- tribble(
  ~mymonth, ~Brie, ~Stilton,
  1, 4, 2,
  2, 4, 1,
  3, 1, 3,
  4, 1, 5,
  5, 2, 4,
  6, 3, 1
)

and a list of the months:

cheesemonth<-c("Jan", "Feb", "Mar", "Apr", "May", "Jun")

According to pages like this one, I should be able to do the following:

cheeses %>% 
  mutate(mymonth=factor(mymonth)) %>% 
  mutate(mymonth=fct_relevel(mymonth, cheesemonth))

and have the items in mymonth replaced by the items in cheesemonth. But instead I get:

6 unknown levels in `f`: Jan, Feb, Mar, Apr, May, and Jun 

and I'm at a loss to understand why.

If I replace the last line with:

mutate(mymonth=case_match(mymonth, "1" ~ "Jan", "2" ~ "Feb", "3" ~ "Mar", "4" ~ "Apr", "5" ~ "May", "6" ~ "Jun"))

then it's fine, but this is more typing, and means I can't re-use the cheesemonth list.

So why do I get the unknown levels error?

Upvotes: 1

Views: 733

Answers (3)

Adriano Mello

Reputation: 2132

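For reference, here's a late but very simple solution with lubridate::month(). The labels follow the session locale (Portuguese in the output below), and the result is an ordered factor with all 12 months as levels.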

library(lubridate)

cheeses <- mutate(cheeses, mymonth = month(mymonth, label = TRUE, abbr = TRUE))

# A tibble: 6 × 3
  mymonth  Brie Stilton
  <ord>   <dbl>   <dbl>
1 jan         4       2
2 fev         4       1
3 mar         1       3
4 abr         1       5
5 mai         2       4
6 jun         3       1

# ---------------------

str(cheeses)

tibble [6 × 3] (S3: tbl_df/tbl/data.frame)
 $ mymonth: Ord.factor w/ 12 levels "jan"<"fev"<"mar"<..: 1 2 3 4 5 6
 $ Brie   : num [1:6] 4 4 1 1 2 3
 $ Stilton: num [1:6] 2 1 3 5 4 1

Upvotes: 0

G. Grothendieck

Reputation: 270298

fct_relevel reorders existing levels; it does not rename them, which is why it reports Jan, Feb, etc. as unknown levels. To change the labels, which forcats calls values, use lvls_revalue:

library(forcats)

lvls_revalue(as.character(cheeses$mymonth), cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan Feb Mar Apr May Jun

or use fct

library(forcats)

fct(cheesemonth[cheeses$mymonth], cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan Feb Mar Apr May Jun

It is even easier with base R:

factor(cheeses$mymonth, labels = cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan Feb Mar Apr May Jun

or given that months have a natural order you may wish to create an ordered factor (also base R):

ordered(cheeses$mymonth, labels = cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan < Feb < Mar < Apr < May < Jun

Note that R has a built-in month.abb vector (English only) so we could eliminate cheesemonth and write:

month.abb
## [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"

ordered(cheeses$mymonth, labels = month.abb[1:6])
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan < Feb < Mar < Apr < May < Jun

or to allow for months that are not present in the data

ordered(cheeses$mymonth, levels = 1:12, labels = month.abb)
## [1] Jan Feb Mar Apr May Jun
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
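
To make the reorder-versus-relabel distinction concrete, here is a minimal sketch. It assumes forcats is installed; the names f, moved, relabelled and msg are only illustrative. Depending on the forcats version, the unknown-levels message may be raised as a warning or as an error, so the sketch catches both and does not rely on the exact wording.

```r
library(forcats)

cheesemonth <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun")
f <- factor(1:6)                 # levels are "1", "2", ..., "6"

# fct_relevel() moves an existing level to the front; the values stay the same
moved <- fct_relevel(f, "6")
levels(moved)                    # "6" "1" "2" "3" "4" "5"
as.character(moved)              # "1" "2" "3" "4" "5" "6"

# lvls_revalue() swaps in new labels position by position, keeping the order
relabelled <- lvls_revalue(f, cheesemonth)
levels(relabelled)               # "Jan" "Feb" "Mar" "Apr" "May" "Jun"

# Asking fct_relevel() for "Jan" etc. fails to match anything, because
# no level of f has those names; capture whatever message is raised
msg <- tryCatch(
  fct_relevel(f, cheesemonth),
  warning = function(w) conditionMessage(w),
  error   = function(e) conditionMessage(e)
)
msg
```

So fct_relevel() is the right tool once the labels are already month names and only their order needs fixing; before that point, the labels have to be replaced first.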

Upvotes: 1

jay.sf

Reputation: 73782

You have levels 1:6 and six labels in cheesemonth, which can be combined in factor() like so:

cheeses$mymonth <- factor(cheeses$mymonth, levels=1:6, labels=cheesemonth)
cheeses
#   mymonth Brie Stilton
# 1     Jan    4       2
# 2     Feb    4       1
# 3     Mar    1       3
# 4     Apr    1       5
# 5     May    2       4
# 6     Jun    3       1

This also works with pipes in base R,

cheeses |> transform(mymonth=factor(mymonth, levels=1:6, labels=cheesemonth))

or using dplyr.

library(magrittr)
cheeses %>% dplyr::mutate(mymonth=factor(mymonth, levels=1:6, labels=cheesemonth))
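
A small base R sketch of why spelling out levels=1:6 can matter. The vector partial is hypothetical data (not from the question) in which some months are missing; with explicit levels each number still maps to its own month, whereas without them the labels are matched to whatever distinct values happen to be present.

```r
cheesemonth <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun")
partial <- c(2, 5, 2)            # hypothetical data: only months 2 and 5 occur

# Explicit levels: 2 -> "Feb", 5 -> "May", and all six levels are kept
with_levels <- factor(partial, levels = 1:6, labels = cheesemonth)
as.character(with_levels)        # "Feb" "May" "Feb"
nlevels(with_levels)             # 6

# No levels: the two distinct values (2 and 5) become levels 1 and 2,
# so the first two labels are applied to them and the months are wrong
mislabelled <- factor(partial, labels = cheesemonth[1:2])
as.character(mislabelled)        # "Jan" "Feb" "Jan"
```

With the full six-row cheeses data both forms give the same result, so the explicit levels mainly guard against filtered or incomplete data.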

Data:

cheeses <- structure(list(mymonth = c(1, 2, 3, 4, 5, 6), Brie = c(4, 4, 
1, 1, 2, 3), Stilton = c(2, 1, 3, 5, 4, 1)), class = "data.frame", row.names = c(NA, 
-6L))

Upvotes: 0
