pluke

Reputation: 4346

dplyr mutate on dataframe .Label value, not reference

I have the following dataframe:

temp <- structure(list(ID = c("1234", "1223", "5555", "2344", "4567", "6543"), 
       Eat = structure(c(6L,1L, 5L, 2L, 3L, 4L), 
       .Label = c("", "Cabbage", "Carrot", "Lettuce", "Potato","Asparagus", "Mushroom", "Apple"), class = "factor")), 
      row.names = c(NA, 6L), class = "data.frame", .Names = c("ID", "Eat"))

I want to note each time there is nothing to Eat:

temp %>% mutate(Eat = ifelse(Eat != "" & !is.na(Eat), Eat, "Nothing!"))

However, the mutate operates on the underlying integer codes of the Eat factor instead of its labels:

    ID      Eat
1 1234        6
2 1223 Nothing!
3 5555        5
4 2344        2
5 4567        3
6 6543        4

How can I get the .Labels carried across to make:

    ID       Eat
1 1234 Asparagus
2 1223  Nothing!
3 5555    Potato
4 2344   Cabbage
5 4567    Carrot
6 6543   Lettuce

Upvotes: 1

Views: 772

Answers (2)

Uwe

Reputation: 42544

If it's not a requirement in your project, try to avoid factor. character vectors are much easier to handle and are stored as memory-efficiently as factor. I only use factor when it comes to plotting or when some specific sort order other than alphabetical is needed.

"... R has a global string pool. This means that each unique string is only stored in one place, and therefore character vectors take up less memory than you might expect" (Hadley Wickham, Advanced R)
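As a rough sketch of what that looks like for the question's data: convert Eat to character first, and the original ifelse works on the labels instead of the integer codes. The three-row temp below is a shortened stand-in for the question's dataframe, and the name result is only for illustration.

```r
library(dplyr)

# Shortened stand-in for the question's data; Eat is a factor on purpose
temp <- data.frame(
  ID  = c("1234", "1223", "5555"),
  Eat = factor(c("Asparagus", "", "Potato"))
)

result <- temp %>%
  mutate(
    Eat = as.character(Eat),  # drop the factor, keep the labels
    Eat = ifelse(Eat != "" & !is.na(Eat), Eat, "Nothing!")
  )

print(result)
```

Note that Eat is a plain character column afterwards, so any factor level order is lost.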

This was different in the past, which explains why coercion of strings to factor was the default in many functions (and still is in R versions before 4.0.0). There, you have to call read.csv or data.frame with the explicit parameter stringsAsFactors = FALSE to avoid this.
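A small illustration of the parameter, using made-up two-row data (df_chr and df_fct are illustrative names). Passing stringsAsFactors explicitly gives the same result regardless of the R version's default:

```r
# Strings stay character when coercion is switched off explicitly
df_chr <- data.frame(Eat = c("Potato", ""), stringsAsFactors = FALSE)

# Strings become a factor when coercion is switched on explicitly
df_fct <- data.frame(Eat = c("Potato", ""), stringsAsFactors = TRUE)

class(df_chr$Eat)
class(df_fct$Eat)
```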

Recent R packages like data.table or those from Hadley's tidyverse (tibble) never coerce inputs.

But if you need factor, you may follow @alistaire's advice and use Hadley's forcats package.

Upvotes: 1

alistaire

Reputation: 43334

The tidyverse way of changing a factor level is forcats::fct_recode, which maintains the factor type but changes any specified levels:

library(forcats)

temp %>% mutate(Eat = fct_recode(Eat, 'Nothing!' = ''))

##     ID       Eat
## 1 1234 Asparagus
## 2 1223  Nothing!
## 3 5555    Potato
## 4 2344   Cabbage
## 5 4567    Carrot
## 6 6543   Lettuce
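To check that the column really stays a factor, something like the following should do. It is a sketch on a shortened three-row stand-in for temp; recoded is just an illustrative name.

```r
library(dplyr)
library(forcats)

# Shortened stand-in for the question's data
temp <- data.frame(
  ID  = c("1234", "1223", "5555"),
  Eat = factor(c("Asparagus", "", "Potato"))
)

recoded <- temp %>% mutate(Eat = fct_recode(Eat, "Nothing!" = ""))

is.factor(recoded$Eat)     # the column is still a factor
levels(recoded$Eat)        # the "" level has been renamed
as.character(recoded$Eat)  # labels as they print
```

Only the empty-string level is renamed here; NA values, which the ifelse in the question also tested for, are left as NA.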

Upvotes: 2
