MarkPhil

Reputation: 1

Error in mutate() caused by error in case_when(): must be length X or 1, not Y

I am trying to create a new variable (var2) with value 1 when the value of var1 is P or Q, value 2 when the value of var1 is L or M, etc. Basically, a procedure with the form:

library(tidyverse)

#Create example data frame
df_raw <- data.frame(var1 = c("Never", "Always", "Often", "Rarely"))

#Create new variable based on values of var1
df_new <- df_raw %>% 
  mutate(var2 = case_when(
    var1 == "Never" ~ 1,
    . == "Always" ~ 3,
    (. == "Often" | . =="Rarely") ~ 2
  ))

The above works fine using the example data, of course. But when I try this same thing with my actual data, I get this:

Error in mutate():
! Problem while computing var2 = case_when(...).
✖ var2 must be size 20889 or 1, not 4741803.

I haven't been able to replicate the error with any other dataset (I've tried copying the var1 column from my data spreadsheet into a new spreadsheet and running the code on that, and it works fine).

Notably, the above code also works on my actual data when I only try to make one assignment, i.e., this:

df_new <- df_raw %>% 
  mutate(var2 = case_when(
    var1 == "Never" ~ 1
  ))

works fine. But this:

df_new <- df_raw %>% 
  mutate(var2 = case_when(
    var1 == "Never" ~ 1,
    . == "Always" ~ 3
  ))

gives the error, and so does this:

df_new <- df_raw %>% 
  mutate(var2 = case_when(
    (var1 == "Often" | . =="Rarely") ~ 2
  ))

I know it may be difficult to say without seeing my actual data, but does anyone have an idea what is causing this and how to fix it?

Upvotes: 0

Views: 3613

Answers (1)

akrun

Reputation: 887118

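The . refers to the whole data frame, not to var1. This can be checked with str (below). The length of a data.frame is its number of columns (1 here), and its structure differs from that of a single column. Comparing a data frame with a string, as in . == "Always", tests every cell and returns a logical matrix with rows × columns elements. In the one-column example that is 4 elements, the same as the number of rows, which is likely why the example ran without complaint. In the real data it is 4741803 = 20889 × 227, which is consistent with a data frame of 20889 rows and 227 columns and explains the size in the error message. case_when requires all conditions to have the same length (or length 1) and all right-hand sides to have a compatible type.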

> df_raw %>% mutate(var2 = {.}) %>% str
'data.frame':   4 obs. of  2 variables:
 $ var1: chr  "Never" "Always" "Often" "Rarely"
 $ var2:'data.frame':   4 obs. of  1 variable:
  ..$ var1: chr  "Never" "Always" "Often" "Rarely"
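
To see the size mismatch outside of mutate(), here is a minimal sketch in base R. It assumes a hypothetical two-column data frame; df_wide and var_other are invented for illustration and are not part of the question's data.

```r
# Hypothetical data frame with one extra column (var_other is made up)
df_wide <- data.frame(
  var1 = c("Never", "Always", "Often", "Rarely"),
  var_other = c("a", "b", "c", "d")
)

# Comparing the whole data frame tests every cell and returns a logical matrix
cmp_all <- df_wide == "Always"
dim(cmp_all)     # 4 2  (rows, columns)
length(cmp_all)  # 8    (rows * columns, not the number of rows)

# Comparing a single column returns one value per row
cmp_col <- df_wide$var1 == "Always"
length(cmp_col)  # 4

# The same arithmetic applied to the sizes in the error message
20889 * 227      # 4741803
```

The condition built from the whole data frame has one element per cell, so its length only matches the number of rows when the data frame has exactly one column.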

Instead, refer to var1 explicitly in every condition:

df_raw %>% 
  mutate(var2 = case_when(
    var1 == "Never" ~ 1,
    var1 == "Always" ~ 3,
    (var1 == "Often" | var1 == "Rarely") ~ 2
  ))

Output:

    var1 var2
1  Never    1
2 Always    3
3  Often    2
4 Rarely    2
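
As a possible variation, the two-value condition can be written with %in%, and an explicit fallback can be added for values that match none of the conditions. This is only a sketch on the example data from the question; the TRUE ~ NA_real_ default is an assumption about how unmatched values should be handled, not something the question specifies.

```r
library(dplyr)

# Example data from the question
df_raw <- data.frame(var1 = c("Never", "Always", "Often", "Rarely"))

df_new <- df_raw %>%
  mutate(var2 = case_when(
    var1 == "Never" ~ 1,
    var1 == "Always" ~ 3,
    var1 %in% c("Often", "Rarely") ~ 2,
    TRUE ~ NA_real_  # assumed default: unmatched values become NA
  ))

df_new
```

Every condition here is built from the var1 column, so each one has exactly one element per row.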

Upvotes: 1
