Eureka

Reputation: 25

Using case_when() within mutate() to create levels for a factor variable

I want to create a factor variable in which each punctuation mark is labeled appropriately and every other character is labeled "char":

char <- read.xlsx("ccp35.xlsx", sheet="CCP")
chars <- tbl_df(char)
chars$punc <- chars %>%
    mutate(punc = case_when(
        chars$Character =="," ~ "comma",
        chars$Character =="。"| "Character" =="?" ~ "stop"
        TRUE ~ "char"))

I tried the code without the TRUE ~ "char" line and it ran fine, with all ordinary characters labeled as NA.

But when I added the last line, there was an error:

Error: unexpected numeric constant in:
"chars$Character =="。"| "Character" =="?" ~ "stop"
TRUE"

Upvotes: 0

Views: 2524

Answers (2)

Konrad Rudolph

Reputation: 545865

There are several errors in your code:

  1. You forgot a comma after "stop", at the end of the second case_when() condition. That is what triggers the parse error.
  2. You accidentally put Character in quotes, so "Character" == "?" compares two string literals and is always FALSE; syntax highlighting offers a hint here.
  3. mutate() returns a tibble, so you should assign its result to, for instance, chars. Definitely not to chars$punc.
  4. While not an error, the chars$ prefixes in your code are redundant.
  5. I also suggest forgoing the intermediate variables with unclear names and using a pipeline for the complete expression instead.

This leaves us with:

chars <- read.xlsx("ccp35.xlsx", sheet="CCP") %>%
    as_tibble() %>%
    mutate(
        punc = case_when(
            Character == "," ~ "comma",
            Character == "。" | Character == "?" ~ "stop",
            TRUE ~ "char"
        )
    )

I also urge you to format code consistently, and to always put single spaces around infix operators (as done in my code).
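
To try the corrected expression without the spreadsheet, here is a minimal sketch on inline data. It assumes dplyr is installed. The six-row tibble is a made-up stand-in for the CCP sheet, and the wrapping factor() call with its level order is only an assumption about what the "factor variable" in the question should look like; case_when() on its own returns a character vector.

```r
library(dplyr)

# Hypothetical stand-in for the "CCP" sheet: one character per row.
chars <- tibble(Character = c("a", ",", "b", "。", "c", "?"))

chars <- chars %>%
    mutate(
        punc = factor(
            case_when(
                Character == "," ~ "comma",
                Character == "。" | Character == "?" ~ "stop",
                TRUE ~ "char"  # fallback for everything that is not punctuation
            ),
            # Assumed level order; adjust to whatever the analysis needs.
            levels = c("char", "comma", "stop")
        )
    )

print(chars)
```

Without the TRUE ~ "char" fallback, the rows that match neither condition come out as NA, which is the behaviour described in the question.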

Upvotes: 3

Diego Rojas

Reputation: 199

I don't have your data, but it seems that you forgot to add chars$ before "Character" =="?". Replace chars$Character =="。"| "Character" =="?" ~ "stop" with chars$Character =="。" | chars$Character =="?" ~ "stop" and see what happens.

Upvotes: -1
