Shubha

Reputation: 13

Grouping categorical variables in R

Suppose I have a variable named 'Fever' with four possible values: mild, moderate, severe and very severe. I want to combine mild and moderate into one group, and severe and very severe into another. How can I do this in R?

Any suggestions would be appreciated.

Upvotes: 1

Views: 5012

Answers (3)

rps1227

Reputation: 539

This can also be done in base R:

## If going from character to factor
fever_vec <- c("mild", "moderate", "severe", "very severe")
fever_fact <- factor(fever_vec,
                     levels = c("mild", "moderate", "severe", "very severe"),
                     labels = c("mild/moderate", "mild/moderate",
                                "severe/very severe", "severe/very severe"))

## If starting from an existing factor
fever_already_fact <- factor(c("mild", "moderate", "severe", "very severe"))
levels(fever_already_fact) <- list("mild/moderate" = c("mild", "moderate"),
                                   "severe/very severe" = c("severe", "very severe"))

Note that the first variant, which relies on duplicated labels, only works in R version >= 3.5.0.
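
As a minimal sketch of how the factor-based variant might be applied to a column of a data frame: the data frame name patients and the new column name Fever_grouped are hypothetical, chosen only for illustration; only the column name Fever comes from the question.

```r
# Hypothetical data frame; only the column name 'Fever' comes from the question.
patients <- data.frame(
  Fever = c("mild", "severe", "moderate", "very severe", "mild", "severe"),
  stringsAsFactors = TRUE
)

# Copy the column so the original stays intact, then merge its levels
# by assigning a named list (new level name = old levels to combine).
patients$Fever_grouped <- patients$Fever
levels(patients$Fever_grouped) <- list(
  "mild/moderate"      = c("mild", "moderate"),
  "severe/very severe" = c("severe", "very severe")
)

# Cross-tabulate old against new values to check the mapping.
table(patients$Fever, patients$Fever_grouped)
```

Keeping the original column alongside the grouped one makes it easy to check the mapping with table() before dropping anything.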

Upvotes: 1

Moritz Schwarz

Reputation: 2489

I think you're looking for something like this:

library(tidyverse)

df <- tibble(fever = c("mild", "moderate", "severe", "very severe"))

# highfever is 0 for mild/moderate and 1 for severe/very severe
newdf <- mutate(df, highfever = case_when(fever == "mild" | fever == "moderate" ~ 0,
                                          fever == "severe" | fever == "very severe" ~ 1))
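
If a labelled group is preferred over a 0/1 indicator, the same idea could be sketched with %in% and descriptive labels. The column name fever_group and the label strings are assumptions made for illustration, and the dplyr package is assumed to be installed.

```r
library(dplyr)

df <- tibble(fever = c("mild", "moderate", "severe", "very severe"))

newdf <- df %>%
  mutate(
    # Map each original value to a group label; unmatched values become NA.
    fever_group = case_when(
      fever %in% c("mild", "moderate")      ~ "mild/moderate",
      fever %in% c("severe", "very severe") ~ "severe/very severe"
    ),
    # Convert to a factor with an explicit level order.
    fever_group = factor(fever_group,
                         levels = c("mild/moderate", "severe/very severe"))
  )
```

Values that match neither condition end up as NA, which can help surface unexpected categories in the data.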

Upvotes: 0

AnilGoyal

Reputation: 26218

Vectors of this type are normally stored as factors.

library(forcats)

First, create a factor of fever values:


fever_lvl <- c("mild", "moderate", "severe", "very severe")
set.seed(1)
fevers <- factor(sample(fever_lvl, 10, replace = TRUE), levels = fever_lvl)

fevers
 [1] mild        very severe severe      mild        moderate    mild        severe     
 [8] severe      moderate    moderate   
Levels: mild moderate severe very severe

Then regroup the levels as desired:


fevers_regrouped <- fct_recode(fevers, mild_or_moderate = "mild", mild_or_moderate = "moderate",
                               severe_or_higher = "severe", severe_or_higher = "very severe")

fevers_regrouped
 [1] mild_or_moderate severe_or_higher severe_or_higher mild_or_moderate mild_or_moderate
 [6] mild_or_moderate severe_or_higher severe_or_higher mild_or_moderate mild_or_moderate
Levels: mild_or_moderate severe_or_higher

Alternatively, use fct_collapse:

fevers_regrouped2 <- fct_collapse(fevers, mild_or_mod = c("mild", "moderate"),
                                  severe_or_up = c("severe", "very severe"))
fevers_regrouped2
 [1] mild_or_mod  severe_or_up severe_or_up mild_or_mod  mild_or_mod  mild_or_mod  severe_or_up severe_or_up
 [9] mild_or_mod  mild_or_mod 
Levels: mild_or_mod severe_or_up

Upvotes: 1
