Reputation: 330
I am attempting to create a categorical (e.g., string) variable in R using values from a different variable, according to specific criteria.
My attempt does not seem to actually recode the data. It transforms the data somehow, but I am not sure in which way.
I have a data frame structured at the country-month unit of analysis. One of the variables is governance, which is continuous. It ranges from 0.750 to 4.333.
I am attempting to create a categorical variable in which I create labels for 4 different broad groups of governance.
Here is what I tried:
syndromes$syndrome_cat <- NA
syndromes$syndrome_cat[syndromes$governance <= 1.645] <- "Category 1"
syndromes$syndrome_cat[syndromes$governance >= 1.646 & syndromes$governance <= 2.541] <- "Category 2"
syndromes$syndrome_cat[syndromes$governance >= 2.542 & syndromes$governance <= 3.437] <- "Category 3"
syndromes$syndrome_cat[syndromes$governance >= 3.438] <- "Category 4"
Unfortunately, this does not result in listing the different values, but instead results in this:
summary(variable)
Length Class Mode
14256 character character
When I examine the data, I see this:
head(syndromes$governance)
[1] NA NA NA NA NA NA
What am I doing wrong?
Thank you in advance!
Upvotes: 1
Views: 70
Reputation: 23109
Just use cut (as @Rich Scriven also suggested). You can also change its default behavior by including/excluding the left/right sides of the intervals (see the right and include.lowest arguments):
syndromes$syndrome_cat <- cut(syndromes$governance, breaks=c(-Inf,1.645, 2.541, 3.437,Inf),
labels=paste('Category', 1:4))
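To see how the interval endpoints behave, here is a minimal sketch on a made-up vector (the values in x are hypothetical, chosen to sit on or near the breakpoints). By default cut uses right-closed intervals, so 1.645 lands in Category 1; with right = FALSE the intervals are left-closed and the same value moves to Category 2.

```r
# Hypothetical governance scores, two of them exactly on a breakpoint
x <- c(0.750, 1.645, 2.000, 3.437, 4.333, NA)
breaks <- c(-Inf, 1.645, 2.541, 3.437, Inf)
labs <- paste("Category", 1:4)

# Default: intervals are right-closed, i.e. (a, b]
default_cat <- cut(x, breaks = breaks, labels = labs)

# right = FALSE: intervals are left-closed, i.e. [a, b)
left_closed_cat <- cut(x, breaks = breaks, labels = labs, right = FALSE)

# cut() returns a factor, so table() and summary() report counts per level
print(table(default_cat, useNA = "ifany"))
print(as.character(left_closed_cat))
```

Because the result is a factor rather than a character vector, summary() on it lists the number of rows in each category instead of Length/Class/Mode.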
Upvotes: 2
Reputation: 13680
With dplyr:
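library(dplyr)

# Columns can be referenced by bare name inside mutate();
# >= on the last condition keeps a value of exactly 3.438 from being left as NA
mydf %>%
  mutate(group = case_when(governance < 1.646 ~ 'Cat1',
                           between(governance, 1.646, 2.541) ~ 'Cat2',
                           between(governance, 2.542, 3.437) ~ 'Cat3',
                           governance >= 3.438 ~ 'Cat4'))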
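A quick check of the case_when approach on made-up data may help; this is only a sketch, and the small mydf below is a hypothetical stand-in for the real data frame. It refers to columns by bare name and uses >= 3.438 for the top group so that a value of exactly 3.438 is classified. Rows where governance is NA match no condition and stay NA.

```r
library(dplyr)

# Hypothetical stand-in for the asker's data frame
mydf <- data.frame(governance = c(0.750, 1.646, 2.541, 3.000, 3.438, 4.333, NA))

result <- mydf %>%
  mutate(group = case_when(governance < 1.646 ~ "Cat1",
                           between(governance, 1.646, 2.541) ~ "Cat2",
                           between(governance, 2.542, 3.437) ~ "Cat3",
                           governance >= 3.438 ~ "Cat4"))

# group is a character column, so summary() would only show Length/Class/Mode;
# count() tabulates the rows per category instead
print(count(result, group))
```

If counts from summary() are wanted, converting the column with factor(result$group) is one option.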
Upvotes: 1