Reputation: 23
I am trying to simplify data analysis by combining levels of a categorical variable.
The variable has 6 levels plus missing values. Let's say the variable is called "candle" and the levels are: "Always", "Nearly always", "Sometimes", "Seldom", "Never", "Never used", NA
I want to regroup "Always" and "Nearly always" as "Yes", leave "Sometimes" as it is, and regroup "Seldom" and "Never" as "No".
I used:
data <- data %>%
  mutate(candle_new = ifelse(candle == "Always", "Yes",
                      ifelse(candle == "Nearly always", "Yes",
                      ifelse(candle == "Sometimes", "St",
                      ifelse(candle == "Never", "No",
                      ifelse(candle == "seldom", "No", NA))))))
Although it runs and does not show any error message, when I check the original data, it does not seem like it worked.
Could you help me to figure out what I did wrong?
Thank you!
Upvotes: 0
Views: 3189
Reputation: 1392
The car package has an elegant (IMO) recode function that works over multiple values.
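library(car)
yes.set <- c('Always', 'Nearly always')
no.set <- c('Seldom', 'Never', 'Never used')
# made up data; `candles` is the full set of levels from the question
candles <- c(yes.set, 'Sometimes', no.set)
data <- data.frame(vals = sample(candles, 50, replace = TRUE))
data$vals <- recode(data$vals, "yes.set='Yes'; no.set='No'")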
Anything that falls outside the desired sets can be set to NA using an else clause in the recode string. You'd have to include the "Sometimes" value explicitly first.
data$vals<-recode(data$vals,"yes.set='Yes'; no.set='No';'Sometimes'='Sometimes';else=NA")
Upvotes: 0
Reputation: 25385
I think instead of using ifelse, it would be more appropriate and legible to use match or left_join in this case.
So first we make a data.frame called match_df that looks as follows:
            old       new
1        Always       Yes
2 Nearly Always       Yes
3     Sometimes Sometimes
4        Seldom        No
5         Never        No
And then we look up the new values from that data.frame. We could do that with either a left_join or with match:
set.seed(2)
library(dplyr)
# the match dataframe
match_df = data.frame(old = c('Always','Nearly Always','Sometimes','Seldom','Never'),
new = c('Yes','Yes','Sometimes','No','No'))
# sample data
df = data.frame(candle = sample(match_df$old,12,TRUE))
# option 1, with match
df %>% mutate(candle_new = match_df$new[match(candle,match_df$old)])
# option 2, left_join
df %>% left_join(match_df,by=c('candle'='old')) %>% rename(candle_new=new)
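The same lookup idea can also be sketched in base R with a named character vector instead of a data.frame. This is only a sketch: the names lookup, candle and candle_new are made up for illustration, and the spellings follow the levels listed in the question, which is an assumption about the real data.

```r
# Named character vector: names are the old levels, values are the new labels.
# Spellings follow the levels listed in the question (an assumption about the real data).
lookup <- c(
  "Always"        = "Yes",
  "Nearly always" = "Yes",
  "Sometimes"     = "Sometimes",
  "Seldom"        = "No",
  "Never"         = "No"
)

# Small made-up sample, including a level missing from the lookup and an NA
candle <- c("Always", "Nearly always", "Sometimes", "Seldom",
            "Never", "Never used", NA)

# Index the lookup by name; unmatched levels and NA come back as NA
candle_new <- unname(lookup[candle])
```

If candle is stored as a factor, wrapping it in as.character() before indexing is probably needed, since indexing by a factor uses its integer codes rather than its labels.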
Hope this helps!
Upvotes: 1
Reputation: 20095
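I can see it working with the data below. Note that the last condition here compares against "Seldom" with a capital "S", which matches the level in the data. See the data and result.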
data <- data.frame(id = 1:7, candle = c("Always", "Nearly always", "Sometimes", "Seldom", "Never", "Never used", NA))
library(dplyr)
data <- data %>%
mutate(candle_new = ifelse(candle == "Always","Yes",
ifelse(candle == "Nearly always", "Yes",
ifelse(candle == "Sometimes", "St",
ifelse(candle == "Never", "No", ifelse(candle == "Seldom", "No", NA))))))
data
#   id        candle candle_new
# 1  1        Always        Yes
# 2  2 Nearly always        Yes
# 3  3     Sometimes         St
# 4  4        Seldom         No
# 5  5         Never         No
# 6  6    Never used       <NA>
# 7  7          <NA>       <NA>
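As a possible alternative to the nested ifelse calls, the same regrouping could be written with dplyr's case_when, which keeps one condition per line. This is a sketch rather than the original poster's method; it keeps "Sometimes" unchanged, as the question asks, instead of the "St" label used in the code above.

```r
library(dplyr)

# Same made-up data as above
data <- data.frame(
  id = 1:7,
  candle = c("Always", "Nearly always", "Sometimes", "Seldom",
             "Never", "Never used", NA)
)

data <- data %>%
  mutate(candle_new = case_when(
    candle %in% c("Always", "Nearly always") ~ "Yes",
    candle == "Sometimes"                    ~ "Sometimes",
    candle %in% c("Seldom", "Never")         ~ "No",
    # everything else ("Never used", NA) becomes a missing character value
    TRUE                                     ~ NA_character_
  ))
```

Grouping levels with %in% means each spelling appears once per group, which makes a case mismatch easier to spot than in a chain of five ifelse calls.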
Upvotes: 0
Reputation: 145
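There's not enough information, but could it be that "seldom" in your nested ifelse has a lower-case "s"? String comparison with == is case-sensitive, so "Seldom" in the data never equals "seldom", and those rows fall through to the final NA.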
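A quick way to check this is to compare the literal against the actual values. The vector below is made up for illustration; on the real data, unique(data$candle) would show the exact spellings.

```r
# Made-up values standing in for the real column
candle <- c("Seldom", "Never", "Always")

# Comparison is case-sensitive, so the lower-case literal never matches
hits_lower <- candle == "seldom"

# Normalising case before comparing (one possible workaround) makes it match
hits_norm <- tolower(candle) == "seldom"

# unique() lists the exact spellings to compare against
spellings <- unique(candle)
```

Fixing the literal to "Seldom" is the simpler change; lower-casing the column first is only worth it if the data mixes capitalisations.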
Upvotes: 0