Reputation: 4645
With the following sample data I'm trying to create a new variable, category, based on the values of three columns (type, addict, and sex).
I would like to combine type and addict into one group and sex into another group.
So I use any to reduce each set of logical vectors to a single answer: is at least one of the values true?
df <- data.frame(type = c(NA, "bad",NA), addict=c('visky','wine',NA),
sex=c(NA,'male',NA))
> df
type addict sex
1 <NA> visky <NA>
2 bad wine male
3 <NA> <NA> <NA>
library(dplyr)
df%>%
mutate(category=ifelse(any(is.na(type)&addict=="visky")&any(is.na(sex)),"categ1",
ifelse(any(type=="bad"|addict=="wine")&any(!is.na(sex)),"categ2",
ifelse(any(is.na(type)&is.na(addict))&any(is.na(sex)),"categ3",NA))))
type addict sex category
1 <NA> visky <NA> categ1
2 bad wine male categ1
3 <NA> <NA> <NA> categ1
As can be seen, my nested ifelse is not working correctly, and I cannot figure out why.
The expected output:
type addict sex category
1 <NA> visky <NA> categ1
2 bad wine male categ2
3 <NA> <NA> <NA> categ3
Thanks in advance.
One more thing: if I wanted to write a user-defined function to do the same operation without using case_when, I would probably use
categ <- function(type,addict,sex){
if (any(is.na(type)&addict=="visky"&is.na(sex))){
"categ1"
}
else{
NA
}
}
but this also gives the wrong result:
df%>%
mutate(category=categ(type,addict,sex))
type addict sex category
1 <NA> visky <NA> categ1
2 bad wine male categ1
3 <NA> <NA> <NA> categ1
Upvotes: 2
Views: 3053
Reputation: 887891
In the OP's input dataset, all the columns were factor, and the NAs were the string "NA". Also, the OP's code checks the entire column with any, which returns a single TRUE/FALSE that gets recycled across every row, which is not the intended output. If we change the columns to character class and the strings to real NAs, we can use case_when:
df %>%
mutate(category = case_when(
is.na(type) & addict %in% "visky" & is.na(sex) ~ "categ1",
(type %in% "bad" | addict %in% "wine") & !is.na(sex) ~ "categ2",
is.na(type) & is.na(addict) & is.na(sex) ~ "categ3",
TRUE ~ NA_character_))
# type addict sex category
#1 <NA> visky <NA> categ1
#2 bad wine male categ2
#3 <NA> <NA> <NA> categ3
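To see why the any()-based condition in the question labels every row the same, it may help to compare the row-wise condition with its collapsed version. This is only a small base R sketch on the question's sample data; the names row_wise and collapsed are made up for illustration.

```r
# Same sample data as in the question, with character columns.
df <- data.frame(type = c(NA, "bad", NA),
                 addict = c("visky", "wine", NA),
                 sex = c(NA, "male", NA),
                 stringsAsFactors = FALSE)

# Row-wise condition: one logical value per row.
# %in% is used so that a missing addict gives FALSE rather than NA.
row_wise <- is.na(df$type) & df$addict %in% "visky" & is.na(df$sex)

# any() collapses that vector to a single value for the whole column.
collapsed <- any(row_wise)

length(row_wise)   # 3: one answer per row
length(collapsed)  # 1: one answer for the whole column

# ifelse() returns something as long as its test, so a length-1 test
# yields a length-1 label, which mutate() then recycles to every row.
label <- ifelse(collapsed, "categ1", "other")
length(label)      # 1
```

Dropping any() keeps the condition at one value per row, which is what case_when (or ifelse) needs.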
NOTE: Here, we use %in% instead of == because == returns NA for NA elements while %in% returns FALSE. We could still use == in combination with is.na.
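The difference between the two operators can be checked on a small vector. This is an illustrative sketch; the vector x and the result names are assumptions, not part of the original code.

```r
# A character vector with one missing value.
x <- c("visky", "wine", NA)

# == propagates the missing value.
eq <- x == "visky"                     # TRUE FALSE NA

# %in% never returns NA; a missing element simply does not match.
inn <- x %in% "visky"                  # TRUE FALSE FALSE

# == guarded by is.na() gives the same result as %in% here,
# because FALSE & NA evaluates to FALSE.
eq_guard <- !is.na(x) & x == "visky"   # TRUE FALSE FALSE
```

This matters mostly with ifelse, where an NA test produces an NA result for that row.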
Based on the OP's comments, we could create a custom function (note that this is a different function: because of if and any, it returns a single value for the whole column):
categFn <- function(typeCol, addictCol, sexCol) {
if(any(is.na(typeCol) & addictCol== "visky") & any(is.na(sexCol))) {
"categ1"
} else NA
}
df %>%
mutate(categ = categFn(type, addict, sex))
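If the goal is a user-defined function that reproduces the expected output without case_when, one possible approach is to keep every condition row-wise and fill a result vector by logical indexing. The sketch below is base R only; the function name categ_vec is hypothetical, and the "categ2" rule follows the condition written in the question (type is "bad" or addict is "wine").

```r
# Hypothetical vectorised helper: returns one label per row.
categ_vec <- function(type, addict, sex) {
  # Start with NA for rows that match no rule.
  out <- rep(NA_character_, length(type))

  # Row-wise conditions; %in% avoids NA in the comparisons.
  is_c1 <- is.na(type) & addict %in% "visky" & is.na(sex)
  is_c2 <- (type %in% "bad" | addict %in% "wine") & !is.na(sex)
  is_c3 <- is.na(type) & is.na(addict) & is.na(sex)

  # Assign in reverse priority so earlier categories win on overlap.
  out[is_c3] <- "categ3"
  out[is_c2] <- "categ2"
  out[is_c1] <- "categ1"
  out
}

df <- data.frame(type = c(NA, "bad", NA),
                 addict = c("visky", "wine", NA),
                 sex = c(NA, "male", NA),
                 stringsAsFactors = FALSE)

df$category <- categ_vec(df$type, df$addict, df$sex)
df$category
```

Because the function returns a vector as long as its inputs, it should also work inside mutate, for example mutate(category = categ_vec(type, addict, sex)).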
The data used above, created with stringsAsFactors = FALSE so that the columns are character:
df <- data.frame(type = c(NA, "bad",NA), addict=c('visky','wine',NA),
sex=c(NA,'male',NA), stringsAsFactors = FALSE)
Upvotes: 1