neves

Reputation: 846

`dplyr::case_when` doesn't give me correct results

case_when doesn't produce the expected results:

My list:

library(tidyverse)

1:6%>%
  str_c('var',.)%>%
  map(~assign(.,runif(30,20,100),envir=globalenv()))

tibble<-as_tibble(
  bind_cols(mget(ls(pattern='^var[0-9]+$')))
)

cluster<-kmeans(tibble,centers=3)
cluster
tibble$kmeans<-as.factor(cluster[['cluster']])

mylist<-split(tibble,tibble$kmeans)
names(mylist)<-str_c('dataset',seq_along(mylist))

My code:

variables<-str_c('var',1:6)

mylist%>%
  map(~mutate_at(.,.vars=vars(variables),
              .funs=funs(.=case_when(
                .%in%c(1:50)~'less',
                .%in%c(51:100)~'more'
              ))))

The output has NAs in the new variables instead of 'less' or 'more'. What's wrong with this code?

Upvotes: 2

Views: 153

Answers (2)

zx8754

Reputation: 56159

Maybe use ifelse:

cbind(tibble, ifelse(tibble[ , variables] <= 50, "less", "more"))
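One thing to watch for with this approach: the matrix returned by ifelse keeps the original column names, so the bound result ends up with two sets of columns called var1, var2, and so on. A minimal sketch of one way around that, using a small made-up data frame (df and the "_label" suffix are placeholders for illustration, not part of the question):

```r
# Toy data standing in for the question's tibble (hypothetical values).
set.seed(1)
df <- data.frame(var1 = runif(5, 20, 100), var2 = runif(5, 20, 100))
variables <- c("var1", "var2")

# Comparing a data frame with a scalar gives a logical matrix;
# ifelse() keeps that shape and the column names.
labels <- ifelse(df[, variables] <= 50, "less", "more")

# Suffix the names so they do not clash with the numeric columns.
colnames(labels) <- paste0(variables, "_label")

result <- cbind(df, labels)
```

Here result has the numeric columns followed by var1_label and var2_label.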

Upvotes: 1

joran

Reputation: 173577

Maybe you meant something more like this:

mylist %>%
  map(~mutate_at(.,.vars=vars(starts_with("var")),
                 .funs=funs(.=case_when(
                   . <= 50 ~ 'less',
                   . > 50 ~ 'more'
                 ))))

but this is still very awkward, with badly named variables. There's really no need to split the data into a list first; that makes everything much more complicated than it needs to be. Typically, things will be easier to work with if you keep the groups together and just reshape:

tibble %>%
  gather(key = "var",value = "val",var1:var6) %>%
  mutate(x = case_when(val <= 50 ~ "less",
                       val > 50 ~ "more"))
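As for why the original version produced NAs: %in% tests for exact matches against the whole numbers 1 to 100, and runif returns non-integer doubles, so neither condition is ever TRUE and case_when falls through to NA. A small sketch with made-up values (x is just an illustration vector, not from the question):

```r
library(dplyr)

# Made-up values; only 50 is a whole number.
x <- c(20.5, 49.99, 50, 75.3)

# Exact membership test: only the whole number matches.
in_low <- x %in% 1:50
# FALSE FALSE TRUE FALSE

# No condition is TRUE for the non-integer values, so they become NA.
by_membership <- case_when(x %in% 1:50 ~ "less",
                           x %in% 51:100 ~ "more")
# NA NA "less" NA

# Range comparisons cover every value.
by_range <- case_when(x <= 50 ~ "less",
                      x > 50 ~ "more")
# "less" "less" "less" "more"
```

That's why switching to <= and > fixes it.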

Upvotes: 3
