ibm

Reputation: 874

Calculating each factor level's sd() for a variable

My coauthor has asked me to report the sd for each factor variable that has more than two levels, but sd(as.numeric(df$factor)) returns a single value instead of one sd per level. I thought purrr::map could handle it, but df %>% select(factor) %>% as.numeric %>% map(~ sd(.)) fails with Error in function_list[[i]](value) : 'list' object cannot be coerced to type 'double', even though df is not a list.

Upvotes: 0

Views: 1059

Answers (1)

akrun

Reputation: 887531

If you need the sd for each level of the factor column, use that column as the grouping variable:

library(dplyr)
df %>%
    group_by(factor) %>%
    summarise(SD = sd(anothercolumn, na.rm = TRUE))
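
As a minimal sketch of the grouping approach on made-up data: the data frame and the column names grp and value are hypothetical stand-ins for the factor column and anothercolumn in the question. The tapply() call at the end is a base R alternative that should give the same numbers.

```r
library(dplyr)

# Hypothetical toy data: `grp` stands in for the factor column and
# `value` for the numeric column (`anothercolumn` above).
df <- data.frame(
  grp   = factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c")),
  value = c(1, 2, 3, 2, 4, 6, 5, 5, NA)
)

# One row per factor level, each holding the sd of `value` within that level.
by_level <- df %>%
  group_by(grp) %>%
  summarise(SD = sd(value, na.rm = TRUE))

print(by_level)

# Base R alternative: a named result with one sd per level.
base_sd <- tapply(df$value, df$grp, sd, na.rm = TRUE)
print(base_sd)
```

With na.rm = TRUE the missing value in level "c" is dropped before the sd is computed.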

Going by the description, if you instead need a single sd for each factor variable that has more than two levels:

df %>%
    summarise(across(where(~ is.factor(.) && nlevels(.) > 2),
        ~ sd(as.numeric(.))))
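
A small sketch of the across() version on made-up data, assuming dplyr 1.0.0 or later. The columns sex, size and age are hypothetical and only illustrate which columns the where() predicate picks up.

```r
library(dplyr)

# Hypothetical toy data: `sex` has two levels, `size` has three,
# and `age` is not a factor at all.
df <- data.frame(
  sex  = factor(c("f", "m", "f", "m")),
  size = factor(c("small", "medium", "large", "small"),
                levels = c("small", "medium", "large")),
  age  = c(21, 35, 42, 58)
)

# Only `size` passes the predicate. as.numeric() converts it to its
# integer level codes (1, 2, 3, 1) before sd() is applied.
multi_sd <- df %>%
  summarise(across(where(~ is.factor(.) && nlevels(.) > 2),
                   ~ sd(as.numeric(.))))

print(multi_sd)
```

Note that this is the sd of the integer level codes, so the value depends on how the levels are ordered.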

Upvotes: 1
