NewBee

Reputation: 1040

Survey data in R: frequencies of responses not selected

I was wondering if anyone has a solution to the following problem when cleaning survey data in R.

Let’s say a survey has Q1 “What is your gender?” with the options Male, Female, and Prefer not to say. No one selects “Prefer not to say”, so when I run the frequencies I only see:

Q1 Male: 8, Female: 8.

Is there a way to code “Prefer not to say” into Q1 so that when I run the frequencies I see:

Q1 Male: 8, Female: 8, Prefer not to say: 0.

Here is some sample data & code:

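library(readr)
library(dplyr)
library(forcats)

# read_table2() is deprecated since readr 2.0; read_table() is the current equivalent
dat_in <- read_table2("ID    Gender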
1   1
2   1
3   1
4   1
5   1
6   2
7   2
8   2
9   2
10  2
11  2
12  2
13  1
14  2
15  1
16  2
")
    
data_cat <- dat_in %>%
  mutate_if(is.numeric, as.character) %>%
  mutate(across(matches("Gender"), ~ fct_recode(., "Female" = "1", "Male" = "2")))

lapply(select_if(data_cat, is.factor),
       function(x) {
           df = data.frame(table(x))
           return(df)
       })

Upvotes: 0

Views: 86

Answers (1)

akrun

Reputation: 887048

Convert the column to a factor with all of the levels specified explicitly. table() then reports a count of 0 for any level that has no observations.

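# Use the recoded column (data_cat$Gender); dat_in$Gender still holds the numeric codes 1/2
table(factor(data_cat$Gender, levels = c("Male", "Female", "Prefer not to say")))

-output (counts for the sample data in the question)

             Male            Female Prefer not to say 
                9                 7                 0 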
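
To see why this works, here is a minimal base R sketch on a made-up vector. The `responses` and `q1_levels` names are illustrative and not taken from the question. The point it tries to show is that table() counts every level a factor declares, not only the values that actually occur.

```r
# Hypothetical responses: nobody chose "Prefer not to say"
responses <- c("Male", "Female", "Female", "Male", "Female")

# All options offered by the (hypothetical) survey question, in display order
q1_levels <- c("Male", "Female", "Prefer not to say")

# Without explicit levels, only the observed values are counted
observed_only <- table(responses)

# With explicit levels, unobserved options appear with a count of 0
all_options <- table(factor(responses, levels = q1_levels))

print(observed_only)
print(all_options)

# The same counts as a data frame, similar to the lapply() call in the question
freq_df <- as.data.frame(all_options)
print(freq_df)
```

The order of `q1_levels` also controls the order of the columns in the printed table.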

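If there are many character or factor variables, loop over those columns and add "Prefer not to say" as a new level to each. Note that this touches every character or factor column, so subset the columns first if some of them (an ID, for example) should be left alone.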

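# Flag the character/factor columns of the recoded data
i1 <- sapply(data_cat, function(x) is.character(x) || is.factor(x))
data_cat[i1] <- lapply(data_cat[i1], function(x) {
  if (is.factor(x)) {
    # Existing factor: append the new level (union() avoids a duplicate level)
    levels(x) <- union(levels(x), "Prefer not to say")
  } else {
    # Character column: observed values plus the new level
    x <- factor(x, levels = union(unique(x), "Prefer not to say"))
  }
  x
})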
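
When the questions do not all share the same answer options, one possible variation on the loop above is to keep the full option list per question in a named list and apply it column by column. This is only a sketch: the `survey` data frame, the `Q2` column, and the `codebook` list are hypothetical and not part of the question.

```r
# Hypothetical survey data with two categorical questions
survey <- data.frame(
  Q1 = c("Male", "Female", "Female", "Male"),
  Q2 = c("Yes", "Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Hypothetical codebook: every option each question offered
codebook <- list(
  Q1 = c("Male", "Female", "Prefer not to say"),
  Q2 = c("Yes", "No", "Don't know")
)

# Convert each listed column to a factor with its full set of levels
for (q in names(codebook)) {
  survey[[q]] <- factor(survey[[q]], levels = codebook[[q]])
}

# One frequency table per question, zero counts included
freqs <- lapply(survey[names(codebook)], function(x) as.data.frame(table(x)))
print(freqs)
```

One thing to watch with this approach: any response that is not listed in the codebook becomes NA when the factor is built, so a typo in the codebook silently drops that category from the counts.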

Or, with the tidyverse, the same thing can be done with fct_expand() from forcats:

library(dplyr)
library(forcats)
# Add the level to every factor/character column (character columns become factors)
data_cat <- data_cat %>%
  mutate(across(where(~ is.factor(.) || is.character(.)),
                ~ fct_expand(., "Prefer not to say")))
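
As a quick check that the expanded level shows up in a tidyverse frequency table, here is a small sketch on made-up data (the `survey` tibble and its `Q1` column are hypothetical). It assumes a dplyr version in which count() accepts `.drop = FALSE`, which keeps factor levels that have no rows.

```r
library(dplyr)
library(forcats)

# Hypothetical responses: nobody chose "Prefer not to say"
survey <- tibble(Q1 = c("Male", "Female", "Female", "Male"))

# Add the unused option as an extra factor level
survey <- survey %>%
  mutate(Q1 = fct_expand(Q1, "Prefer not to say"))

# .drop = FALSE keeps levels with zero observations in the result
freq <- survey %>%
  count(Q1, .drop = FALSE)

print(freq)
```

Without `.drop = FALSE`, count() would list only the levels that actually occur, which is the behaviour described in the question.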

Upvotes: 2
