Lieke

Reputation: 1

Creating a frequency table of multiple columns of a data frame

I have data from a questionnaire in which people scored multiple questions on a 5-point scale. The result is a data frame that looks like this:

Q5_1 <- c("completely agree", "agree a little", "completely agree", "completely agree")
Q5_2 <- c("agree a little", "do not agree or disagree", "agree a little", "completely agree")
Q5_3 <- c("do not agree or disagree","do not agree or disagree","do not agree or disagree","do not agree or disagree")
df <- data.frame(Q5_1, Q5_2, Q5_3)

Now I want to make a frequency table that shows the counts for two of the columns and, ideally, also includes the answer options that were never chosen. It would have to look like this:

                              Q5_1        Q5_3
completely disagree           0           0
disagree a little             0           0
do not agree or disagree      0           4
agree a little                1           0
completely agree              3           0

I tried

table(df$Q5_1, df$Q5_3)

but this resulted in

                         do not agree or disagree
agree a little           1
completely agree         3

I did manage to get something that looks like the table I want, but as a data frame. For the statistical testing I want to do, I need it to be a (frequency) table.

Upvotes: 0

Views: 80

Answers (1)

akrun

Reputation: 887118

If we need to include the answer options that never occur, convert the columns to factors with the levels specified explicitly

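# All possible answers, in scale order
lvls <- c("completely disagree", "disagree a little",
    "do not agree or disagree", "agree a little", "completely agree")
# Convert both columns to factors that carry the full set of levels
df[c("Q5_1", "Q5_3")] <- lapply(df[c("Q5_1", "Q5_3")], factor, levels = lvls)
# For each level, count how often it occurs in each column
t(sapply(lvls, function(x) colSums(df[c("Q5_1", "Q5_3")] == x)))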
                         Q5_1 Q5_3
completely disagree         0    0
disagree a little           0    0
do not agree or disagree    0    4
agree a little              1    0
completely agree            3    0
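
The result above is a plain matrix. If an object of class "table" is needed, as the question mentions, one possible sketch in base R is to tabulate each column against the full set of levels and wrap the result in as.table(). The data frame is rebuilt here from the question's example so the snippet runs on its own; the names counts and freq are only illustrative.

```r
# Rebuild the example data from the question (two of the three columns)
df <- data.frame(
  Q5_1 = c("completely agree", "agree a little", "completely agree", "completely agree"),
  Q5_3 = rep("do not agree or disagree", 4)
)

# All possible answers, in scale order
lvls <- c("completely disagree", "disagree a little",
          "do not agree or disagree", "agree a little", "completely agree")

# Tabulate each column against the full set of levels,
# so answers that never occur are counted as 0
counts <- sapply(df[c("Q5_1", "Q5_3")],
                 function(col) table(factor(col, levels = lvls)))

# Turn the integer matrix into an object of class "table"
freq <- as.table(counts)
class(freq)
freq
```

Whether a table full of zero rows suits a particular statistical test is a separate question; this only changes the class of the object.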

Or we may use count() from dplyr after converting the columns to factors

library(dplyr)
library(tidyr)
df %>% 
  select(c(Q5_1, Q5_3)) %>%
  # lambda form: passing extra arguments through across() is deprecated in dplyr >= 1.1.0
  mutate(across(c(Q5_1, Q5_3), ~ factor(.x, levels = lvls))) %>%
  pivot_longer(cols = c(Q5_1, Q5_3)) %>%
  count(name, value, .drop = FALSE) %>%
  pivot_wider(names_from = name, values_from = n)
# A tibble: 5 × 3
  value                     Q5_1  Q5_3
  <fct>                    <int> <int>
1 completely disagree          0     0
2 disagree a little            0     0
3 do not agree or disagree     0     4
4 agree a little               1     0
5 completely agree             3     0

Upvotes: 0
