John Keya

Reputation: 49

How can I clean, describe, perform descriptive analysis and visualize Likert questions in R?

  1. I would like help manipulating, describing, summarizing and visualizing Likert questions in R.

  2. Here is the dataset I am using: https://docs.google.com/spreadsheets/d/1Kje8K4Ow_Io4wdMikntO1vB-g12fLzJK5fPEBYRhIFE/edit?usp=sharing

  3. The Likert questions are on a scale of 1 to 5, where 1 = Strongly Disagree, 2 = Disagree, 3 = Moderately Agree, 4 = Agree and 5 = Strongly Agree.

  4. From the data, I am interested in Columns 11, 12, 13 and 14

  5. I would like to summarize columns 11, 12, 13 and 14 with the count and percentage for each point on the scale, and to calculate the total, mean and standard deviation for each column.

  6. Here is an example of the expected data output: Expected output

  7. Create a Likert plot for the data

I am struggling to produce the output, specifically the descriptive statistics.

A step-by-step guide would really help.

Upvotes: 3

Views: 379

Answers (2)

neilfws

Reputation: 33812

I read your data into R using googlesheets4:

library(googlesheets4)
dataset <- read_sheet("1Kje8K4Ow_Io4wdMikntO1vB-g12fLzJK5fPEBYRhIFE")

We can generate a table somewhat like your example by using dplyr and tidyr to select the columns, pivot the data to a long form, and then group on the items to perform the summary calculations.

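Because the data has already been reduced to one count per response value at that point, we use weighted.mean for the mean, and wtd.var from the Hmisc package to get the weighted standard deviation, with the counts as weights.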

library(dplyr)
library(tidyr)

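# Keep the four Likert columns and reshape to one row per response
dataset_sum <- dataset %>% 
  select(11:14) %>% 
  pivot_longer(everything()) %>% 
  # Count responses per item (name) and scale value
  group_by(name, value) %>% 
  summarise(Count = n(), .groups = "drop") %>% 
  # Per-item percentage, weighted mean/SD and total number of responses
  group_by(name) %>% 
  mutate(`%` = 100 * (Count / sum(Count)), 
         wMean = weighted.mean(value, Count), 
         wSD = sqrt(Hmisc::wtd.var(value, Count)), 
         Total = sum(Count)) %>% 
  ungroup() %>%
  # Back to one row per item; names_vary needs tidyr >= 1.2.0
  pivot_wider(names_from = "value", 
              names_sep = " ", 
              values_from = c("Count", "%"), 
              names_vary = "slowest")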

Result:

# A tibble: 4 × 14
  name                                                                               wMean   wSD Total `Count 1` `% 1` `Count 2` `% 2` `Count 3` `% 3` `Count 4` `% 4` `Count 5` `% 5`
  <chr>                                                                             <dbl> <dbl> <int>     <int> <dbl>     <int> <dbl>     <int> <dbl>     <int> <dbl>     <int> <dbl>
1 Data from ministries or affiliated government agencies is easily accessible onli…  3.63  1.08   180         8  4.44        18 10           48  26.7        65  36.1        41  22.8
2 Data from ministries or government affiliated agencies in Rwanda is publicly ava…  3.68  1.07   180         8  4.44        15  8.33        47  26.1        67  37.2        43  23.9
3 Datasets available on Rwanda's open data platforms are free for anyone to access…  3.36  1.26   180        20 11.1         23 12.8         48  26.7        50  27.8        39  21.7
4 Datasets on Rwanda's open data platforms cover all your areas of service provisi…  3.24  1.10   180        15  8.33        21 11.7         74  41.1        45  25          25  13.9
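
As a sanity check on the weighted statistics, here is a minimal base R sketch using the counts from the first row of the result above (8, 18, 48, 65, 41). It assumes the counts act as frequency weights with an n - 1 denominator, which is how I understand the wtd.var defaults. The variable names are just for illustration.

```r
# Counts for responses 1..5, copied from the first row of the result table
value <- 1:5
count <- c(8, 18, 48, 65, 41)

# Weighted mean: each scale value weighted by how often it was chosen
w_mean <- weighted.mean(value, count)

# Weighted SD, treating counts as frequencies (denominator n - 1)
w_sd <- sqrt(sum(count * (value - w_mean)^2) / (sum(count) - 1))

# The same statistics computed from the expanded raw responses
raw <- rep(value, times = count)
round(c(w_mean = w_mean, raw_mean = mean(raw), w_sd = w_sd, raw_sd = sd(raw)), 2)
```

Both routes should agree with the 3.63 and 1.08 shown in the table, so the weighted calculation is equivalent to taking mean() and sd() of the original column.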

For Likert analysis we can use the likert package.

library(likert)

First we need to convert the four columns to factor variables. Important: likert needs a plain data frame, not a tibble, to work:

dataset_f <- dataset %>% 
  select(11:14) %>% 
  mutate(across(everything(), ~factor(.x, ordered = TRUE, levels = as.character(1:5)))) %>%
  as.data.frame()

dataset_lik <- likert(dataset_f)
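
If you want the response labels from the question rather than the numbers 1 to 5 in the output, factor() also takes a labels argument. A small sketch on made-up toy responses (not the survey data); passing the same labels to the factor() call above should work the same way, though I have not run that against the full sheet.

```r
# Hypothetical toy responses on the 1-5 scale (not the real survey data)
responses <- c(3, 5, 2, 4, 2, 3, 5, 2, 3, 4)

# Labels as defined in the question
scale_labels <- c("Strongly Disagree", "Disagree", "Moderately Agree",
                  "Agree", "Strongly Agree")

# Ordered factor: levels fix the order, labels replace the numeric codes
responses_f <- factor(responses, levels = 1:5, labels = scale_labels,
                      ordered = TRUE)

levels(responses_f)
table(responses_f)
```

Setting levels explicitly keeps categories that nobody chose (here "Strongly Disagree") in the table with a count of zero.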

The summary function gives us something similar to the previous summarization:

summary(dataset_lik)

  Item                                                                                                                                 low neutral  high  mean    sd
1 Data from ministries or government affiliated agencies in Rwanda is publicly available online or in digital formats                 12.8    26.1  61.1  3.68  1.07
2 Data from ministries or affiliated government agencies is easily accessible online or in digital formats and quick to find and use  14.4    26.7  58.9  3.63  1.08
3 Datasets available on Rwanda's open data platforms are free for anyone to access, use and share it                                  23.9    26.7  49.4  3.36  1.26
4 Datasets on Rwanda's open data platforms cover all your areas of service provision/mandate                                          20      41.1  38.9  3.24  1.10
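
The low, neutral and high columns appear to be the percentage of responses below, at and above the middle of the scale. A quick base R check, using the counts for the "easily accessible" item from the earlier table (this is my reading of the output, not something taken from the likert documentation):

```r
# Counts for responses 1..5 for the "easily accessible" item
count <- c(8, 18, 48, 65, 41)
pct <- 100 * count / sum(count)

# low = responses 1-2, neutral = 3, high = 4-5
bands <- c(low = sum(pct[1:2]), neutral = pct[3], high = sum(pct[4:5]))
round(bands, 1)
```

These match the 14.4, 26.7 and 58.9 reported for that item in the summary.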

And we can also plot the likert object:

plot(dataset_lik)

Upvotes: 4

Jay Bee

Reputation: 582

I tried the first few parts of this as a learning exercise. With the caveat that I am also a beginner, I hope this helps a bit.

library(tidyverse)

I created a toy dataset from the first 10 rows of your data.

df <- data.frame(
  Familiarity = c(3, 5, 2, 4, 2, 3, 5, 2, 3, 4),
  Accessibility = c(3, 5, 3, 4, 2, 3, 4, 2, 2, 4),
  EaseOfUse = c(3, 4, 2, 3, 2, 3, 3, 2, 2, 4),
  ReleaseSystematic = c(4, 5, 3, 4, 2, 4, 4, 4, 4, 4))

Then gave each of the variables factor levels.

df2 <- df %>%
  mutate(Familiarity = factor(Familiarity, levels = c(1:5)),
         Accessibility = factor(Accessibility, levels = c(1:5)),
         EaseOfUse = factor(EaseOfUse, levels = c(1:5)),
         ReleaseSystematic = factor(ReleaseSystematic, levels = c(1:5)))
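
A possible shortcut, sketched in base R on the same toy data: since every column gets the same levels, lapply() can convert them all in one go instead of naming each column.

```r
# Same toy data as above
df <- data.frame(
  Familiarity = c(3, 5, 2, 4, 2, 3, 5, 2, 3, 4),
  Accessibility = c(3, 5, 3, 4, 2, 3, 4, 2, 2, 4),
  EaseOfUse = c(3, 4, 2, 3, 2, 3, 3, 2, 2, 4),
  ReleaseSystematic = c(4, 5, 3, 4, 2, 4, 4, 4, 4, 4))

# Convert every column to a factor with levels 1..5;
# assigning into df2[] keeps the data frame structure and column names
df2 <- df
df2[] <- lapply(df, factor, levels = 1:5)

str(df2)
```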

Then created tables with the summary counts for each variable. I looked around but couldn't find (or understand) a simpler way to do this.

table_familiarity <- table(df2$Familiarity)
table_accessibility <- table(df2$Accessibility)
table_ease <- table(df2$EaseOfUse)
table_release <- table(df2$ReleaseSystematic)

df3 <- addmargins(rbind(table_familiarity, table_accessibility, table_ease, table_release))

This is the table of counts:

df4 <- as.data.frame(df3) %>%
  select(-Sum) %>%
  filter(row_number() != 5)
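
For what it's worth, one simpler route to the same counts might be to run table() over every column with sapply() and transpose the result. This is only a sketch on the toy data, and the object name counts is my own:

```r
# Same toy data as above, converted to factors with levels 1..5
df <- data.frame(
  Familiarity = c(3, 5, 2, 4, 2, 3, 5, 2, 3, 4),
  Accessibility = c(3, 5, 3, 4, 2, 3, 4, 2, 2, 4),
  EaseOfUse = c(3, 4, 2, 3, 2, 3, 3, 2, 2, 4),
  ReleaseSystematic = c(4, 5, 3, 4, 2, 4, 4, 4, 4, 4))
df2 <- df
df2[] <- lapply(df, factor, levels = 1:5)

# sapply() returns one column of counts per variable; t() puts variables in rows
counts <- t(sapply(df2, table))
counts
```

Because the factor levels are fixed at 1 to 5, every variable yields five counts, so the results line up as a 4 x 5 matrix without the addmargins and filter steps.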

And here is the table of proportions:

proportions <- df4 %>%
  as.matrix() %>%
  prop.table(margin = 1) * 100

prop_table <- as.data.frame(proportions)
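
For the remaining parts of the question (total, sum, mean and standard deviation per column), a base R sketch on the same toy data could look like this. The object names are illustrative, and it works on the numeric columns rather than the factors:

```r
# Same toy data as above (numeric, not factors)
df <- data.frame(
  Familiarity = c(3, 5, 2, 4, 2, 3, 5, 2, 3, 4),
  Accessibility = c(3, 5, 3, 4, 2, 3, 4, 2, 2, 4),
  EaseOfUse = c(3, 4, 2, 3, 2, 3, 3, 2, 2, 4),
  ReleaseSystematic = c(4, 5, 3, 4, 2, 4, 4, 4, 4, 4))

# Row percentages for each response value 1..5
counts <- t(sapply(df, function(x) table(factor(x, levels = 1:5))))
pct <- 100 * prop.table(counts, margin = 1)

# Number of responses, sum, mean and standard deviation per column
stats <- data.frame(
  Total = colSums(!is.na(df)),
  Sum = colSums(df),
  Mean = colMeans(df),
  SD = sapply(df, sd))

round(pct, 1)
stats
```

On the real data you would probably want na.rm = TRUE in the sum, mean and sd calls if any responses are missing.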

Hopefully someone else can help with the other parts of your question. I would be interested to read better approaches.

Upvotes: 0
