Adding sample n-values to Likert plot in R when you have some non-responses or NAs

Question

In another post, there is a great answer showing how to add simple n-values to a Likert plot made with the likert package. The answer by @eipi10 is very helpful, but it does not seem correct for data with missing responses: the dataset has NA values, so the n-values need to differ per question. I got close to implementing this, but I do not understand how the mapping in geom_text works, and my n-values get written on top of one another in the plot.

So how do I correct the geom_text call so that my text aligns properly? Note that in my code below (a full reproducible example) I used a lowercase variable name, "cnt", in my counts data, whereas the original answer used an uppercase name. I point this out just so no one gets confused. Thank you.
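To make the "n differs per question" point concrete, here is a minimal base R sketch on toy data. The data frame `toy`, its columns `q1` and `q2`, and the result name `n_by_item` are all made up for illustration and are not part of pisaitems. It only shows that once NAs are dropped, each question/country pair gets its own count.

```r
# Toy stand-in for the survey items (hypothetical names and values)
toy <- data.frame(
  cnt = c("A", "A", "A", "B", "B"),
  q1  = c("Agree", NA, "Disagree", "Agree", "Agree"),
  q2  = c(NA, NA, "Agree", "Disagree", NA)
)

# Long form: one row per (respondent, question)
long <- data.frame(
  cnt      = rep(toy$cnt, times = 2),
  question = rep(c("q1", "q2"), each = nrow(toy)),
  response = c(as.character(toy$q1), as.character(toy$q2))
)

# Keep only answered rows, then count per question and country
answered  <- long[!is.na(long$response), ]
n_by_item <- as.data.frame(
  table(question = answered$question, cnt = answered$cnt),
  responseName = "n"
)
print(n_by_item)
```

In this toy case country A has two answers for q1 but only one for q2, which is why a single n per country would be misleading.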

# From the original question: setting up the data
options(digits=2)

require(likert)
data(pisaitems)

##### Item 24: Reading Attitudes
items24 <- pisaitems[,substr(names(pisaitems), 1,5) == 'ST24Q']
head(items24); ncol(items24)

names(items24) <- c(
  ST24Q01="I read only if I have to.",
  ST24Q02="Reading is one of my favorite hobbies.",
  ST24Q03="I like talking about books with other people.",
  ST24Q04="I find it hard to finish books.",
  ST24Q05="I feel happy if I receive a book as a present.",
  ST24Q06="For me, reading is a waste of time.",
  ST24Q07="I enjoy going to a bookstore or a library.",
  ST24Q08="I read only to get information that I need.",
  ST24Q09="I cannot sit still and read for more than a few minutes.",
  ST24Q10="I like to express my opinions about books I have read.",
  ST24Q11="I like to exchange books with my friends.")
str(items24)

l24 <- likert(items24)



## add the countries to the dataset
cnt<-pisaitems$CNT
l24cnt<-cbind(items24, cnt)
View(l24cnt)


#### create counts for each question per country and drop NAs
library(tidyr)
library(dplyr)  # group_by() and count() come from dplyr
counts<-l24cnt %>% 
  gather(question, response, "I read only if I have to.":"I like to exchange books with my friends.", factor_key = TRUE) %>%
  na.omit() %>%  ## added this in to drop rows with NA responses so they are not counted in group_by
  group_by(question, cnt) %>%
  count()

### From the original answer by eipi10: "The variable=NA at the end is there
### because the original data frame that likert.bar.plot generates in creating
### the plot creates and uses a column called variable. Even though we don't
### use that column in our subsequent call to geom_text with the new data frame
### below, ggplot still expects that column to be present in the new data frame."
counts$variable<-NA

View(counts)
###############


##### Group by Country
l24g <- likert(items24, grouping=pisaitems$CNT)

# Plots
p = plot(l24g) +
  geom_text(data=counts,
            aes(label=format(n,big.mark=","), x=cnt, y=145), 
            size=2.5, colour="grey30", hjust=1) +
  scale_y_continuous(limits=c(-100,150)) +
  coord_flip(ylim=c(-110,110)) +
  theme(plot.margin=unit(c(0.2,2,0.2,0.2),"cm"))

# Turn off clipping so the labels can be drawn in the plot margin
# http://stackoverflow.com/a/9691256/496488
library(grid)  # provides grid.draw()
p <- ggplot_gtable(ggplot_build(p))
p$layout$clip <- "off"
grid.draw(p)
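One possible explanation for the overlap, offered as a sketch rather than a confirmed diagnosis: in ggplot2, a layer whose data lacks the faceting variable is drawn in every panel. The grouped likert plot shows one panel per question, so if the label data frame does not carry the column that plot facets on (I have not checked its name here; `Item` below is just a placeholder), every question's n would land in every panel. The toy data and all names in this block are hypothetical and use plain ggplot2, not likert.

```r
library(ggplot2)

# Toy bars: two questions (facets), two groups each
bars <- data.frame(
  Item  = rep(c("Q1", "Q2"), each = 2),
  Group = rep(c("A", "B"), times = 2),
  value = c(40, 60, 55, 70)
)

# Label data WITHOUT the faceting column
labels_no_facet <- data.frame(
  Group = rep(c("A", "B"), times = 2),
  n     = c(10, 20, 30, 40)
)

# Same labels WITH the faceting column, named exactly as in the plot data
labels_with_facet <- data.frame(
  Item  = rep(c("Q1", "Q2"), each = 2),
  Group = rep(c("A", "B"), times = 2),
  n     = c(10, 20, 30, 40)
)

base_plot <- ggplot(bars, aes(x = Group, y = value)) +
  geom_col() +
  facet_wrap(~ Item, ncol = 1)

# Every label is repeated in both panels, so texts pile up
overlapping <- base_plot +
  geom_text(data = labels_no_facet, aes(label = n, y = 80))

# Each label is drawn only in its own panel
aligned <- base_plot +
  geom_text(data = labels_with_facet, aes(label = n, y = 80))
```

Printing `overlapping` and `aligned` side by side shows the difference. If this is what is happening in the likert plot, the fix would be to give `counts` a column with the same name and values as the plot's faceting column, but that part is an assumption to verify against the plot's own data.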
