In another post, there is a great response about how to add simple n-values to a likert plot made with the likert package. The response by @eipi10 is very helpful, but I noticed that it does not seem correct when the data contain missing values. The dataset has NA values, so the n-values need to differ per question. I got close to implementing this, but I do not understand how the mapping in geom_text works: my n-values get written on top of one another in the plot.
So how do I correct the geom_text call so that my text aligns properly? Note that in my code below (a full reproducible example) I used the lowercase variable name "cnt", whereas the original answer used an uppercase name. I point this out just so no one gets confused. Thank you.
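To make "n-values that differ per question" concrete, here is a minimal sketch on toy data. The data frame `toy` and its columns `q1` and `q2` are hypothetical and are not part of the pisaitems data; the sketch only shows that the number of non-missing responses can differ by question within the same country.

```r
# Hypothetical toy data: two questions, two countries, some NAs
toy <- data.frame(
  cnt = c("A", "A", "A", "B", "B", "B"),
  q1  = c(1, 2, NA, 3, NA, NA),
  q2  = c(1, NA, 2, 3, 4, 5)
)

# Flag non-missing responses, then sum the flags within each country
answered <- as.data.frame(!is.na(toy[c("q1", "q2")]))
n_by_group <- aggregate(answered, by = list(cnt = toy$cnt), FUN = sum)

print(n_by_group)
```

In this toy case the two questions end up with different counts within country B, which is the situation a single n per country cannot represent.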
#from the original question. Setting up the data
options(digits=2)
require(likert)
data(pisaitems)
##### Item 24: Reading Attitudes
items24 <- pisaitems[,substr(names(pisaitems), 1,5) == 'ST24Q']
head(items24); ncol(items24)
names(items24) <- c(
ST24Q01="I read only if I have to.",
ST24Q02="Reading is one of my favorite hobbies.",
ST24Q03="I like talking about books with other people.",
ST24Q04="I find it hard to finish books.",
ST24Q05="I feel happy if I receive a book as a present.",
ST24Q06="For me, reading is a waste of time.",
ST24Q07="I enjoy going to a bookstore or a library.",
ST24Q08="I read only to get information that I need.",
ST24Q09="I cannot sit still and read for more than a few minutes.",
ST24Q10="I like to express my opinions about books I have read.",
ST24Q11="I like to exchange books with my friends.")
str(items24)
l24 <- likert(items24)
## add the countries to the dataset
cnt<-pisaitems$CNT
l24cnt<-cbind(items24, cnt)
View(l24cnt)
#### create counts for each question per country and drop NAs
library(tidyr)
library(dplyr) # provides group_by() and count()
counts<-l24cnt %>%
gather(question, response, "I read only if I have to.":"I like to exchange books with my friends.", factor_key = TRUE) %>%
na.omit() %>% ## added this to drop rows with NA responses so they are not counted in group_by
group_by(question, cnt) %>%
count()
counts$variable <- NA
### From the original answer by eipi10: "The variable=NA at the end is there
### because the original data frame that likert.bar.plot generates in creating
### the plot creates and uses a column called variable. Even though we don't use
### that column in our subsequent call to geom_text with the new data frame
### below, ggplot still expects that column to be present in the new data frame."
View(counts)
###############
##### Group by Country
l24g <- likert(items24, grouping=pisaitems$CNT)
# Plots
p = plot(l24g) +
geom_text(data=counts,
aes(label=format(n,big.mark=","), x=cnt, y=145),
size=2.5, colour="grey30", hjust=1) +
scale_y_continuous(limits=c(-100,150)) +
coord_flip(ylim=c(-110,110)) +
theme(plot.margin=unit(c(0.2,2,0.2,0.2),"cm"))
# Turn off clipping (grid.draw() comes from the grid package)
library(grid)
# http://stackoverflow.com/a/9691256/496488
p <- ggplot_gtable(ggplot_build(p))
p$layout$clip <- "off"
grid.draw(p)
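One possible explanation for the overlapping labels, offered as a sketch rather than a confirmed diagnosis: in ggplot2, a layer whose data lacks the faceting variable is drawn in every panel. If the grouped likert plot facets on a column that `counts` does not contain, every question's n-values would be drawn in every panel. The toy example below uses only ggplot2; the column names `Item` and `Group` are assumptions chosen for illustration, and I have not verified which column the likert plot actually facets on.

```r
library(ggplot2)

# Hypothetical toy data: two items (facets) by two groups
d <- data.frame(
  Item  = rep(c("Item 1", "Item 2"), each = 2),
  Group = rep(c("A", "B"), times = 2),
  value = c(10, 20, 30, 40)
)

# Labels that carry the faceting column under the same name
labs_ok <- data.frame(Item = d$Item, Group = d$Group, n = c(5, 6, 7, 8))
# The same labels, but the faceting column has a different name
labs_bad <- data.frame(question = d$Item, Group = d$Group, n = c(5, 6, 7, 8))

base <- ggplot(d, aes(x = Group, y = value)) +
  geom_col() +
  facet_wrap(~Item, ncol = 1)

p_bad <- base + geom_text(data = labs_bad, aes(label = n, y = 45))
p_ok  <- base + geom_text(data = labs_ok,  aes(label = n, y = 45))

# Number of text labels drawn in each panel (layer 2 is geom_text)
print(table(layer_data(p_bad, 2)$PANEL))
print(table(layer_data(p_ok, 2)$PANEL))
```

In `p_bad` all four labels are repeated in both panels, so labels for different items land on the same positions; in `p_ok` each panel receives only its own two labels. If the same mechanism applies here, giving the `question` column in `counts` the name of the plot's faceting column might be enough to separate the labels.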