Con Des

Reputation: 359

ggplot2 dotplot how to create empty x axis categories

I have some data in a CSV file that I made up in order to create dot plots of different distributions.

These are the made-up data:

structure(list(uniform = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 
4, 4, 4, 4, 5, 5, 5, 5), left_skew = c(1L, 2L, 2L, 3L, 3L, 3L, 
4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), right_skew = c(5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 
2L, 2L, 1L), trunc_uni_left = c(3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), trunc_uni_right = c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L, 3L), trunc_norm_left = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L), trunc_norm_right = c(1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L), bimodal = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), extreme_left = c(3L, 
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L), extreme_right = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L)), row.names = c(NA, 
-20L), class = "data.frame")

The dot plot works when there are observations in each of the five categories on the x-axis. However, when a category has no observations, the axis only shows the categories that are present in the data. For instance, trunc_uni_left contains no 1s or 2s, so its plot only shows categories 3, 4, and 5, whereas I want all five categories on the axis with 1 and 2 left empty.

I've tried using scale_x_discrete to set the limits and breaks but this doesn't work.

Here is the code I used to plot the data:

ggplot(df, aes(x = trunc_uni_left))+
  geom_point()+ 
  geom_dotplot(method = "histodot", binwidth = 0.25, fill = 'red', dotsize = 0.75)+
  labs(x = 'Rating Categories', y = 'Rating Frequency')+
  theme_bw()+
  ylim(0 , 20)+
  scale_x_discrete(breaks = c ("0.5", "1", "1.5", "2", "2.5"),
                   labels = c ("1", "2", '3', '4', '5'),
                   limits = c ("1", "2", "3", "4", "5"))+
  theme(panel.grid = element_blank(),
        text = element_text(size = 16),
        axis.text.x = element_text(size =  16),
        axis.title.x = element_text(size = 16, margin = margin(t = 20)),
        axis.title.y = element_text(size = 16, margin = margin(r = 20)),
        legend.title= element_text(size = 16))

Is there something I can do in ggplot to achieve this? Or alternatively, can I create a data frame in R that would allow me to do this?

I'm not the best coder in the world, as you may be able to tell, so I would much appreciate the help.

Thanks!

Upvotes: 1

Views: 791

Answers (1)

Pete900

Reputation: 2166

Your breaks don't match the data. The breaks should be 1:5, which are the values in your df; supply new labels only if you need them. I'm guessing you don't want new labels (please correct me if I'm wrong) and just want to control the x-axis limits. In that case, you can supply the limits while converting trunc_uni_left to a factor:

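library(ggplot2)

# df is the data frame from the question
ggplot(df, aes(as.factor(trunc_uni_left))) +
  geom_dotplot(method = "histodot", binwidth = 0.25, fill = 'red', dotsize = 0.75) +
  labs(x = 'Rating Categories', y = 'Rating Frequency') +
  theme_bw() +
  # a discrete scale expects character limits; listing 1 to 5 keeps the empty categories
  scale_x_discrete(limits = as.character(1:5))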
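A related option, offered here as a sketch rather than as part of the approach above, is to declare all five categories as factor levels up front and pass drop = FALSE to scale_x_discrete so that unused levels stay on the axis. This also touches on the "can I create a data frame" part of the question. The small df below is a stand-in for the trunc_uni_left column, and the rating column name is made up for illustration.

```r
library(ggplot2)

# Stand-in for the trunc_uni_left column from the question (no 1s or 2s)
df <- data.frame(trunc_uni_left = c(rep(3L, 6), rep(4L, 7), rep(5L, 7)))

# Declare all five categories as levels, including the unobserved 1 and 2.
# "rating" is a hypothetical column name used only for this sketch.
df$rating <- factor(df$trunc_uni_left, levels = 1:5)

# The empty levels exist in the factor even though their counts are zero
counts <- table(df$rating)

p <- ggplot(df, aes(x = rating)) +
  geom_dotplot(method = "histodot", binwidth = 0.25, fill = "red", dotsize = 0.75) +
  labs(x = "Rating Categories", y = "Rating Frequency") +
  theme_bw() +
  scale_x_discrete(drop = FALSE)  # keep unused factor levels on the axis
```

Printing p draws the plot. Because the levels live in the data, the same factor() call can be reused for every column if you want all plots to share the same five categories.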

If you do want to relabel the x-axis with bespoke labels, make sure the breaks match the values that are actually in your data:

ggplot(df, aes(as.factor(trunc_uni_left))) +
  geom_dotplot(method = "histodot", binwidth = 0.25, fill = 'red', dotsize = 0.75) +
  labs(x = 'Rating Categories', y = 'Rating Frequency') +
  theme_bw() +
  # limits and breaks are the category values (as character); labels are what gets displayed
  scale_x_discrete(limits = as.character(1:5),
                   breaks = as.character(1:5),
                   labels = paste0("my_lab_", 1:5))

In this example you don't need the breaks, because the data is numeric and so already sorts in the order you want. If your input were strings, you would need to list the breaks and labels in the order you want them to appear.
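To illustrate the string case, here is a minimal sketch with made-up data. The ratings data frame, its category names, and rating_order are all hypothetical. The desired axis order is not alphabetical, so limits, breaks, and labels are all given in that same order, and the two unobserved categories still get a place on the axis.

```r
library(ggplot2)

# Hypothetical string ratings; "poor" and "fair" are never observed
ratings <- data.frame(
  rating = c(rep("good", 6), rep("great", 7), rep("superb", 7))
)

# Desired axis order, which differs from alphabetical order
rating_order <- c("poor", "fair", "good", "great", "superb")

p <- ggplot(ratings, aes(x = rating)) +
  geom_dotplot(method = "histodot", binwidth = 0.25, fill = "red", dotsize = 0.75) +
  labs(x = "Rating Categories", y = "Rating Frequency") +
  theme_bw() +
  # limits fixes which categories appear and in what order;
  # labels are matched to breaks position by position
  scale_x_discrete(limits = rating_order,
                   breaks = rating_order,
                   labels = c("1", "2", "3", "4", "5"))
```

Printing p draws the plot. If breaks and labels were given in different orders, the labels would be attached to the wrong categories, which is why it helps to build both from the same vector.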

Upvotes: 1
