J. Doe

Reputation: 1730

ggplot - continuous labels on binned variable?

Let's look at the two plots. The first one is created by this code:

library(ggplot2)

ggplot(iris, aes(x = Petal.Length)) +
  stat_bin(geom = "density")


The second image is similar and is produced by this code:

library(ggplot2)
library(dplyr)
library(DescTools)

iris$Petal.Length %>% 
  cut(30) %>% 
  Freq() %>% 
  ggplot(aes(x = level, y = freq, group = 1))+
  geom_line()


The labels in the second image are not clearly visible due to image size, but they are intervals of bins of iris data (e.g. (0.994,1.2], (1.2,1.39] etc.).

My question is: how can I change the labels on the 2nd plot to match the labels on the 1st plot exactly, i.e. to give the impression that the data was never binned with cut in the first place?

Please don't post any solutions that involve not binning the data and producing the plot in some other way - the solution needs to be such that the new code can be added and run after the code for the 2nd plot is run.

Upvotes: 1

Views: 280

Answers (2)

portablemaex

Reputation: 135

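Maybe this is oversimplified, but looking at your question, scale_x_continuous seemed the easiest solution. The breaks argument lets you specify the tick marks you want to display. Because level is not numeric, a continuous scale cannot be applied to it directly, so I map the bin position (1 to 30) to x and convert the desired tick values to that scale:

library(ggplot2)
library(dplyr)
library(DescTools)

rng <- range(iris$Petal.Length)  # cut(30) splits this range into 30 equal-width bins
ticks <- c(2, 4, 6)              # values to show on the axis

iris$Petal.Length %>% 
  cut(30) %>% 
  Freq() %>% 
  # assumes Freq() returns one row per bin, in bin order
  ggplot(aes(x = seq_along(level), y = freq, group = 1)) +
  geom_line() +
  scale_x_continuous(breaks = (ticks - rng[1]) / diff(rng) * 30 + 0.5,
                     labels = ticks) +
  labs(x = "Petal.Length", y = "count")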

Upvotes: 0

stefan

Reputation: 123768

Maybe this is what you are looking for. To get a continuous axis, I extract the lower and upper bounds of your intervals, convert them to numeric, and compute the center value, which can then be mapped on x:

library(ggplot2)
library(dplyr)
library(DescTools)

iris$Petal.Length %>% 
  cut(30) %>% 
  Freq() %>% 
  mutate(lower = stringr::str_match(level, "^.(\\d+\\.\\d+)")[,2],
         upper = stringr::str_match(level, "(\\d+\\.\\d+).$")[,2],
         across(c(lower, upper), as.numeric),
         center = .5 * lower + .5 * upper) %>% 
  ggplot()+
  geom_line(aes(x = center, y = freq, group = 1))
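
The regular expressions above assume that every bound is printed with a decimal point. As a rough alternative sketch, the midpoints could also be computed in base R by dropping the brackets and splitting on the comma. Note that interval_midpoints is a hypothetical helper name introduced here for illustration, not a function from any of the packages used above:

```r
# Hypothetical helper: midpoint of each cut()-style label, e.g. "(1.2,1.39]".
# Assumes every label has exactly one opening bracket, one comma and one
# closing bracket, which is the format cut() produces by default.
interval_midpoints <- function(labels) {
  labels <- as.character(labels)                  # also accepts factors
  inner  <- substr(labels, 2, nchar(labels) - 1)  # drop the two brackets
  bounds <- strsplit(inner, ",", fixed = TRUE)    # split into lower and upper
  vapply(bounds, function(b) mean(as.numeric(b)), numeric(1))
}

# One midpoint per bin of the binned iris data
mids <- interval_midpoints(levels(cut(iris$Petal.Length, 30)))
```

In the pipeline above, this would presumably replace the three bound-related lines in mutate() with center = interval_midpoints(level).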

Upvotes: 1
