FragenSteller

Reputation: 81

How to plot a histogram from existing counts with uneven bin widths using ggplot

I want to create a histogram from classes that already exist. I have this dataset:

interval        counts
0 - 8.50        2577
8.51 - 10.00    1199
10.01 - 12.00   1878
12.01 - 14.00   637
14.01 - 16.00   369
16.01 - 18.00   98
18.00 - 20.00   308



library(ggplot2)

output$DS32 <- renderPlot({
  plot_tab5_lohn <- ggplot(DS18, aes(x = interval)) + geom_histogram(stat = "count")
  return(plot_tab5_lohn)
})

This results in the following graph:

https://i.sstatic.net/W3GDH.png

I want the counts on the y-axis, and the bars should have different widths that match the intervals. How can I do this?

EDIT: I've made it this far: https://i.sstatic.net/W3GDH.png, using this code:

DS18$interval <- factor(DS18$interval, levels = DS18$interval)
output$DS32 <- renderPlot({
  plot_tab5_lohn <- ggplot(DS18, aes(x = interval, y = counts)) +
    geom_col() +
    geom_point(color = "red") +
    geom_line(aes(group = 1), color = "red")
  return(plot_tab5_lohn)
})

I'd like each bar to be as wide as its interval, with the density on the y-axis, so that the areas of the bars sum to 1 (100%). Something like the linked example.

Upvotes: 1

Views: 2216

Answers (3)

Axeman

Reputation: 35402

You can extract the boundaries, then plot using geom_rect:

# Using dt from @www
library(tidyr)
dt2 <- separate(dt, interval, c('left', 'right'), sep = ' - ', convert = TRUE)
ggplot(dt2) +
  geom_rect(aes(xmin = left, xmax = right, ymin = 0, ymax = counts),
            col = 1) +
  geom_line(aes(x = right + (left - right) / 2, y = counts),
            col = 'red')

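The geom_rect plot above puts raw counts on the y-axis. If density is wanted instead, one option is to divide each count by the total count and by the bin width before plotting. The following is only a minimal sketch under that assumption: the data frame bins and its columns width and density are hypothetical names, and the bin edges are typed in directly from the question rather than parsed from dt.

```r
library(ggplot2)

# Bin edges and counts copied from the question; `bins` is a hypothetical name.
bins <- data.frame(
  left   = c(0, 8.51, 10.01, 12.01, 14.01, 16.01, 18.00),
  right  = c(8.50, 10.00, 12.00, 14.00, 16.00, 18.00, 20.00),
  counts = c(2577, 1199, 1878, 637, 369, 98, 308)
)

# Density = share of all observations per unit of x,
# so each bar's area (density * width) equals its share of the total.
bins$width   <- bins$right - bins$left
bins$density <- bins$counts / (sum(bins$counts) * bins$width)

p <- ggplot(bins) +
  geom_rect(aes(xmin = left, xmax = right, ymin = 0, ymax = density),
            colour = "black") +
  labs(x = "interval", y = "density")
p

# Sanity check: the bar areas should add up to 1 (up to floating point error).
total_area <- sum(bins$density * bins$width)
total_area
```

Because the edges are taken literally, the bars are separated by gaps of 0.01 wherever one interval ends at, say, 8.50 and the next starts at 8.51.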

Alternatively, you can first expand your data into single observations. This also makes it easy to plot the densities instead:

library(dplyr)
library(tidyr)
dt3 <- dt %>% 
  group_by(interval) %>% 
  do(data.frame(interval = rep.int(.$interval, .$counts), stringsAsFactors = FALSE)) %>% 
  separate(interval, c('left', 'right'), sep = ' - ', convert = TRUE) %>% 
  mutate(value = right + (left - right) / 2)
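# Sort the upper bounds so the breaks are increasing regardless of group order
breaks <- c(0, sort(unique(dt3$right)))

# after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
ggplot(dt3, aes(value)) +
  geom_histogram(aes(y = after_stat(density)), breaks = breaks, col = 1) +
  geom_freqpoly(aes(y = after_stat(density)), breaks = breaks, col = 'red')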

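Expanding the table into one row per observation is not strictly necessary. As a possible alternative (not part of the original answer), stat_bin accepts a weight aesthetic, so each bin can be represented by a single midpoint weighted by its count. The sketch below assumes ggplot2 >= 3.3.0 for after_stat(); bins, mids and built are hypothetical names, and the edges are again typed in from the question.

```r
library(ggplot2)

# Same bins as in the question; `mids` is a hypothetical helper column.
bins <- data.frame(
  left   = c(0, 8.51, 10.01, 12.01, 14.01, 16.01, 18.00),
  right  = c(8.50, 10.00, 12.00, 14.00, 16.00, 18.00, 20.00),
  counts = c(2577, 1199, 1878, 637, 369, 98, 308)
)
bins$mids <- (bins$left + bins$right) / 2

# Use 0 and the upper bounds as breaks, so neighbouring bars touch.
breaks <- c(0, bins$right)

# One row per bin: the midpoint falls inside its bin and carries the count as weight.
p <- ggplot(bins, aes(x = mids, weight = counts)) +
  geom_histogram(aes(y = after_stat(density)), breaks = breaks,
                 colour = "black")
p

# Pull the computed densities back out and check that the areas sum to 1.
built <- layer_data(p, 1)
total_area <- sum(built$density * (built$xmax - built$xmin))
total_area
```

This keeps the data at seven rows instead of several thousand, at the cost of relying on the midpoints landing in the intended bins.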

Upvotes: 2

www

Reputation: 39184

I think what you need is not a histogram but a bar plot. The code below uses geom_col to create one. Note that I convert interval to a factor first, so the bars keep the order of the classes in the data instead of being sorted alphabetically.

library(ggplot2)

# Keep the bars in the original order of the intervals
dt$interval <- factor(dt$interval, levels = dt$interval)
# Create the bar plot
ggplot(dt, aes(x=interval, y = counts)) + geom_col()


DATA

dt <- read.table(text = "interval        counts
                 '0 - 8.50'        2577
                 '8.51 - 10.00'    1199
                 '10.01 - 12.00'   1878
                 '12.01 - 14.00'   637
                 '14.01 - 16.00'   369
                 '16.01 - 18.00'   98
                 '18.00 - 20.00'   308",
                 header = TRUE, stringsAsFactors = FALSE)

Upvotes: 2

tbradley

Reputation: 2290

You can use stat = "identity" and add a y aesthetic to get your desired graph:

ggplot(DS18, aes(x=interval, y = counts)) + 
  geom_histogram(stat="identity")

This gives a bar chart with the counts on the y-axis.

Upvotes: 1
