CephBirk

Reputation: 6720

Create histogram of count frequencies in ggplot2

Let's say I have the following data frame:

d = data.frame(letter = c(
    'a', 'a', 'a', 
    'b', 'b', 'b', 
    'c',
    'd', 'd', 'd', 'd',
    'e', 'e', 
    'f', 'f', 'f', 'f', 'f', 'f', 'f',
    'g'))

How can I use ggplot2 to make a histogram that does not count how many times a given letter occurs, but rather counts the number of times a given letter frequency occurs? In this example:

table(d$letter)

a b c d e f g 
3 3 1 4 2 7 1 

Two letters (c and g) occur once, one letter (e) occurs twice, two letters (a and b) occur three times, and so on. The goal is a figure equivalent to this base R plot:

hist(table(d$letter), right = FALSE, breaks = 6)

Upvotes: 3

Views: 4876

Answers (1)

Stibu

Reputation: 15947

You can convert the result of table() to a data frame and then plot it with ggplot2:

library(ggplot2)

# Count each letter, then turn the resulting table into a data frame
df <- as.data.frame(table(d$letter))
ggplot(df, aes(x = Freq)) +
    geom_histogram(binwidth = 1)

This works because the column containing the frequencies is by default called Freq:

head(df)
##   Var1 Freq
## 1    a    3
## 2    b    3
## 3    c    1
## 4    d    4
## 5    e    2
## 6    f    7
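
If you would rather not rely on the default name, as.data.frame() for tables takes a responseName argument. The following is a minimal sketch, not part of the original answer; the column name n_occurrences is only an illustrative choice, and the data frame is rebuilt with rep() so the snippet runs on its own:

```r
# Rebuild the example data so this snippet is self-contained
d <- data.frame(letter = rep(c("a", "b", "c", "d", "e", "f", "g"),
                             times = c(3, 3, 1, 4, 2, 7, 1)))

# responseName sets the name of the count column (default: "Freq");
# "n_occurrences" is a made-up name used only for illustration
df_named <- as.data.frame(table(d$letter), responseName = "n_occurrences")

print(names(df_named))
```

With this variant, the plot would map aes(x = n_occurrences) instead of aes(x = Freq).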

If you want the bars to sit between the integers rather than on top of them, use center = 0.5 to center the bins at half integers, so that each bin spans one integer to the next. I also use closed = "left", which is equivalent to right = FALSE in hist():

ggplot(df, aes(x = Freq)) +
  geom_histogram(binwidth = 1, center = 0.5, closed = "left") +
  scale_x_continuous(breaks = 1:7)
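
To sanity-check the bar heights without looking at the plot, you can tabulate the frequencies themselves in base R. This is only a rough cross-check under the assumption that each bin holds exactly one integer value (as with binwidth = 1 above); the variable names are illustrative:

```r
# Rebuild the example data so this snippet is self-contained
d <- data.frame(letter = rep(c("a", "b", "c", "d", "e", "f", "g"),
                             times = c(3, 3, 1, 4, 2, 7, 1)))

# First pass: how often does each letter occur?
letter_counts <- table(d$letter)

# Second pass: how often does each of those counts occur?
# (2 letters occur once, 1 twice, 2 three times, 1 four times, 1 seven times)
freq_of_freq <- table(as.vector(letter_counts))

print(freq_of_freq)
```

Each entry of freq_of_freq should match the height of the bar for that frequency; frequencies that never occur (5 and 6 here) simply do not appear in the table.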

Upvotes: 4