Reputation: 6720
Let's say I have the following data frame:
d = data.frame(letter = c(
'a', 'a', 'a',
'b', 'b', 'b',
'c',
'd', 'd', 'd', 'd',
'e', 'e',
'f', 'f', 'f', 'f', 'f', 'f', 'f',
'g'))
How can I use ggplot2 to make a histogram that counts, not how many times each letter occurs, but how many letters occur with each frequency? In this example:
table(d$letter)
a b c d e f g
3 3 1 4 2 7 1
Here two letters (c and g) occur once, one letter (e) occurs twice, two letters (a and b) occur three times, and so on. I would like a figure equivalent to the base plot:
hist(table(d$letter), right = F, breaks = 6)
Upvotes: 3
Views: 4876
Reputation: 15947
You can convert the result of table() to a data frame and then use ggplot():
library(ggplot2)

# d is the data frame from the question
df <- as.data.frame(table(d$letter))
ggplot(df, aes(x = Freq)) +
  geom_histogram(binwidth = 1)
This works because as.data.frame() names the column containing the frequencies Freq by default:
head(df)
## Var1 Freq
## 1 a 3
## 2 b 3
## 3 c 1
## 4 d 4
## 5 e 2
## 6 f 7
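As a quick sanity check on what the histogram should show, the "count of counts" can also be computed directly in base R by tabulating the table. This is only a sketch: the rep() construction of d and the name freq_of_freq are my own shorthand, not part of the original post.

```r
# Rebuild the question's data frame compactly (same letters, same counts)
d <- data.frame(letter = rep(c("a", "b", "c", "d", "e", "f", "g"),
                             times = c(3, 3, 1, 4, 2, 7, 1)))

# Inner table(): how often each letter occurs
# Outer table(): how many letters share each frequency
freq_of_freq <- table(table(d$letter))
print(freq_of_freq)
```

Each bar height in the histogram should match one entry of freq_of_freq, for example two letters with a frequency of 1 and one letter with a frequency of 7.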
If you want the bars positioned between the integers, you can use center = 0.5 to center the bins at half integers. I also use closed = "left", which is equivalent to right = FALSE in hist():
ggplot(df, aes(x = Freq)) +
  geom_histogram(binwidth = 1, center = 0.5, closed = "left") +
  scale_x_continuous(breaks = 1:7)
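To see which bin counts a left-closed binning with width 1 should produce, one option is to ask base hist() for the numbers without drawing anything. This is a rough cross-check rather than part of the answer's method: the explicit breaks 1:8 and the variable names are assumptions I chose so that the bins line up with binwidth = 1.

```r
# Letter frequencies from the question, as a plain numeric vector
letter_counts <- c(a = 3, b = 3, c = 1, d = 4, e = 2, f = 7, g = 1)

# Left-closed bins [1,2), [2,3), ..., [7,8]; plot = FALSE returns the
# bin counts instead of drawing the histogram
h <- hist(as.vector(letter_counts), breaks = 1:8, right = FALSE, plot = FALSE)
bin_counts <- h$counts
print(bin_counts)
```

The bars drawn by the ggplot call with closed = "left" should have these heights, with empty bins for the frequencies 5 and 6.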
Upvotes: 4