Reputation: 1075
I have a data frame with a numeric column that I want to visualise as a histogram. I want to show only the bins that contain more than, say, 50 observations.
Step 1
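library(dplyr)    # provides the %>% pipe
library(ggplot2)
set.seed(10)
x <- data.frame(x = rnorm(1000, 50, 2))
p <- x %>%
  ggplot(aes(x)) +
  geom_histogram(bins = 30)  # 30 bins is the default, stated explicitly here
p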
Step 2
pg <- ggplot_build(p)
pg$data[[1]]
As a check, when I print pg$data[[1]], I'd like to see only the rows where count >= 50.
Thank you
Upvotes: 1
Views: 568
Reputation: 5897
You could do something like this. You will probably not like the factor labels on the x-axis, but you can split each label into its two interval endpoints and plot their average (the bin midpoint) on the x-axis instead.
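x %>%
  mutate(bin = cut(x, breaks = 30)) %>%  # assign each value to one of 30 equal-width bins
  group_by(bin) %>%
  mutate(count = n()) %>%                # number of observations in each bin
  ungroup() %>%
  filter(count > 50) %>%                 # keep only bins with more than 50 observations
  ggplot(aes(bin)) +
  geom_bar()                             # same as geom_histogram(stat = "count"), without the warning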
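The midpoint idea mentioned above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the names `binned`, `inner`, `ends` and `mid` are made up here, and it assumes the labels produced by `cut()` keep their default `(a,b]` form. Because `cut()` rounds the labels (3 significant digits by default), the midpoints are approximate.

```r
library(dplyr)
library(ggplot2)

set.seed(10)
x <- data.frame(x = rnorm(1000, 50, 2))

# One row per bin, keeping only bins with more than 50 observations
binned <- x %>%
  mutate(bin = cut(x, breaks = 30)) %>%
  count(bin, name = "count") %>%
  filter(count > 50)

# Labels look like "(49.6,50]": drop the first and last character,
# split on the comma, and average the two endpoints
labs  <- as.character(binned$bin)
inner <- substr(labs, 2, nchar(labs) - 1)
ends  <- strsplit(inner, ",", fixed = TRUE)
binned$mid <- vapply(ends, function(e) mean(as.numeric(e)), numeric(1))

# Numeric x-axis instead of factor labels
ggplot(binned, aes(mid, count)) +
  geom_col()
```

With a numeric `mid` column the x-axis is continuous again, so the removed bins show up as gaps rather than being squeezed out as they are with factor labels.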
Upvotes: 0
Reputation: 79194
library(ggplot2)
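# after_stat(count) replaces the ..count.. notation, which is deprecated since ggplot2 3.4.0
ggplot(x, aes(x = x, y = ifelse(after_stat(count) > 50, after_stat(count), 0))) +
  geom_histogram(bins = 30)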
With this code you can see the counts of the deleted bins:
library(ggplot2)
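ggplot(x, aes(x = x, y = ifelse(after_stat(count) > 50, after_stat(count), 0))) +
  geom_histogram(bins = 30, fill = "green", color = "grey") +
  # label every bin with its real count; use the same bins so labels line up with the bars
  stat_bin(aes(label = after_stat(count)), geom = "text", bins = 30, vjust = -0.7)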
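For the check in Step 2 of the question (only rows where count >= 50), a possible sketch is to filter the built layer data directly and redraw the surviving bins. This is an assumption-laden illustration rather than part of the answer above: `bins` and `kept` are names chosen here, and `layer_data()` is used as ggplot2's shortcut for the `ggplot_build(p)$data[[1]]` lookup shown in the question.

```r
library(ggplot2)

set.seed(10)
x <- data.frame(x = rnorm(1000, 50, 2))

p <- ggplot(x, aes(x)) +
  geom_histogram(bins = 30)

# Data computed for the first layer: one row per bin (count, x, xmin, xmax, ...)
bins <- layer_data(p, 1)

# Keep only the bins with at least 50 observations
kept <- bins[bins$count >= 50, ]

# Redraw the remaining bins as rectangles at their original positions
ggplot(kept, aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = count)) +
  geom_rect()
```

Unlike the `ifelse()` approach, the filtered data frame `kept` can be printed or reused, which matches the check described in the question.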
Upvotes: 2