Reputation: 507
I'm fairly new to R and the ggplot2 package. I am trying to create a simple histogram of the different cities students are coming from for an after-school program. My code thus far is this:
cities.hist=ggplot(data,
aes(x=reorder(City, City, function(x)-length(x))))+
geom_histogram()+
theme(axis.text.x = element_text(angle = 60, hjust = 1))
This generates a large histogram in which many cities are counted only once. Is there a ggplot2 function that only displays values above a given count threshold, i.e. only plots cities that have a count of 5 or more?
I'd like to avoid reordering the actual dataframe if at all possible.
Upvotes: 2
Views: 692
Reputation: 145755
There's no built-in ggplot way to do this. What you should do is just give ggplot the subset of your data that you want plotted. One way to do it using base functions is like this:
# keep only the rows whose city occurs at least 5 times
cities.hist = ggplot(data = subset(data, City %in% names(which(table(data$City) >= 5))),
  aes(x = reorder(City, City, function(x) -length(x)))) +
  # geom_bar() counts rows per category; recent ggplot2 versions reject a discrete x in geom_histogram()
  geom_bar() +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))
The other option would be to calculate a frequency table outside of ggplot (you could use as.data.frame(table(...)) or something more syntactically friendly like dplyr) and then plot that with y = freq inside your aesthetic mapping and stat = "identity" inside the geom_bar call. Something like this:
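library(dplyr)
library(ggplot2)
data %>%
  group_by(City) %>%
  summarize(freq = n()) %>%
  filter(freq >= 5) %>%   # keep cities with a count of 5 or more
  # one row per city now, so order the bars by the precomputed count
  ggplot(aes(x = reorder(City, -freq), y = freq)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))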
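For the as.data.frame(table(...)) route mentioned above, here is a minimal base-R sketch of just the counting and filtering step. The toy data frame, the min_count variable, and the city names are made up for illustration; swap in your own data and threshold.

```r
# Toy data standing in for the real data frame (hypothetical values)
data <- data.frame(
  City = c(rep("Springfield", 6), rep("Shelbyville", 5),
           rep("Ogdenville", 2), "Capital City")
)

min_count <- 5  # hypothetical threshold: keep cities with at least this many rows

# Naming the table dimension gives the columns City and Freq
freq <- as.data.frame(table(City = data$City))

# Keep only the cities that meet the threshold and drop the unused factor levels
freq <- subset(freq, Freq >= min_count)
freq$City <- droplevels(freq$City)

# Order the factor levels by descending count so the bars come out sorted
freq$City <- reorder(freq$City, -freq$Freq)

print(freq)
```

The resulting freq data frame has one row per city, so it can be plotted with x = City and y = Freq in the aesthetic mapping and stat = "identity" in geom_bar.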
Upvotes: 1