Mike

Reputation: 507

limit histogram output ggplot2

I'm fairly new to R and the ggplot2 package. I am trying to create a simple histogram of the different cities that students are coming from for an after-school program. My code thus far is this:

cities.hist=ggplot(data,
  aes(x=reorder(City, City, function(x)-length(x))))+
  geom_histogram()+
  theme(axis.text.x = element_text(angle = 60, hjust = 1))

This generates a large histogram in which many cities are counted only once. Is there a ggplot2 function to display only the output above a count threshold, i.e. to plot only cities that have a count of 5 or more?

I'd like to avoid reordering the actual dataframe if at all possible.

Upvotes: 2

Views: 692

Answers (1)

Gregor Thomas

Reputation: 145755

There's not a built-in ggplot way to do this. What you should do is just give ggplot the subset of your data that you want plotted. One way to do it using base functions is like this:

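# count rows per city, then keep only the rows whose city appears at least 5 times
city.counts = table(data$City)
cities.hist=ggplot(data = subset(data, City %in% names(city.counts)[city.counts >= 5]),
  aes(x=reorder(City, City, function(x)-length(x))))+
  geom_bar()+  # geom_bar counts rows of a discrete x; current ggplot2 rejects geom_histogram here
  theme(axis.text.x = element_text(angle = 60, hjust = 1))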
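To check the subsetting step on its own before plotting, here is a minimal sketch on made-up data. The data frame `toy`, its city names, and the variable `min.count` are assumptions for illustration only; they are not from the question.

```r
# Hypothetical data: 6 students from Austin, 5 from Boston, 1 each from two other cities
toy <- data.frame(City = c(rep("Austin", 6), rep("Boston", 5), "Denver", "El Paso"),
                  stringsAsFactors = FALSE)

min.count <- 5                        # assumed threshold ("5 or more" in the question)
city.counts <- table(toy$City)        # number of rows per city
keep <- names(city.counts)[city.counts >= min.count]

# subset() returns a filtered copy; toy itself is not modified or reordered
toy.subset <- subset(toy, City %in% keep)
table(toy.subset$City)
```

Because `subset()` returns a copy, the original data frame keeps all of its rows and its order, which is what the question asked for.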

The other option is to calculate a frequency table outside of ggplot (you could build it with table() as above, or use something more syntactically friendly like dplyr), filter it to the counts you want, and then plot that with y = freq inside your aesthetic mapping and stat = "identity" inside the geom_bar call. Something like this:

library(dplyr)
library(ggplot2)

data %>%
    group_by(City) %>%
    summarize(freq = n()) %>%
    filter(freq >= 5) %>%   # keep cities with a count of 5 or more
    ggplot(aes(x = reorder(City, -freq),   # one row per city now, so order by freq
               y = freq)) +
    geom_bar(stat = "identity") +
    theme(axis.text.x = element_text(angle = 60, hjust = 1))
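A variation, sketched under assumptions: if you would rather keep one row per student and let ggplot do the counting, a grouped filter drops the rows of cities below the threshold. The names `toy`, `toy.kept`, and `min.count` are made up for this example.

```r
library(dplyr)

# Hypothetical data: 6 students from Austin, 5 from Boston, 1 each from two other cities
toy <- data.frame(City = c(rep("Austin", 6), rep("Boston", 5), "Denver", "El Paso"),
                  stringsAsFactors = FALSE)
min.count <- 5   # assumed threshold

toy.kept <- toy %>%
    group_by(City) %>%
    filter(n() >= min.count) %>%   # n() is the row count of the current city group
    ungroup()
```

`toy.kept` still has one row per student, so it can be passed to ggplot() with a plain geom_bar(), and `toy` is left unchanged.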

Upvotes: 1
