augustinus ntjamba

Reputation: 17

How to plot a histogram using ggplot2

I have a big dataset, and I want to plot a histogram of TYPE.OF.CRIME against HOUR.

This is my dataset:

df <- structure(list(HOUR = c(23, 10, 14, 21, NA, 14), TYPE.OF.CRIME = c("ARMED ROBBERY", 
"ARMED ROBBERY", "ARMED ROBBERY", "ARMED ROBBERY", "ARMED ROBBERY", 
"ASSAULT GBH")), row.names = c(NA, -6L), class = "data.frame")

Here is my code:

library(ggplot2)

ggplot(df, aes(x = TYPE.OF.CRIME, y = HOUR)) +
  geom_histogram()

When running this code I get the following error:

Error: stat_bin() can only have an x or y aesthetic.

Upvotes: 0

Views: 127

Answers (2)

stefan

Reputation: 125797

A histogram visualizes the distribution of a single variable, which is why ggplot2 (more precisely, stat_bin()) complains. It accepts either an x aesthetic (vertical bars) or a y aesthetic (horizontal bars), but not both at once.
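To make the one-aesthetic rule concrete, here is a minimal sketch using the sample data from the question. The `binwidth = 1` and `na.rm = TRUE` settings and the object names `p_vertical` and `p_horizontal` are choices made for this illustration, not part of the original code. Mapping `y` alone assumes a ggplot2 version that supports it (3.3.0 or later).

```r
library(ggplot2)

# Sample data from the question
df <- data.frame(
  HOUR = c(23, 10, 14, 21, NA, 14),
  TYPE.OF.CRIME = c(rep("ARMED ROBBERY", 5), "ASSAULT GBH")
)

# Only x mapped: vertical bars, bins one hour wide
p_vertical <- ggplot(df, aes(x = HOUR)) +
  geom_histogram(binwidth = 1, na.rm = TRUE)

# Only y mapped: the same histogram drawn with horizontal bars
p_horizontal <- ggplot(df, aes(y = HOUR)) +
  geom_histogram(binwidth = 1, na.rm = TRUE)

print(p_vertical)
print(p_horizontal)
```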

Since you want to visualise the distribution of crimes by hour, map HOUR to x and map TYPE.OF.CRIME to fill so that the bars are coloured by crime type:

library(ggplot2)

ggplot(df, aes(x = HOUR, fill = TYPE.OF.CRIME)) +
  geom_histogram()
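Called without arguments, `geom_histogram()` falls back to 30 bins and prints a message suggesting a better value. Because HOUR is recorded in whole hours, a bin width of one hour is a natural choice. The sketch below assumes that choice; `binwidth = 1`, `center = 0`, `na.rm = TRUE`, and the axis breaks are illustrative additions, not part of the code above.

```r
library(ggplot2)

# Sample data from the question
df <- data.frame(
  HOUR = c(23, 10, 14, 21, NA, 14),
  TYPE.OF.CRIME = c(rep("ARMED ROBBERY", 5), "ASSAULT GBH")
)

p <- ggplot(df, aes(x = HOUR, fill = TYPE.OF.CRIME)) +
  # One-hour bins centred on whole hours; the row with a missing HOUR
  # is dropped without a warning because na.rm = TRUE
  geom_histogram(binwidth = 1, center = 0, na.rm = TRUE) +
  # Tick marks every four hours across the day
  scale_x_continuous(breaks = seq(0, 24, by = 4))

print(p)
```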

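However, for your data I would recommend simply using a bar chart. HOUR takes whole-number values, so geom_bar() counts the rows at each hour directly and no binning is needed: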

ggplot(df, aes(x = HOUR, fill = TYPE.OF.CRIME)) +
  geom_bar()
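As a rough check of what the bars will show, the same per-hour counts can be tabulated in base R. This is a sketch on the question's sample data; the name `counts` is introduced here for illustration only.

```r
# Sample data from the question
df <- data.frame(
  HOUR = c(23, 10, 14, 21, NA, 14),
  TYPE.OF.CRIME = c(rep("ARMED ROBBERY", 5), "ASSAULT GBH")
)

# Cross-tabulate hour by crime type; table() leaves out the NA hour by default
counts <- with(df, table(HOUR, TYPE.OF.CRIME))
print(counts)
```

Each nonzero cell corresponds to one bar segment in the stacked bar chart.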

Upvotes: 1

Edward

Reputation: 19514

Perhaps a density plot would be a better choice, since it lets you compare how the two crime types are distributed over the time of day.

library(ggplot2)
ggplot(df, aes(x=HOUR, fill=TYPE.OF.CRIME)) +
  geom_density(alpha=0.5)


Data:

df <- structure(list(TYPE.OF.CRIME = c("ARMED ROBBERY", "ARMED ROBBERY", 
"ARMED ROBBERY", "ARMED ROBBERY", "ARMED ROBBERY", "ASSAULT GBH", 
"ASSAULT GBH", "ASSAULT GBH", "ASSAULT GBH", "ASSAULT GBH"), 
    WEEK = c(1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L), HOUR = c(23L, 
    10L, 14L, 21L, NA, 14L, 12L, 18L, 17L, 16L), day = c(1L, 
    3L, 7L, 8L, 15L, 3L, 3L, 3L, 3L, 3L), month = c(1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L), year = c(2011L, 2011L, 2011L, 
    2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L)), class = "data.frame", row.names = c(NA, 
-10L))
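One caveat with the density approach: a density estimate needs at least two observations per group, so it suits this larger data set better than the six-row sample in the question, where ASSAULT GBH appears only once. Below is a minimal sketch using only the HOUR and TYPE.OF.CRIME columns of the data above. The `group_sizes` check, `na.rm = TRUE`, and the 0 to 24 axis limits are assumptions added for illustration.

```r
library(ggplot2)

# HOUR and TYPE.OF.CRIME columns from the data above
df <- data.frame(
  TYPE.OF.CRIME = rep(c("ARMED ROBBERY", "ASSAULT GBH"), each = 5),
  HOUR = c(23L, 10L, 14L, 21L, NA, 14L, 12L, 18L, 17L, 16L)
)

# Non-missing observations per crime type; each group needs at least two
group_sizes <- table(df$TYPE.OF.CRIME[!is.na(df$HOUR)])
print(group_sizes)

p <- ggplot(df, aes(x = HOUR, fill = TYPE.OF.CRIME)) +
  # Semi-transparent fills so overlapping curves stay visible
  geom_density(alpha = 0.5, na.rm = TRUE) +
  # Show the full day on the x axis
  scale_x_continuous(limits = c(0, 24), breaks = seq(0, 24, by = 4))

print(p)
```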

Upvotes: 1
