Reputation: 1
I'm working with the mpg data set. I am trying to make a bar graph with the number of cylinders (cyl) on the x-axis and highway miles per gallon (hwy) on the y-axis, using the code below.
ggplot(data = mpg) +
  geom_bar(mapping = aes(x = cyl, y = hwy), stat = "identity")
The hwy values in the data set are mostly between about 20 and 30 mpg, but on my graph the y-axis ranges from 0 to 2000.
Why are the y-values in the graph different from the values in the data?
Upvotes: 0
Views: 745
Reputation: 39595
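The large values come from accumulation: with stat = "identity", geom_bar() draws one bar segment per row and stacks them by default, so the height of each cyl bar is the sum of hwy over every car with that cylinder count. If you add a grouping variable and dodge the bars like this, the y-axis stays within the range of hwy: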
library(tidyverse)
#Code
ggplot(data = mpg, aes(x = factor(cyl), y = hwy, fill = manufacturer)) +
  geom_bar(stat = "identity", position = position_dodge(0.9))
Output:
The values for hwy are now displayed on their original scale, which you can check against the summary of the variable:
#Code
summary(mpg$hwy)
Output:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  12.00   18.00   24.00   23.44   27.00   44.00
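To see the accumulation directly, the sketch below computes the per-cylinder totals that the original stacked bars show, then plots the mean of hwy per cylinder instead. It is only an illustration, not part of the code above: the names hwy_by_cyl, total_hwy, mean_hwy and n_cars are made up for this example, and the fun argument of stat_summary() assumes ggplot2 3.3.0 or later (older versions use fun.y).

```r
library(ggplot2)
library(dplyr)

# Per-cylinder totals and means of hwy.
# total_hwy is the height that geom_bar(stat = "identity") reaches
# when it stacks one segment per row of mpg.
hwy_by_cyl <- mpg %>%
  group_by(cyl) %>%
  summarise(
    total_hwy = sum(hwy),
    mean_hwy  = mean(hwy),
    n_cars    = n()
  )

print(hwy_by_cyl)

# One bar per cylinder count, with height equal to the mean of hwy
# (assumes ggplot2 >= 3.3.0 for the `fun` argument).
p <- ggplot(mpg, aes(x = factor(cyl), y = hwy)) +
  stat_summary(fun = mean, geom = "bar")

print(p)
```

If the mean is not the summary you want, another function such as median can be passed to fun in the same way.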
Another option, if you want to keep only the two variables and analyse their relationship, is to use geom_point() in this way:
#Code 2
ggplot(data = mpg, aes(x = cyl, y = hwy)) +
  geom_point()
Output:
Upvotes: 1