Reputation: 12856
Using ggplot2, I want to generate a plot of the following data.
df=data.frame(score=c(4,2,3,5,7,6,5,6,4,2,3,5,4,8),
age=c(18,18,23,50,19,39,19,23,22,22,40,35,22,16))
str(df)
df
Instead of a frequency plot of the variables (see the code below), I want to plot the average value for each x value, that is, the average score at each age level. At age 18 on the x axis, we might have a 3 on the y axis for score. At age 23, we might have an average score of 4.5, and so forth (Edit: average values corrected). This would ideally be represented with a barplot.
ggplot(df, aes(x=factor(age), y=factor(score))) + geom_bar()
Error: stat_count() must not be used with a y aesthetic.
I'm just not sure how to do this in R with ggplot2 and can't seem to find anything on such plots. Statistically, I don't know if the plot I want is even the right thing to do, but that's a different story.
Upvotes: 34
Views: 109712
Reputation: 41437
Another option is to group_by the x-values and summarise the "mean_score" per "age" using dplyr, all in one pipe. You can also use geom_col instead of geom_bar. Here is a reproducible example:
df=data.frame(score=c(4,2,3,5,7,6,5,6,4,2,3,5,4,8),
age=c(18,18,23,50,19,39,19,23,22,22,40,35,22,16))
library(dplyr)
library(ggplot2)
df %>%
  group_by(age) %>%
  summarise(mean_score = mean(score)) %>%
  ggplot(aes(x = factor(age), y = mean_score)) +
  geom_col() +
  labs(x = "Age", y = "Mean score")
Created on 2022-08-26 with reprex v2.0.2
Upvotes: 0
Reputation: 4133
You can use summary functions in ggplot. Here are two ways of achieving the same result:
# Option 1
ggplot(df, aes(x = factor(age), y = score)) +
  geom_bar(stat = "summary", fun = "mean")
# Option 2
ggplot(df, aes(x = factor(age), y = score)) +
  stat_summary(fun = "mean", geom = "bar")
Older versions of ggplot use fun.y instead of fun:
ggplot(df, aes(x = factor(age), y = score)) +
  stat_summary(fun.y = "mean", geom = "bar")
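To sanity-check the bar heights, the per-age means can be computed directly in base R. This is only a minimal sketch using the data from the question; the variable name `means` is my own and is not part of any answer above.

```r
# Data from the question
df <- data.frame(score = c(4, 2, 3, 5, 7, 6, 5, 6, 4, 2, 3, 5, 4, 8),
                 age   = c(18, 18, 23, 50, 19, 39, 19, 23, 22, 22, 40, 35, 22, 16))

# Mean score for each distinct age; the result is named by age
means <- tapply(df$score, df$age, mean)
print(means)
```

The values for ages 18 and 23 should match the 3 and 4.5 mentioned in the question.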
Upvotes: 73
Reputation: 193637
You can also use aggregate() in base R instead of loading another package.
temp = aggregate(list(score = df$score), list(age = factor(df$age)), mean)
ggplot(temp, aes(x = age, y = score)) + geom_bar(stat = "identity")
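As a variation, aggregate() also has a formula interface, which may read more naturally. The sketch below assumes the same df as in the question and only builds the summary table; note that age stays numeric here, so it would still need factor() when used on a discrete x axis.

```r
# Data from the question
df <- data.frame(score = c(4, 2, 3, 5, 7, 6, 5, 6, 4, 2, 3, 5, 4, 8),
                 age   = c(18, 18, 23, 50, 19, 39, 19, 23, 22, 22, 40, 35, 22, 16))

# One row per age, with the mean score in the score column
temp <- aggregate(score ~ age, data = df, FUN = mean)
print(temp)
```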
Upvotes: 7
Reputation: 14453
If I understood you right, you could try something like this:
library(plyr)
library(ggplot2)
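# Average score per age, then draw the precomputed means as bars
means <- ddply(df, .(age), summarise, score = mean(score))
ggplot(means, aes(x = factor(age), y = score)) + geom_bar(stat = "identity")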
Upvotes: 8