mikebmassey

Reputation: 8584

Bar chart in ggplot not grouping by factor

I'm trying to plot a bar chart in ggplot2 where each level of a factor gets the mean of its observations. However, every bar shows the mean of the entire population instead of the mean for its own group, which is what I want.

Here is the chart: [image]

When I calculate the mean for the groups, there is a difference, which is what I want to plot.

  US      Foreign
1 89.76   124.02

Here is the mean of the entire column in the data frame:

mean(clients$OrderSize)
[1] 96.71

Here is the structure of the dataframe. I have CountryType as a factor, as this is what I want to group by:

str(clients)
'data.frame':   252774 obs. of  4 variables:
$ ClientID     : Factor w/ 252774 levels "58187855","59210128",..: 19 20 21 22 23 24 25 26 27 28 ...
$ Country      : Factor w/ 207 levels "Afghanistan",..: 196 60 139 196 196 40 40 196 196 196 ...
$ CountryType  : Factor w/ 2 levels "Foreign","US": 2 1 1 2 2 1 1 2 2 2 ...
$ OrderSize    : num  12.95 21.99 5.00 7.50 44.5 ...

This is the call I am making:

ggplot(data = clients, aes(x=CountryType, y=mean(OrderSize))) + geom_bar() + ylab("")

I also tried explicitly setting CountryType as a factor, with no luck:

ggplot(data = clients, aes(x=factor(CountryType), y=mean(OrderSize))) + geom_bar() + ylab("")

Do I need to pre-calculate the means for the two groups before I call ggplot or am I missing something?

Upvotes: 1

Views: 978

Answers (1)

joran

Reputation: 173527

Try something more like this:

library(ggplot2)
dat <- data.frame(x = rep(letters[1:2], each = 25), y = 1:50)
ggplot(dat, aes(x = x, y = y)) +
    # fun replaces the deprecated fun.y argument as of ggplot2 3.3.0
    stat_summary(fun = mean, geom = "bar")

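As a general note, avoid idioms like aes(y = value) where value is a single value, rather than the name of a column in your data frame. In your call, mean(OrderSize) is evaluated over the whole column, so it yields one number that is reused for every row, and both bars end up the same height. That's just not how ggplot2 is intended to be used. (Although all rules can be broken in certain circumstances...)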
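
If you would rather pre-calculate the group means, as the question suggests, a minimal sketch might look like the one below. It stands in for the clients data frame with a few invented rows (the values are made up purely for illustration), uses aggregate() from base R, and assumes a ggplot2 version that has geom_col() (2.2.0 or later); on older versions geom_bar(stat = "identity") should do the same job.

```r
library(ggplot2)

# Toy stand-in for the question's `clients` data frame; the values are invented
clients <- data.frame(
  CountryType = factor(c("US", "Foreign", "Foreign", "US", "US", "Foreign")),
  OrderSize   = c(12.95, 21.99, 5.00, 7.50, 44.50, 30.00)
)

# One row per CountryType level, with OrderSize holding that group's mean
group_means <- aggregate(OrderSize ~ CountryType, data = clients, FUN = mean)

# Both aesthetics now map to columns, so each bar gets its own height
p <- ggplot(group_means, aes(x = CountryType, y = OrderSize)) +
  geom_col() +
  ylab("Mean order size")
print(p)
```

Either route should give the same bars; stat_summary() just saves you from keeping a second, aggregated data frame around.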

Upvotes: 4