Giecod

Reputation: 23

using geom_bar to plot the sum of values by criteria in R

I'm new to R and I am trying to use ggplot to create a set of bar charts, one per id, in a single figure. Each bar must represent the sum of the values in column d by month-year (column c). Column d contains both NA and numeric values.

My data frame, df, looks something like this, although it actually has around 10000 rows:

#Example of my data
a=c(1,1,1,1,1,1,1,1,3)
b=as.Date(c("2007-12-03", "2007-12-10", "2007-12-17", "2007-12-24", "2008-01-07", "2008-01-14", "2008-01-21", "2008-01-28","2008-02-04"))
c=format(b,"%m-%Y")
d=c(NA,NA,NA,NA,NA,4.80, 0.00, 5.04, 3.84)
df=data.frame(a,b,c,d)
df

  a          b       c    d
1 1 2007-12-03 12-2007   NA
2 1 2007-12-10 12-2007   NA
3 1 2007-12-17 12-2007   NA
4 1 2007-12-24 12-2007   NA
5 1 2008-01-07 01-2008   NA
6 1 2008-01-14 01-2008 4.80
7 1 2008-01-21 01-2008 0.00
8 1 2008-01-28 01-2008 5.04
9 3 2008-02-04 02-2008 3.84

I tried to do my graph using this:

mplot<-ggplot(df,aes(y=d,x=c))+
       geom_bar()+
       theme(axis.text.x = element_text(angle=90, vjust=0.5))+
       facet_wrap(~ a)

I read from the help of geom_bar():

"geom_bar uses stat_count by default: it counts the number of cases at each x position"

So I thought it would work like that, but I'm getting this error:

Error: stat_count() must not be used with a y aesthetic.

For the sample I'm providing, I would like the graph for id 1 to show the months that contain only NA as empty and 01-2008 with a bar of 9.84. For the second id (3), I would again like the NA months to be empty and 02-2008 to have a bar of 3.84.

I also tried summing the data per month with aggregate and sum before plotting, and then using stat = "identity" in geom_bar, but I get NA for some months and I don't know why.

I would really appreciate your help.

Upvotes: 2

Views: 10624

Answers (3)

Holger Brandl

Reputation: 11202

No need to use geom_col as suggested by @Jan. Simply use the weight aesthetic instead:

ggplot(iris, aes(Species, weight=Sepal.Width)) + geom_bar() + ggtitle("summed sepal width")
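Applied to the data from the question, the same idea might look like the sketch below. It assumes the example data frame from the question (rebuilt here with b as a Date), and it replaces NA with 0 in a hypothetical helper column w so that the result does not depend on how a given ggplot2 version treats missing weights.

```r
library(ggplot2)

# Example data from the question, with b stored as a Date
df <- data.frame(
  a = c(1, 1, 1, 1, 1, 1, 1, 1, 3),
  b = as.Date(c("2007-12-03", "2007-12-10", "2007-12-17", "2007-12-24",
                "2008-01-07", "2008-01-14", "2008-01-21", "2008-01-28",
                "2008-02-04")),
  d = c(NA, NA, NA, NA, NA, 4.80, 0.00, 5.04, 3.84)
)
df$c <- format(df$b, "%m-%Y")

# Hypothetical helper column: NA weights become 0 so they add nothing to a bar
df$w <- ifelse(is.na(df$d), 0, df$d)

# geom_bar() still uses stat_count, but each row now counts with weight w,
# so the bar height is the sum of w at each x position within each facet
p <- ggplot(df, aes(x = c, weight = w)) +
  geom_bar() +
  ylab("sum of d") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  facet_wrap(~ a)

print(p)
```

Months whose values are all NA end up with a bar of height 0, which appears empty in the plot.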

Upvotes: 0

Jan

Reputation: 4216

You should use geom_col not geom_bar. See the help text:

There are two types of bar charts: geom_bar makes the height of the bar proportional to the number of cases in each group (or if the weight aesthetic is supplied, the sum of the weights). If you want the heights of the bars to represent values in the data, use geom_col instead. geom_bar uses stat_count by default: it counts the number of cases at each x position. geom_col uses stat_identity: it leaves the data as is.

So your final line of code should be:

ggplot(df, aes(y=d, x=c)) + geom_col() + theme(axis.text.x = element_text(angle=90, vjust=0.5))+facet_wrap(~ a)
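The NA values the question reports after aggregate are most likely because sum() returns NA as soon as any input is NA, unless na.rm = TRUE is passed. A minimal sketch of aggregating first and then plotting with geom_col, assuming the question's example data (rebuilt here with b as a Date) and a hypothetical intermediate data frame named monthly:

```r
library(ggplot2)

# Example data from the question, with b stored as a Date
df <- data.frame(
  a = c(1, 1, 1, 1, 1, 1, 1, 1, 3),
  b = as.Date(c("2007-12-03", "2007-12-10", "2007-12-17", "2007-12-24",
                "2008-01-07", "2008-01-14", "2008-01-21", "2008-01-28",
                "2008-02-04")),
  d = c(NA, NA, NA, NA, NA, 4.80, 0.00, 5.04, 3.84)
)
df$c <- format(df$b, "%m-%Y")

# Sum d per id and month; na.rm = TRUE is forwarded to sum(),
# so a single NA no longer turns the whole month into NA
monthly <- aggregate(df["d"], by = list(a = df$a, c = df$c),
                     FUN = sum, na.rm = TRUE)

# One bar per (id, month), with the height taken directly from the summed value
p <- ggplot(monthly, aes(x = c, y = d)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) +
  facet_wrap(~ a)

print(p)
```

With na.rm = TRUE, a month that contains only NA sums to 0, so its bar is drawn with zero height and appears empty.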

Upvotes: 1

AK88

Reputation: 3026

Do you want something like this?

mplot = ggplot(df, aes(x = b, y = d))+
  geom_bar(stat = "identity", position = "dodge")+
  facet_wrap(~ a)

mplot


I am using x = b instead of x = c for now.
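One thing to watch when switching back to x = c: the "%m-%Y" labels are plain text, so a discrete axis sorts them alphabetically and 01-2008 is drawn before 12-2007. A possible sketch that keeps the month labels but orders them chronologically, assuming the question's example data with b stored as a Date:

```r
library(ggplot2)

# Example data from the question, with b stored as a Date
df <- data.frame(
  a = c(1, 1, 1, 1, 1, 1, 1, 1, 3),
  b = as.Date(c("2007-12-03", "2007-12-10", "2007-12-17", "2007-12-24",
                "2008-01-07", "2008-01-14", "2008-01-21", "2008-01-28",
                "2008-02-04")),
  d = c(NA, NA, NA, NA, NA, 4.80, 0.00, 5.04, 3.84)
)
df$c <- format(df$b, "%m-%Y")

# Turn the labels into a factor whose levels follow the dates in b,
# so the x axis runs in calendar order instead of alphabetical order
df$c <- factor(df$c, levels = unique(df$c[order(df$b)]))

# Rows with NA in d are dropped by geom_col (with a warning);
# the remaining values for the same month are stacked into one bar
mplot <- ggplot(df, aes(x = c, y = d)) +
  geom_col() +
  facet_wrap(~ a)

print(mplot)
```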

Upvotes: 1
