Daniel V

Reputation: 1382

Plotting sum of fields in ggplot

I have the following code for graphing the sum of a series of numerical columns. This should return a stacked bar chart where the contribution of each column is a different colour.

library(readxl)
library(tidyverse)
library(ggthemes)
library(extrafont)
library(RColorBrewer)
library(scales)
library(gridExtra)

ggplot(data, aes(x = `Location Group`, 
                 y = Medical + Wages + `Rehab Cum` + `Invest Cum`,
                 fill = variable)) +
  geom_bar(stat = "identity")

This fails with the error: Error in FUN(X[[i]], ...) : object 'variable' not found

I'm not sure what would cause this; the code could easily have been copied and pasted from one of a hundred other cases here. Libraries are included in case of conflicts (but I doubt that is the cause).

Sample data would be

  Medical Wages `Rehab Cum` `Invest Cum`
    <dbl> <dbl>       <dbl>        <dbl>
1    1230 10360        1234          200
2     245  9782        2345          300
3    2234  6542        3456            0
4    5564  1234        4567          400
5      13   357           0            0
6     987   951           0            0

Upvotes: 1

Views: 847

Answers (1)

jimjamslam

Reputation: 2067

The problem is that ggplot2 can't find anything called variable: there is no column with that name in data. The key to ggplot2 is remembering that each aspect of your plot should be represented by a column in your data.

So in this case, you don't need to add four different columns together in your y mapping: ggplot2 automatically stacks bars that share the same x value (geom_bar has the default position = "stack"). Instead, you want one column in your data for the y value and another for the colour each part of the bar should be (fill).

Using fill = variable is correct: you want the bars to be shaded according to which variable is being plotted. But variable needs to actually be a column in your dataset. So you want it to look more like this:

`Location Group`        variable        value
---------------------------------------------
location1               Medical         20
location1               Wages           30
location1               Rehab Cum       45
location1               Invest Cum      60
location2               Medical          5
location2               Wages           15
location2               Rehab Cum       55
location2               Invest Cum      90

Then x is mapped to Location Group, y is mapped to value and fill is mapped to variable.
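To make that mapping concrete, here is a minimal sketch that builds the illustrative table above by hand and plots it. The numbers are only the made-up values from that table, not your real data, and the object names long, p and totals are just placeholders I picked.

```r
library(ggplot2)

# Illustrative values copied from the table above (not real data)
long <- data.frame(
  `Location Group` = rep(c("location1", "location2"), each = 4),
  variable = rep(c("Medical", "Wages", "Rehab Cum", "Invest Cum"), times = 2),
  value = c(20, 30, 45, 60, 5, 15, 55, 90),
  check.names = FALSE  # keep the space in the column name
)

# One row per bar segment; segments sharing an x value are stacked
p <- ggplot(long, aes(x = `Location Group`, y = value, fill = variable)) +
  geom_bar(stat = "identity")

# The height of each stacked bar is the sum of its segments
totals <- tapply(long$value, long$`Location Group`, sum)
```

So the total you were trying to compute in y (Medical + Wages + ...) falls out of the stacking itself: each bar's height is the per-location sum.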

You can get your data into this shape using gather() from tidyr:

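library(tidyr)
library(ggplot2)

# Reshape to long format: one row per location/variable pair
data <- data %>%
  gather(variable, value, Medical, Wages, `Rehab Cum`, `Invest Cum`)

# Rows sharing a `Location Group` are stacked, one colour per variable
ggplot(data, aes(x = `Location Group`, y = value, fill = variable)) +
  geom_bar(stat = "identity")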
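In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(), so the same reshape might look like the sketch below. It uses the sample values from the question; the `Location Group` labels ("A", "B", "C") are invented for illustration, because the sample data doesn't show that column. geom_col() is used as shorthand for geom_bar(stat = "identity").

```r
library(tidyr)
library(ggplot2)

# Sample values from the question. The `Location Group` labels are
# hypothetical: the original sample does not include that column.
data <- data.frame(
  `Location Group` = c("A", "A", "B", "B", "C", "C"),
  Medical      = c(1230, 245, 2234, 5564, 13, 987),
  Wages        = c(10360, 9782, 6542, 1234, 357, 951),
  `Rehab Cum`  = c(1234, 2345, 3456, 4567, 0, 0),
  `Invest Cum` = c(200, 300, 0, 400, 0, 0),
  check.names = FALSE  # keep spaces in column names
)

# Wide to long: the four cost columns become variable/value pairs
long <- pivot_longer(
  data,
  cols = c(Medical, Wages, `Rehab Cum`, `Invest Cum`),
  names_to = "variable",
  values_to = "value"
)

# geom_col() plots y values as given and stacks them by default
p <- ggplot(long, aes(x = `Location Group`, y = value, fill = variable)) +
  geom_col()
```

Rows that share a `Location Group` are summed into one stacked bar, so you don't need to aggregate before plotting.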

Upvotes: 2
