Jorge Anariba

Reputation: 29

Making multiple plots in R dependent on two different categorical variables

I'm new to R and I'm trying to generate multiple bar plots in a single graph in RStudio, using a dataset with more than 1000 observations of several variables. Below is a fragment of the dataset:

Municipality    Production  Type
Atima           690         Reverification
Atima           120         Reverification
Atima           220         Reverification
Comayagua       153         Initial
Comayagua       193         Initial
Comayagua       138         Initial
Comayagua       307         Reverification
Copán           179         Initial
Copán           100         Initial
Copán           236         Reverification
Copán           141         Reverification
Danlí            56         Reverification
...

The dataset's structure is

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   1543 obs. of  3 variables:
$ Municipality    : chr  "Atima" "Atima" "Atima" "Comayagua" ...
$ Production      : num  98 690 153 307 179 ...
$ Type            : chr  "Reverification" "Reverification" "Reverification" "Initial" ...

What I would like to produce is a bar plot with one pair of bars per municipality: one bar showing how much production the municipality had in "Initial" and another showing how much it had in "Reverification".

I've tried with various commands, such as barplot, barchart and ggplot, but so far without success.

Should I split the Type variable into two variables, one for each category? I've also tried to plot only the production for a single type, and received the following message:

barplot(table(dataset$Production[dataset$Type=="Initial"]), names.arg = Municipality)
Error in barplot.default(dataset$Production[dataset$Type=="Initial"]), names.arg = 
Municipality, : incorrect number of names

I'm working in RStudio version 0.99.441 on Windows 7.

Thanks in advance for your help.

Upvotes: 1

Views: 516

Answers (1)

grrgrrbla

Reputation: 2589

Try this, which sums the production per municipality and type and then plots the sums as side-by-side bars:

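library(ggplot2)
library(data.table)

# Sum Production within each Municipality/Type combination
# (df is the data frame defined below)
df_s <-
    as.data.table(df)[ , .("Production_Sum" = sum(Production)),
                      by = .(Municipality, Type)]

# Draw one dodged bar per Type within each Municipality
ggplot(df_s, aes(x = Municipality, y = Production_Sum, fill = Type)) +
    geom_bar(stat = "identity", position = position_dodge())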


I am using the following data, taken from your question:

df <- read.table(header = TRUE, text = "Municipality    Production  Type
Atima           690         Reverification
Atima           120         Reverification
Atima           220         Reverification
Comayagua       153         Initial
Comayagua       193         Initial
Comayagua       138         Initial
Comayagua       307         Reverification
Copán           179         Initial
Copán           100         Initial
Copán           236         Reverification
Copán           141         Reverification
Danlí            56         Reverification
")

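If you would rather stay in base R, here is a minimal sketch of the same idea. The error in your question most likely comes from `table()`, which counts how often each distinct production value occurs, so the number of bars no longer matches the number of names. Summing with `xtabs()` instead gives one value per type and municipality. The object names `prod_df` and `totals` are my own choices for illustration, and the data are only the sample rows shown above.

```r
# Sample rows from the question; `prod_df` is an illustrative name.
prod_df <- data.frame(
  Municipality = c("Atima", "Atima", "Atima",
                   "Comayagua", "Comayagua", "Comayagua", "Comayagua",
                   "Copán", "Copán", "Copán", "Copán", "Danlí"),
  Production = c(690, 120, 220, 153, 193, 138, 307, 179, 100, 236, 141, 56),
  Type = c(rep("Reverification", 3), rep("Initial", 3), "Reverification",
           "Initial", "Initial", "Reverification", "Reverification",
           "Reverification"),
  stringsAsFactors = FALSE
)

# Sum Production for every Type x Municipality combination.
# Rows are types, columns are municipalities; combinations with
# no observations are filled with 0.
totals <- xtabs(Production ~ Type + Municipality, data = prod_df)

# beside = TRUE places the bars for each type next to each other
# within every municipality instead of stacking them.
barplot(totals, beside = TRUE, legend.text = rownames(totals),
        xlab = "Municipality", ylab = "Total production")
```

Because the table already holds one row per type, there is no need to split Type into two separate variables first.
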
Upvotes: 1
