Reputation: 5141
I've been asked to make a bar plot from pollution data. Example data can be found here. The data structure is as follows:
str(datos)
'data.frame': 55 obs. of 10 variables:
$ PROVINCIA : int 46 46 46 46 46 46 46 46 46 46 ...
$ ESTACION : Factor w/ 55 levels "Alacant-El_Pla",..: 5 1 2 3 8 23 24 21 31 22 ...
$ MAXIMO_HORARIO : num 99.5 88.5 88.5 90 97.5 87.3 96 92.5 88 20 ...
$ PROMEDIO_DIARIO : num NA NA NA NA NA NA NA NA NA NA ...
$ MAXIMO_OCTOHORARIO : num 103.9 83.1 80.9 75.7 95.1 ...
$ VARIACION_MAX_HOR : num -25.2 -6.5 -6.7 -1.2 -13.2 -15.4 -12.7 -29.5 -16.3 NA ...
$ VARIACION_PRM_DIA : num NA NA NA NA NA NA NA NA NA NA ...
$ OSCILACION_DIARIO : num 16.5 63.7 53.3 62 26.8 31.3 29.2 15 52 20 ...
$ ESTACIONALIDAD_MAX : num -38.2 -39.6 -36.8 -38.8 -37.6 -51.8 -35.6 -40.3 -42.9 -86.5 ...
$ ESTACIONALIDAD_MAX-1: num NA NA NA NA NA NA NA NA NA NA ...
I tried ggplot2's geom_bar with faceting, using the following code:
library(ggplot2)
# -99.9 is the missing-value code in the CSV
datos <- read.csv("data.csv", header = TRUE, sep = ",", na.strings = "-99.9")
ggplot(datos, aes(ESTACION, MAXIMO_HORARIO, fill = factor(MAXIMO_HORARIO))) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, size = 10)) +
  facet_grid(PROVINCIA ~ .)
which produces this output:
This is close, but I would like each facet (group) to show only its own stations, with the correct labels on each panel, rather than empty space for stations that belong to another facet. I could split the data into three parts and produce three separate plots, but I'd like to end up with a single file containing all three plots.
The desired output would look like this:
EDIT: Output of dput(datos):
> dput(datos)
structure(list(PROVINCIA = c(46L, 46L, 46L, 46L, 46L, 46L, 46L,
46L, 46L, 46L, 46L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L,
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L), ESTACION = structure(c(5L, 1L, 2L, 3L,
8L, 23L, 24L, 21L, 31L, 22L, 41L, 27L, 12L, 13L, 14L, 15L, 16L,
18L, 28L, 29L, 19L, 37L, 39L, 26L, 49L, 52L, 53L, 54L, 55L, 4L,
7L, 6L, 9L, 10L, 11L, 17L, 20L, 33L, 25L, 30L, 32L, 36L, 35L,
34L, 38L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 50L, 51L, 40L), .Label = c("Alacant-El_Pla",
"Alacant-Florida_Babel", "Alacant-Rabassa", "Albalat_dels_Tarongers",
"Alcoi-Verge_dels_Lliris", "Algar_de_Pal", "Alzira", "Benidorm",
"Benig", "Bull", "Burjassot-Facultats", "Burriana", "Castell1",
"Castell2", "Castell3", "Castell4", "Caudete_de_las_Fuentes",
"Cirat", "Coratxar", "Cortes_de_Pall", "Elda-Lacy", "El_Pin",
"Elx-Agroalimentari", "Elx-Parc_de_Bombers", "Gandia", "La_Vall_d",
"Lluce", "Morella", "Onda", "Ontinyent", "Orihuela", "Paterna-CEAM",
"Quart_de_Poblet", "Sagunt-CEA", "Sagunt-Nord", "Sagunt-Port",
"Sant_Jordi", "Torrebaja", "Torre_Endom", "Torrent-El_Vedat",
"Torrevieja", "Val1", "Val2", "Val3", "Val4", "Val5", "Val6",
"Val7", "Vilafranca", "Vilamarxant", "Villar_del_Arzobispo",
"Vinaros", "VinarosP", "Viver", "Zorita"), class = "factor"),
MAXIMO_HORARIO = c(99.5, 88.5, 88.5, 90, 97.5, 87.3, 96,
92.5, 88, 20, 20, 81.5, 99, 91.7, 93.5, 81.5, 90.5, 84.5,
100.3, 96.3, 41.7, 91.5, 57.3, NA, 93, 111.5, 86.8, NA, 100.3,
21.9, 80.5, 111, 98.7, 87.3, 89.7, 87.5, 41.7, 81.7, NA,
20, 84.8, 92, 88.7, NA, 74, NA, 95, 20.5, 85.7, 80, 82.3,
76, 20, 90.8, NA), PROMEDIO_DIARIO = c(NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 21.9, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA), MAXIMO_OCTOHORARIO = c(103.9, 83.1,
80.9, 75.7, 95.1, 82.9, 90.2, 83.5, 85, NA, NA, 77.1, 76.7,
91.4, 73.1, 65.1, 96.6, 81.1, 110.5, 91.1, NA, 87.8, 54.8,
NA, 95.1, 116.8, 79.9, NA, 107.2, 73.9, 70.5, 102.8, 100.5,
77.5, 80.9, 86.9, NA, 70.5, NA, NA, 73.5, 86.9, 86, NA, 83.5,
NA, 84.5, 20.5, 90.8, 71.5, 67.5, 64.5, NA, 91.4, NA), VARIACION_MAX_HOR = c(-25.2,
-6.5, -6.7, -1.2, -13.2, -15.4, -12.7, -29.5, -16.3, NA,
NA, -32.5, -11.5, -22.3, -19.5, -22.3, -25.3, -24.7, -14.7,
-18, NA, -12.8, -36, NA, -27.3, -11.4, -15.7, NA, -21.4,
-103.6, -26, -24.5, -33.1, -30, -31, -17.8, NA, -15.1, NA,
NA, -23.5, -32.5, -16.1, NA, -32.3, NA, -28.2, 0.3, -30.5,
-17.3, -18.4, -19.7, NA, -31.2, NA), VARIACION_PRM_DIA = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), OSCILACION_DIARIO = c(16.5,
63.7, 53.3, 62, 26.8, 31.3, 29.2, 15, 52, 20, 20, 51.8, 85.7,
27.5, 80, 74.8, 45, 48.3, 12.5, 21.6, 41.7, 41.8, 35.3, NA,
26.5, 27.1, 64.2, NA, 58.6, 3.9, 39.2, 39.3, 32.9, 22.6,
43.4, 17.3, 41.7, 46.9, NA, 20, 50.8, 58.2, 64.5, NA, 2.7,
NA, 40.2, 1.5, 25.9, 30.5, 58.6, 31, 20, 15.8, NA), ESTACIONALIDAD_MAX = c(-38.2,
-39.6, -36.8, -38.8, -37.6, -51.8, -35.6, -40.3, -42.9, -86.5,
-83.6, -50.6, -35, -46.8, -45, -57.1, -31.4, -49.7, -35.5,
-45.7, -75.2, -44.1, -62.6, NA, -48.4, -10.8, -39.3, NA,
-38.1, -86.4, -53.7, -16.5, -42.3, -42.2, -38.1, -48.7, -68.2,
-45.4, NA, -87.6, -43.8, -44.2, -43.1, NA, -55.5, NA, -33.1,
-86.1, -38.3, -44.4, -41.6, -38.2, -85.5, -50.1, NA), ESTACIONALIDAD_MAX.1 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, -71.11,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("PROVINCIA",
"ESTACION", "MAXIMO_HORARIO", "PROMEDIO_DIARIO", "MAXIMO_OCTOHORARIO",
"VARIACION_MAX_HOR", "VARIACION_PRM_DIA", "OSCILACION_DIARIO",
"ESTACIONALIDAD_MAX", "ESTACIONALIDAD_MAX.1"), class = "data.frame", row.names = c(NA,
-55L))
Upvotes: 1
Views: 126
Reputation: 206526
Sounds like you want facet_wrap rather than facet_grid. Try
ggplot(datos, aes(ESTACION, MAXIMO_HORARIO, fill = factor(MAXIMO_HORARIO))) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, size = 10)) +
  facet_wrap(~ PROVINCIA, scales = "free", ncol = 1)
to get one panel per province, each with its own axes.
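To see the effect in isolation, here is a minimal sketch on made-up data (the `toy` data frame and its values are assumptions for illustration, not the asker's data). It uses `geom_col()`, which is shorthand for `geom_bar(stat = "identity")`, and `scales = "free_x"`, a variant of `scales = "free"` that frees only the x-axis so the bar heights stay comparable across panels.

```r
library(ggplot2)

# Toy stand-in for the real data (hypothetical values):
# three provinces, each with its own three stations
toy <- data.frame(
  PROVINCIA      = rep(c(3, 12, 46), each = 3),
  ESTACION       = factor(paste0("st", 1:9)),
  MAXIMO_HORARIO = c(80, 95, 60, 70, 88, 91, 99, 85, 77)
)

p <- ggplot(toy, aes(ESTACION, MAXIMO_HORARIO)) +
  geom_col() +  # same as geom_bar(stat = "identity")
  theme(axis.text.x = element_text(angle = 90, size = 10)) +
  # one panel per province, stacked in one column;
  # "free_x" lets each panel keep only its own stations on the x-axis
  facet_wrap(~ PROVINCIA, scales = "free_x", ncol = 1)

print(p)
```

Whether to use `"free"` or `"free_x"` depends on whether the y-axis should be shared: with `"free"` each panel also gets its own y range.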
Upvotes: 5
Reputation: 4824
facet_grid() is not designed for what you want. Making the three plots separately is the right approach. But with the gridExtra package it is easy to combine these plot elements (the gridExtra package calls them "grobs") into a single plot or single file.
library(ggplot2)
library(gridExtra)
# toy data
dat <- data.frame(x = 1:20,
                  y = sample(1:20, size = 20, replace = TRUE),
                  group = sample(1:3, size = 20, replace = TRUE))
# making each plot ("grob") from one subset of the data
p1 <- ggplot(subset(dat, group == 1), aes(factor(x), y)) +
  geom_bar(stat = 'identity')
p2 <- ggplot(subset(dat, group == 2), aes(factor(x), y)) +
  geom_bar(stat = 'identity')
p3 <- ggplot(subset(dat, group == 3), aes(factor(x), y)) +
  geom_bar(stat = 'identity')
# combine them into a single stack of plots;
# grid.arrange() draws the result immediately
grid.arrange(p1, p2, p3, ncol = 1)
Note that for this approach to work, the x-variable in the parent data.frame has to be a string or a numeric, not a factor. (For numerics, you have to convert it to a factor after subsetting: that's the only way ggplot2 will know that you don't want to show the gaps where each subset has no data. For strings, this isn't a problem, and the x-axis doesn't need to be a factor at any point.)
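Since the question asks for a single file, the same idea can be sketched without repeating the plotting code for each group. This is only an illustration under a few assumptions: the toy data is made up, the output file name is hypothetical, and `arrangeGrob()` is used in place of `grid.arrange()` because it builds the combined layout without drawing it, so it can be passed to `ggsave()`.

```r
library(ggplot2)
library(gridExtra)

set.seed(1)
# toy data (hypothetical); rep() guarantees all three groups are present
dat <- data.frame(x = 1:20,
                  y = sample(1:20, size = 20, replace = TRUE),
                  group = rep(1:3, length.out = 20))

# one plot per group; factor(x) is applied after splitting,
# so each plot shows only the x values present in its own subset
plots <- lapply(split(dat, dat$group), function(d) {
  ggplot(d, aes(factor(x), y)) +
    geom_col() +
    ggtitle(paste("group", d$group[1]))
})

# stack the plots in one column without drawing them
stacked <- arrangeGrob(grobs = plots, ncol = 1)

# write everything to a single file (hypothetical name and size)
out_file <- file.path(tempdir(), "stacked_plots.pdf")
ggsave(out_file, stacked, width = 6, height = 9)
```

To view the stack on screen instead of saving it, `grid.arrange(grobs = plots, ncol = 1)` draws the same layout directly.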
Upvotes: 2