pacomet

Reputation: 5141

R ggplot2 grid labeling

I've been asked to make a bar plot from pollution data. Example data can be found here. The data structure is as follows:

str(datos)
'data.frame':    55 obs. of  10 variables:
 $ PROVINCIA           : int  46 46 46 46 46 46 46 46 46 46 ...
 $ ESTACION            : Factor w/ 55 levels "Alacant-El_Pla",..: 5 1 2 3 8 23 24 21 31 22 ...
 $ MAXIMO_HORARIO      : num  99.5 88.5 88.5 90 97.5 87.3 96 92.5 88 20 ...
 $ PROMEDIO_DIARIO     : num  NA NA NA NA NA NA NA NA NA NA ...
 $ MAXIMO_OCTOHORARIO  : num  103.9 83.1 80.9 75.7 95.1 ...
 $ VARIACION_MAX_HOR   : num  -25.2 -6.5 -6.7 -1.2 -13.2 -15.4 -12.7 -29.5 -16.3 NA ...
 $ VARIACION_PRM_DIA   : num  NA NA NA NA NA NA NA NA NA NA ...
 $ OSCILACION_DIARIO   : num  16.5 63.7 53.3 62 26.8 31.3 29.2 15 52 20 ...
 $ ESTACIONALIDAD_MAX  : num  -38.2 -39.6 -36.8 -38.8 -37.6 -51.8 -35.6 -40.3 -42.9 -86.5 ...
 $ ESTACIONALIDAD_MAX.1: num  NA NA NA NA NA NA NA NA NA NA ...

I tried ggplot2's geom_bar with faceting, using the following code:

library(ggplot2)

# Read the data, treating -99.9 as missing
datos <- read.csv("data.csv", header = TRUE, sep = ",", na.strings = "-99.9")

ggplot(datos, aes(ESTACION, MAXIMO_HORARIO, fill = factor(MAXIMO_HORARIO))) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, size = 10)) +
  facet_grid(PROVINCIA ~ .)

This produces the following output:

[image: current output]

This is close, but I would like each facet (group) to show only its own values, without the empty space that corresponds to data in another facet, and with the correct labels on each panel. I could split the data into three parts and produce three separate plots, but I'd like to build a single file with the three plots in it.

The desired output would look like this:

[image: desired output]

EDIT: Output of dput(datos):

dput(datos)
structure(list(PROVINCIA = c(46L, 46L, 46L, 46L, 46L, 46L, 46L, 
46L, 46L, 46L, 46L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L), ESTACION = structure(c(5L, 1L, 2L, 3L, 
8L, 23L, 24L, 21L, 31L, 22L, 41L, 27L, 12L, 13L, 14L, 15L, 16L, 
18L, 28L, 29L, 19L, 37L, 39L, 26L, 49L, 52L, 53L, 54L, 55L, 4L, 
7L, 6L, 9L, 10L, 11L, 17L, 20L, 33L, 25L, 30L, 32L, 36L, 35L, 
34L, 38L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 50L, 51L, 40L), .Label = c("Alacant-El_Pla", 
"Alacant-Florida_Babel", "Alacant-Rabassa", "Albalat_dels_Tarongers", 
"Alcoi-Verge_dels_Lliris", "Algar_de_Pal", "Alzira", "Benidorm", 
"Benig", "Bull", "Burjassot-Facultats", "Burriana", "Castell1", 
"Castell2", "Castell3", "Castell4", "Caudete_de_las_Fuentes", 
"Cirat", "Coratxar", "Cortes_de_Pall", "Elda-Lacy", "El_Pin", 
"Elx-Agroalimentari", "Elx-Parc_de_Bombers", "Gandia", "La_Vall_d", 
"Lluce", "Morella", "Onda", "Ontinyent", "Orihuela", "Paterna-CEAM", 
"Quart_de_Poblet", "Sagunt-CEA", "Sagunt-Nord", "Sagunt-Port", 
"Sant_Jordi", "Torrebaja", "Torre_Endom", "Torrent-El_Vedat", 
"Torrevieja", "Val1", "Val2", "Val3", "Val4", "Val5", "Val6", 
"Val7", "Vilafranca", "Vilamarxant", "Villar_del_Arzobispo", 
"Vinaros", "VinarosP", "Viver", "Zorita"), class = "factor"), 
    MAXIMO_HORARIO = c(99.5, 88.5, 88.5, 90, 97.5, 87.3, 96, 
    92.5, 88, 20, 20, 81.5, 99, 91.7, 93.5, 81.5, 90.5, 84.5, 
    100.3, 96.3, 41.7, 91.5, 57.3, NA, 93, 111.5, 86.8, NA, 100.3, 
    21.9, 80.5, 111, 98.7, 87.3, 89.7, 87.5, 41.7, 81.7, NA, 
    20, 84.8, 92, 88.7, NA, 74, NA, 95, 20.5, 85.7, 80, 82.3, 
    76, 20, 90.8, NA), PROMEDIO_DIARIO = c(NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, 21.9, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA), MAXIMO_OCTOHORARIO = c(103.9, 83.1, 
    80.9, 75.7, 95.1, 82.9, 90.2, 83.5, 85, NA, NA, 77.1, 76.7, 
    91.4, 73.1, 65.1, 96.6, 81.1, 110.5, 91.1, NA, 87.8, 54.8, 
    NA, 95.1, 116.8, 79.9, NA, 107.2, 73.9, 70.5, 102.8, 100.5, 
    77.5, 80.9, 86.9, NA, 70.5, NA, NA, 73.5, 86.9, 86, NA, 83.5, 
    NA, 84.5, 20.5, 90.8, 71.5, 67.5, 64.5, NA, 91.4, NA), VARIACION_MAX_HOR = c(-25.2, 
    -6.5, -6.7, -1.2, -13.2, -15.4, -12.7, -29.5, -16.3, NA, 
    NA, -32.5, -11.5, -22.3, -19.5, -22.3, -25.3, -24.7, -14.7, 
    -18, NA, -12.8, -36, NA, -27.3, -11.4, -15.7, NA, -21.4, 
    -103.6, -26, -24.5, -33.1, -30, -31, -17.8, NA, -15.1, NA, 
    NA, -23.5, -32.5, -16.1, NA, -32.3, NA, -28.2, 0.3, -30.5, 
    -17.3, -18.4, -19.7, NA, -31.2, NA), VARIACION_PRM_DIA = c(NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA), OSCILACION_DIARIO = c(16.5, 
    63.7, 53.3, 62, 26.8, 31.3, 29.2, 15, 52, 20, 20, 51.8, 85.7, 
    27.5, 80, 74.8, 45, 48.3, 12.5, 21.6, 41.7, 41.8, 35.3, NA, 
    26.5, 27.1, 64.2, NA, 58.6, 3.9, 39.2, 39.3, 32.9, 22.6, 
    43.4, 17.3, 41.7, 46.9, NA, 20, 50.8, 58.2, 64.5, NA, 2.7, 
    NA, 40.2, 1.5, 25.9, 30.5, 58.6, 31, 20, 15.8, NA), ESTACIONALIDAD_MAX = c(-38.2, 
    -39.6, -36.8, -38.8, -37.6, -51.8, -35.6, -40.3, -42.9, -86.5, 
    -83.6, -50.6, -35, -46.8, -45, -57.1, -31.4, -49.7, -35.5, 
    -45.7, -75.2, -44.1, -62.6, NA, -48.4, -10.8, -39.3, NA, 
    -38.1, -86.4, -53.7, -16.5, -42.3, -42.2, -38.1, -48.7, -68.2, 
    -45.4, NA, -87.6, -43.8, -44.2, -43.1, NA, -55.5, NA, -33.1, 
    -86.1, -38.3, -44.4, -41.6, -38.2, -85.5, -50.1, NA), ESTACIONALIDAD_MAX.1 = c(NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, -71.11, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("PROVINCIA", 
"ESTACION", "MAXIMO_HORARIO", "PROMEDIO_DIARIO", "MAXIMO_OCTOHORARIO", 
"VARIACION_MAX_HOR", "VARIACION_PRM_DIA", "OSCILACION_DIARIO", 
"ESTACIONALIDAD_MAX", "ESTACIONALIDAD_MAX.1"), class = "data.frame", row.names = c(NA, 
-55L))

Upvotes: 1

Views: 126

Answers (2)

MrFlick

Reputation: 206526

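It sounds like you want facet_wrap() rather than facet_grid(). With facet_grid(PROVINCIA ~ .), all panels in the column share one x-axis, whereas facet_wrap() with scales="free" gives each panel its own axes, so only the stations in that province appear. Try: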

ggplot(datos, aes(ESTACION,MAXIMO_HORARIO, fill = factor(MAXIMO_HORARIO))) +
  geom_bar(stat="identity") +
  theme(axis.text.x  = element_text(angle=90, size=10)) +
  facet_wrap(~PROVINCIA , scales="free", ncol=1)

to get

[image: resulting plot]
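
A variation that may be useful: scales also accepts "free_x". Each panel still gets its own x-axis, but the y-axis is shared, which can make bar heights easier to compare across panels. The following is only a minimal sketch on made-up toy data; the toy data frame and its station and province values are hypothetical, not the asker's data.

```r
library(ggplot2)

# Toy data (hypothetical): nine stations spread over three provinces
toy <- data.frame(
  station  = factor(paste0("st", 1:9)),
  province = rep(c(3, 12, 46), each = 3),
  value    = c(80, 95, 20, 99, 91, 41, 88, 111, 76)
)

# Each panel shows only its own stations (free x) on a common y scale
p <- ggplot(toy, aes(station, value)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, size = 10)) +
  facet_wrap(~ province, scales = "free_x", ncol = 1)
print(p)
```

Whether "free" or "free_x" reads better depends on whether the panels are meant to be compared with each other.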

Upvotes: 5

Curt F.

Reputation: 4824

facet_grid() is not designed for what you want. Making the three plots separately is the right approach. But with the gridExtra package it is easy to combine these plot elements (the gridExtra package calls them "grobs") into a single plot or single file.

library(ggplot2)
library(gridExtra)

# toy data
dat <- data.frame(x = 1:20, y = sample(1:20, size = 20, replace = TRUE), group = sample(1:3, size = 20, replace = TRUE))

# make each "grob"
p1 <- ggplot(subset(dat, group == 1), aes(factor(x), y)) +
         geom_bar(stat = "identity")
p2 <- ggplot(subset(dat, group == 2), aes(factor(x), y)) +
         geom_bar(stat = "identity")
p3 <- ggplot(subset(dat, group == 3), aes(factor(x), y)) +
         geom_bar(stat = "identity")

# combine them into a single stack of plots (grid.arrange() draws immediately)
grid.arrange(p1, p2, p3, ncol = 1)

Note that for this approach to work, the x-variable in the parent data.frame will have to be a string or a numeric, not a factor. (For numerics, you have to convert it to a factor after subsetting: that's the only way ggplot2 will know that you don't want to show the gaps where each subset has no data. For strings, this won't be a problem, and the x-axis doesn't need to be a factor at any point.)
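
If the x-variable is already a factor, as ESTACION is in the question, one possible way to apply this approach is to split the data by group and call droplevels() on each piece, so that levels unused in that subset are removed explicitly. The sketch below uses made-up toy data (the station and group names are hypothetical) and assumes gridExtra 2.0 or later, where arrangeGrob() accepts a grobs list.

```r
library(ggplot2)
library(gridExtra)

# Toy data (hypothetical): a factor x-variable spread over three groups
dat <- data.frame(
  station = factor(paste0("st", 1:9)),
  value   = c(5, 3, 8, 2, 7, 4, 6, 1, 9),
  group   = rep(c("A", "B", "C"), each = 3)
)

# One plot per group; droplevels() removes factor levels unused in that subset
plots <- lapply(split(dat, dat$group), function(d) {
  ggplot(droplevels(d), aes(station, value)) +
    geom_bar(stat = "identity") +
    ggtitle(as.character(d$group[1]))
})

# Build the combined layout without drawing it, then draw it on the current device
combined <- arrangeGrob(grobs = plots, ncol = 1)
grid::grid.draw(combined)
```

Because combined is an ordinary object rather than a side effect of drawing, it can be passed on to whatever is used to write the stack to a single file; with recent ggplot2 versions, ggsave("plots.png", combined) should work, though that is an assumption about the installed version.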

Upvotes: 2
