How to produce multiple ggplot plots with a for loop in R?

I am not very experienced with the ggplot2 package, and I would like to understand the reason for the following error. I am trying to draw several ggplot graphs by iterating with a for loop.

The column names look like this:

> head(colnames(basefinal))

[1] "1. Nombre de la empresa"                                  
[2] "2.Municipio de ubicación de la Empresa"                   
[3] "3. ¿Qué rol desempeña usted dentro de la Empresa?"        
[4] "4. El tipo de Sociedad de la Empresa familiar es: *"      
[5] "5.¿Cuál es el número de empleados de la Empresa familiar?"
[6] "6. La actividad principal de la Empresa familiar es:"  

The iteration code is:

for (nm in names(basefinal)) 
{
    ggplot(basefinal, aes_string(parse(nm))) + 
       geom_bar(fill="sienna1",aes(y = (..count..)/sum(..count..))) + 
       theme_classic() + 
       labs(y = "Porcentaje de empresas (%)", x= 'Rol dentro de la empresa') + 
       scale_y_continuous(labels = scales::percent) + 
       coord_flip() + 
       ggtitle(nm)
}

It shows this error:

cannot open file '1. Nombre de la empresa': No such file or directory
Error in file(filename, "r") : cannot open the connection

Upvotes: 1

Views: 198

Answers (1)

r2evans

Reputation: 160447

Here are some suggestions for what I think you're doing here.

First, I don't have your data, so I'll make some up. My q1 is similar to your 1. Nombre de la empresa.

set.seed(42)
basefinal <- data.frame(
  q1 = sample(letters[1:3], size=100, replace=TRUE),
  q2 = sample(letters[1:3], size=100, replace=TRUE),
  q3 = sample(letters[1:3], size=100, replace=TRUE))
head(basefinal)
#   q1 q2 q3
# 1  c  b  c
# 2  c  a  b
# 3  a  a  c
# 4  c  b  b
# 5  b  c  a
# 6  b  c  b

My first thought is that you don't need to parse things:

for (nm in names(basefinal)) {
  ggplot(basefinal, aes_string(nm)) +
    geom_bar(fill="sienna1", aes(y = (..count..)/sum(..count..))) +
    theme_classic() +
    labs(y = "Porcentaje de empresas (%)", x= 'Rol dentro de la empresa') +
    scale_y_continuous(labels = scales::percent) +
    coord_flip() +
    ggtitle(nm)
}
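
As for the original error: the first argument of parse() is file, so parse(nm) treats the column name as the path of a file to read R code from, which is where "cannot open file" comes from. A minimal base-R sketch of the difference (the name no_such_file_q1 is made up for illustration):

```r
nm <- "no_such_file_q1"  # hypothetical name; no file by this name is assumed to exist

# parse(nm) means parse(file = nm): it tries to open a file and fails
msg <- tryCatch(
  suppressWarnings(parse(nm)),
  error = function(e) conditionMessage(e)
)
msg

# parse(text = ...) is the variant that parses a string as R code
ex <- parse(text = nm)
class(ex)

# A name containing spaces is not valid R code by itself, so parsing it fails too
bad <- tryCatch(
  parse(text = "1. Nombre de la empresa"),
  error = function(e) "parse error"
)
bad
```

This is also why aes_string(nm) only works for syntactic names such as q1. For names like the ones in the question, the string would need backticks around it, or the column can be looked up without any parsing, for example with .data[[nm]] inside aes().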

One "problem" with that loop is the way ggplot plots work. With base graphics, everything works by side effect, altering the current "dev" (graphics device). In contrast, a ggplot call does nothing to the current device until the resulting object is printed. By design, that happens in one of these ways:

  1. Call ggplot(...) + ... directly on the console. This returns an object of class c("gg", "ggplot"), and the console finds an S3 method to print it. In this case, it finds ggplot2:::print.ggplot.
  2. Store it in a variable, perhaps gg <- ggplot(..) + ..., and then dump this object on the console (type gg at the > prompt). This finds the same print method as in number 1.
  3. Explicitly print it with print(gg) (which finds ggplot2:::print.ggplot, even if you don't know to look for it or where to find it ... it is not exported).
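
The same auto-printing rule applies to any R value, not just plots, so it can be checked without ggplot2. A small base-R sketch (the variable names are illustrative) that captures what would reach the console in each case:

```r
# At top level a visible value is auto-printed
top_level <- capture.output(1 + 1)

# Inside a for loop the values are computed and then dropped
in_loop <- capture.output(for (i in 1:3) i * 2)

# An explicit print() inside the loop does produce output
printed <- capture.output(for (i in 1:3) print(i * 2))

# The loop itself evaluates to NULL
loop_value <- for (i in 1:3) i * 2

length(top_level)
length(in_loop)
length(printed)
is.null(loop_value)
```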

So in the for loop, you create a series of ggplot objects but do nothing with them, so they are silently discarded. Further, for loops never return anything (other than NULL), regardless of what is in the loop. So if you want to present all of these plots in rapid succession, you should capture the object and print it.

for (nm in names(basefinal)) {
  gg <- ggplot(basefinal, aes_string(nm)) +
    geom_bar(fill="sienna1", aes(y = (..count..)/sum(..count..))) +
    theme_classic() +
    labs(y = "Porcentaje de empresas (%)", x= 'Rol dentro de la empresa') +
    scale_y_continuous(labels = scales::percent) +
    coord_flip() +
    ggtitle(nm)
  print(gg)
}
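
One caveat: aes_string() parses its argument as R code, so it handles q1 but not names with spaces such as "1. Nombre de la empresa", and it is deprecated in recent ggplot2 releases. Here is a sketch of the same loop using the .data pronoun and after_stat(), assuming ggplot2 3.3.0 or later. The survey data frame and its values are made up:

```r
library(ggplot2)

# Hypothetical data with non-syntactic column names, like those in the question
survey <- data.frame(
  a = c("x", "y", "x", "z"),
  b = c("m", "m", "n", "n")
)
names(survey) <- c("1. Nombre de la empresa", "2. Municipio")

for (nm in names(survey)) {
  gg <- ggplot(survey, aes(x = .data[[nm]])) +   # column looked up by name, no parsing
    geom_bar(fill = "sienna1",
             aes(y = after_stat(count / sum(count)))) +
    theme_classic() +
    labs(y = "Porcentaje de empresas (%)", x = NULL) +
    scale_y_continuous(labels = scales::percent) +
    coord_flip() +
    ggtitle(nm)
  print(gg)   # explicit print, since nothing is auto-printed inside a loop
}
```

The x label is set to NULL here because a single fixed label such as 'Rol dentro de la empresa' would only fit one of the questions; the title already carries the column name.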

This does not really let you pause and look at any of them other than the last plot (since in R and RStudio, the plot window normally stays up). If you want to be able to look at them individually, you might want to capture all of the ggplot objects in a list.

ggs <- list()
for (nm in names(basefinal)) {
  gg <- ggplot(basefinal, aes_string(nm)) +
    geom_bar(fill="sienna1", aes(y = (..count..)/sum(..count..))) +
    theme_classic() +
    labs(y = "Porcentaje de empresas (%)", x= 'Rol dentro de la empresa') +
    scale_y_continuous(labels = scales::percent) +
    coord_flip() +
    ggtitle(nm)
  ggs[[nm]] <- gg
}
print(ggs[[1]]) # or print(ggs[[ names(basefinal)[3] ]])

[Image: first ggplot]

I'm a fan of R's apply functions, however, and this is more compactly coded as:

ggs <- sapply(names(basefinal), function(nm) {
  ggplot(basefinal, aes_string(nm)) +
    geom_bar(fill="sienna1", aes(y = (..count..)/sum(..count..))) +
    theme_classic() +
    labs(y = "Porcentaje de empresas (%)", x= 'Rol dentro de la empresa') +
    scale_y_continuous(labels = scales::percent) +
    coord_flip() +
    ggtitle(nm)
}, simplify = FALSE)

(Same ability to now view any of the plots in ggs as desired.)
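
If the goal is to flip through the plots one at a time, one option is to print the stored list into a multi-page PDF. This is a sketch under assumptions: the toy data and the temporary output path are placeholders, .data[[nm]] stands in for aes_string(nm), and pdf() and dev.off() come from base R's grDevices:

```r
library(ggplot2)

set.seed(42)
toy <- data.frame(
  q1 = sample(letters[1:3], size = 30, replace = TRUE),
  q2 = sample(letters[1:3], size = 30, replace = TRUE)
)

# One plot per column, kept in a named list
ggs <- sapply(names(toy), function(nm) {
  ggplot(toy, aes(x = .data[[nm]])) +
    geom_bar(fill = "sienna1") +
    ggtitle(nm)
}, simplify = FALSE)

out_file <- tempfile(fileext = ".pdf")  # placeholder path
pdf(out_file, width = 6, height = 4)
for (gg in ggs) print(gg)               # each print() starts a new page
dev.off()

file.exists(out_file)
```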


But wait, there's more.

I think data like this might benefit from faceting, one thing that ggplot2 makes somewhat easy. Unfortunately, your data is in a "wide" layout and ggplot2 prefers "long" formats. Let's reshape this:

tidyr::pivot_longer(basefinal, everything())
# # A tibble: 300 x 2
#    name  value
#    <chr> <fct>
#  1 q1    c    
#  2 q2    b    
#  3 q3    c    
#  4 q1    c    
#  5 q2    a    
#  6 q3    b    
#  7 q1    a    
#  8 q2    a    
#  9 q3    c    
# 10 q1    c    
# # ... with 290 more rows
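
For readers without tidyr, a rough base-R equivalent is stack(). This is a sketch, not what the rest of this answer uses. Note the differences: stack() names its columns values and ind rather than value and name, and the rows come back column by column rather than interleaved:

```r
set.seed(42)
basefinal <- data.frame(
  q1 = sample(letters[1:3], size = 100, replace = TRUE),
  q2 = sample(letters[1:3], size = 100, replace = TRUE),
  q3 = sample(letters[1:3], size = 100, replace = TRUE),
  stringsAsFactors = FALSE  # stack() needs plain vectors, not factors
)

long <- stack(basefinal)            # columns: values, ind
names(long) <- c("value", "name")   # rename to match pivot_longer()'s defaults
long <- long[, c("name", "value")]

dim(long)   # 300 rows, 2 columns
head(long, 3)
```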

We can now combine all questions into one plot. This can be arranged either as a grid (various columns) or just wrapping. Examples:

ggplot(tidyr::pivot_longer(basefinal, everything()),
       aes(value)) +
  geom_bar(fill="sienna1", aes(y = (..count..)/sum(..count..))) +
  theme_classic() +
  labs(y = "Porcentaje de empresas (%)", x= 'Rol dentro de la empresa') +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  facet_grid(~ name)

[Image: ggplot facet grid 1]

... + facet_grid(name ~ .)

[Image: ggplot facet grid 2]

... + facet_wrap("name", ncol = 2)

[Image: ggplot wrap]

No loop required, and everything is plotted on the same scale. (They can have individual scales if you'd like, but sometimes it can be best -- less-biasing -- to show them with the same axes.)
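
One hedged note on the faceted version: as far as I can tell, sum(..count..) is evaluated over the whole layer rather than per panel, so the percentages in the faceted plots are shares of all answers across all questions, not shares within each question. Below is a sketch of per-panel percentages, assuming ggplot2 3.3.0 or later. The long data is rebuilt in base R so the block stands alone, and PANEL is the internal column ggplot2 adds to the computed data:

```r
library(ggplot2)

set.seed(42)
basefinal <- data.frame(
  q1 = sample(letters[1:3], size = 100, replace = TRUE),
  q2 = sample(letters[1:3], size = 100, replace = TRUE),
  q3 = sample(letters[1:3], size = 100, replace = TRUE),
  stringsAsFactors = FALSE
)

# Long layout: one row per (question, answer) pair
long <- data.frame(
  name  = rep(names(basefinal), each = nrow(basefinal)),
  value = unlist(basefinal, use.names = FALSE)
)

p <- ggplot(long, aes(value)) +
  geom_bar(
    fill = "sienna1",
    # divide each count by the total of its own panel
    aes(y = after_stat(count / ave(count, PANEL, FUN = sum)))
  ) +
  theme_classic() +
  labs(y = "Porcentaje de empresas (%)", x = NULL) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  facet_wrap("name", ncol = 2)
print(p)
```

For individual scales, facet_wrap() and facet_grid() both take a scales argument (for example scales = "free_y"); how that interacts with coord_flip() may depend on the ggplot2 version, so it is left out of the sketch.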

Upvotes: 2
