Andres Gonzalez

Reputation: 329

How to loop and filter over groups

I'm having trouble writing a loop that filters by group and then saves one histogram per group. My code looks like this (the variable 'Periodo' goes from 2020 to 2029, so I should end up with one graph per year):

for (i in 1:length(data_tot$Periodo)) {
  data_tot %>%
    filter(Periodo == Periodo[i]) %>%
   ggplot(data_tot, mapping =  aes(x=b1)) + 
    geom_histogram(fill="steelblue", aes(y=..density.., alpha=..count..), bins=60) + 
    labs(title="Beneficios", subtitle="") + ylab("Densidad") + 
    xlab("Beneficios ($millones)") + 
    geom_vline(aes(xintercept=mean(b1)), color="red4",linetype="dashed") +
    theme(legend.position = "none") + annotate("text", x= mean(data_tot$b1), y=0, 
                                               label=round(mean(data_tot$b1), digits = 2), 
                                               colour="red4", size=3.5, vjust=-1.5, hjust=-0.5) -> g
  ggsave(g, file=paste0(i,"_histogram.png"))
  }

The code runs, but I'm getting a seemingly endless number of graphs; I think R understands that I want one graph per row. Does anyone know what I am doing wrong? Thanks in advance!

Upvotes: 1

Views: 366

Answers (1)

Ronak Shah

Reputation: 389135

1:length(data_tot$Periodo) loops over every row of the data, so you get one plot per row instead of one per period. What you want instead is to loop over every unique Periodo value. You can rewrite the loop in one chain using split:

library(tidyverse)

data_tot %>%
  split(.$Periodo) %>%
  imap(~{
    ggplot(.x, aes(x=b1)) + 
      geom_histogram(fill="steelblue", aes(y=..density.., alpha=..count..), bins=60) + 
      labs(title="Beneficios", subtitle="") + ylab("Densidad") + 
      xlab("Beneficios ($millones)") + 
      geom_vline(aes(xintercept=mean(b1)), color="red4",linetype="dashed") +
      theme(legend.position = "none") + 
      annotate("text", x= mean(.x$b1), y=0, label=round(mean(.x$b1), digits = 2), 
              colour="red4", size=3.5, vjust=-1.5, hjust=-0.5) -> g
    # .y is the name of the current list element, i.e. the Periodo value
    ggsave(g, file=paste0(.y,"_histogram.png"))
  })
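To make it clearer what split() hands to each iteration, here is a minimal base R sketch. The data frame `toy` and its values are made up for illustration and are not the asker's data. split() returns a named list with one data frame per Periodo value; imap() then passes each piece as `.x` and its name as `.y`.

```r
# Hypothetical toy data: 3 periods with 4 rows each
toy <- data.frame(
  Periodo = rep(2020:2022, each = 4),
  b1 = c(1, 2, 3, 4, 10, 20, 30, 40, 100, 200, 300, 400)
)

# One data frame per unique Periodo, named after the Periodo value
groups <- split(toy, toy$Periodo)

# The list names are what imap() exposes as .y
file_names <- paste0(names(groups), "_histogram.png")

# Each list element is what imap() exposes as .x
group_means <- vapply(groups, function(piece) mean(piece$b1), numeric(1))

print(file_names)
print(group_means)
```

Because the list has one element per period, the number of saved files equals the number of distinct Periodo values, not the number of rows.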

Alternatively, you can fix your for loop like this:

for (i in unique(data_tot$Periodo)) {
  tmp <- data_tot %>% filter(Periodo == i)
  ggplot(tmp, aes(x=b1)) + 
    geom_histogram(fill="steelblue", aes(y=..density.., alpha=..count..), bins=60) + 
    labs(title="Beneficios", subtitle="") + ylab("Densidad") + 
    xlab("Beneficios ($millones)") + 
    geom_vline(aes(xintercept=mean(b1)), color="red4",linetype="dashed") +
    theme(legend.position = "none") + 
    annotate("text", x= mean(tmp$b1), y=0, label=round(mean(tmp$b1), digits = 2), 
             colour="red4", size=3.5, vjust=-1.5, hjust=-0.5) -> g
  ggsave(g, file=paste0(i,"_histogram.png"))
}
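As a quick check of the diagnosis, the following base R sketch compares how many iterations each loop header produces. It uses a made-up data frame `toy`, which is an assumption for illustration only.

```r
# Hypothetical toy data: 3 periods with 4 rows each
toy <- data.frame(
  Periodo = rep(2020:2022, each = 4),
  b1 = seq_len(12)
)

# Original loop header: one iteration per row
row_iterations <- length(1:length(toy$Periodo))

# Corrected loop header: one iteration per distinct period
group_iterations <- length(unique(toy$Periodo))

# In the original loop, Periodo[i] is the value in row i, so rows that
# share a period select the same subset again and again
subset_sizes <- vapply(
  seq_along(toy$Periodo),
  function(i) sum(toy$Periodo == toy$Periodo[i]),
  integer(1)
)

print(row_iterations)
print(group_iterations)
print(subset_sizes)
```

With real data the first number is the row count, which is why the original loop wrote one file per row, while the second is the number of histograms actually wanted.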

Upvotes: 3
