Mag

Reputation: 31

Using 'for loop' in R to plot multiple histograms - avoiding plotting the same graph multiple times

I have a data set named pairs that looks similar to this:

  id_density sp_code country  dens_ab ...
       <dbl> <fct>  <chr>     <dbl>       
1         15 LALO     US      24.0
2         16 LALO     US      32.0 
3         17 LALO     US      20.0 
4         18 LALO     US      30.0
5         19 LALO     US      17.5 
6         20 LALO     US      32.5 
...

I want to make a histogram for each of the 3 countries, laid out as small multiples.

I have 21 different values for the column sp_code, and I would like to use a loop to make 21 different small multiple histograms, one set of histograms per species.

I looked online for a solution and tried this code:

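# libraries
library(ggplot2)
library(dplyr)
library(viridis)     # provides scale_fill_viridis() / scale_color_viridis()
library(hrbrthemes)  # provides theme_ipsum()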

# group by country
pairs <- pairs %>% group_by(country)

dplyr::is_grouped_df(pairs)
# Grouped (TRUE)

# Loop
plot.country <- function(x = pairs) {
  sp_code <- unique(x$sp_code)
  for (i in seq_along(sp_code)) {
    plot <- x %>% 
      ggplot(aes(x = dens_ab, color = country, fill = country)) +
      geom_histogram(data = filter(x, sp_code == sp_code[i]), alpha = 0.6, binwidth = 0.5) +
      scale_fill_viridis(discrete = TRUE) +
      scale_color_viridis(discrete = TRUE) +
      theme_ipsum() +
      xlab("Density (pairs/km2)") +
      ylab("Frequency") +
      facet_wrap(~country)
    if (!dir.exists("output")) dir.create("output")
    ggsave(filename = paste0("output/", sp_code[i], "plot_site_dens.png"),
           plot = plot,
           width = 11, height = 8.5, units = "in")
    print(plot)
  }
}

# Plot function execution
plot.country()

I managed to get 21 small multiples in an output folder, but many of the histograms seem to show the same results and do not match the actual data distribution. The background also appears black to me, even though it is white when I don't use a loop.

[image: same data for 2 countries]

Does this have to do with the script I'm using (and can it be corrected)? Or does it have something to do with my data?

I am quite new to R and it's the first time I'm trying something with a 'for loop', so I don't really know how to fix this...

Upvotes: -1

Views: 53

Answers (1)

margusl

Reputation: 17514

One of the reasons behind the identical plots is

filter(x, sp_code == sp_code[i])

Here both occurrences of sp_code are data-variables, i.e. columns in x, because filter() looks names up in the data frame before it looks in the calling environment. So the expression basically reads: "keep rows in dataset x where the values in the x$sp_code column are equal to the i-th value of x$sp_code".

Assuming the included data example comes from head(pairs), this expression evaluates as filter(x, sp_code == "LALO") during at least the first 6 iterations, producing at least 6 identical plots.

You could work around this particular issue by explicitly pointing to data- and env-vars through the .data and .env pronouns:

filter(x, .data$sp_code == .env$sp_code[i])
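
To see the difference in isolation, here is a minimal sketch on a made-up tibble (toy and its values are hypothetical stand-ins, not the actual pairs data). It compares the masked filter with the one that uses the pronouns:

```r
library(dplyr)

# Hypothetical toy data standing in for `pairs`
toy <- tibble(
  sp_code = c("LALO", "LALO", "HOLA", "HOLA", "SNBU"),
  dens_ab = c(24, 32, 20, 30, 17.5)
)

# Vector in the calling environment with the same name as the column
sp_code <- unique(toy$sp_code)
i <- 2  # the intent is to keep the rows for the 2nd unique code

# Masked: sp_code[i] is taken from the column, i.e. toy$sp_code[2]
masked <- filter(toy, sp_code == sp_code[i])

# Explicit: left side is the column, right side is the environment vector
explicit <- filter(toy, .data$sp_code == .env$sp_code[i])

unique(masked$sp_code)    # code found in the 2nd row of the column
unique(explicit$sp_code)  # 2nd unique code
```

Giving the vector a name that does not match any column (codes, for instance) should avoid the clash as well, without needing the pronouns.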

On the other hand, a generic example that uses the ggplot2::mpg dataset, grouping and group_walk() instead of a loop might look something like this:

library(ggplot2)
library(dplyr)

glimpse(mpg)
#> Rows: 234
#> Columns: 11
#> $ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi", "…
#> $ model        <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro", "…
#> $ displ        <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0, 2.…
#> $ year         <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, 200…
#> $ cyl          <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, 8, …
#> $ trans        <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "auto…
#> $ drv          <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4", "4…
#> $ cty          <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17, 1…
#> $ hwy          <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25, 2…
#> $ fl           <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p…
#> $ class        <chr> "compact", "compact", "compact", "compact", "compact", "c…

# group_walk() calls this function with the current group's data and the current group key
# (a 1-row tibble with the grouping values); the key is used to generate the file name,
# which also works with multiple grouping variables
plot_grp_and_save <- function(data, grp_key, mapping = aes(x = cty), facets = ~drv, ext = ".png") {
  p <- 
    ggplot(data, mapping) +
    geom_histogram(binwidth = 5) +
    facet_wrap(facets)
  ggsave(paste0(grp_key, collapse = "_") |> paste0(ext), plot = p)
  print(p)
}

mpg |> 
  group_by(year) |> 
  # apply plot_grp_and_save() to each group
  group_walk(plot_grp_and_save)

Resulting files:

fs::dir_info(glob = "*.png")[,1:5]
#> # A tibble: 2 × 5
#>   path       type         size permissions modification_time  
#>   <fs::path> <fct> <fs::bytes> <fs::perms> <dttm>             
#> 1 1999.png   file          25K rw-         2025-01-25 13:24:10
#> 2 2008.png   file        26.4K rw-         2025-01-25 13:24:10

Created on 2025-01-25 with reprex v2.1.1
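
Adapted to the column names from the question (sp_code, country, dens_ab), the same pattern might look roughly like the sketch below. The pairs_demo data, the plot_species() helper and the tempdir() output folder are made up for illustration. theme_bw() stands in for theme_ipsum() only to keep the example free of extra packages; it draws an opaque white plot background, but whether the theme is what caused the black background in the question is a guess.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for `pairs`: 2 species x 3 countries
set.seed(1)
pairs_demo <- tibble(
  sp_code = rep(c("LALO", "HOLA"), each = 30),
  country = rep(rep(c("US", "CA", "MX"), each = 10), times = 2),
  dens_ab = round(runif(60, 10, 40), 1)
)

# Output folder (a temp dir here; the question used "output")
out_dir <- file.path(tempdir(), "output")
dir.create(out_dir, showWarnings = FALSE)

# Called once per species: `data` holds only that species' rows,
# `grp_key` is a 1-row tibble with the current sp_code
plot_species <- function(data, grp_key) {
  p <- ggplot(data, aes(x = dens_ab, colour = country, fill = country)) +
    geom_histogram(alpha = 0.6, binwidth = 0.5) +
    facet_wrap(~country) +
    labs(x = "Density (pairs/km2)", y = "Frequency") +
    theme_bw()
  ggsave(file.path(out_dir, paste0(grp_key$sp_code, "_plot_site_dens.png")),
         plot = p, width = 11, height = 8.5, units = "in")
}

pairs_demo |>
  group_by(sp_code) |>   # one file per species, faceted by country
  group_walk(plot_species)

list.files(out_dir)
```

Grouping by sp_code rather than country is the key change from the question: each call then receives one species, and facet_wrap(~country) produces the per-country panels.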

Upvotes: 2
