Archymedes

Reputation: 441

How to combine jobs to avoid nested lapply

I have a data frame on which I would like to perform multiple operations. Here is an example that creates a list of plots:

library(tidyverse)

plot_fun = function(data, geom) {

  plot = ggplot(data, aes(x = factor(0), y = Sepal.Length))

  if (geom == 'bar') {
    plot = plot + geom_col()
  } else if (geom == 'box') {
    plot = plot + geom_boxplot()
  }

  plot +
    labs(x = unique(data$Species)) +
    theme_bw() +
    theme(axis.text.x = element_blank())

}

As you can see, this function takes a data frame and draws one of two types of plot, depending on the geom parameter.

In my real problem, I have to split the data frame by one or more factors and then do the job on each piece. Don't worry about the specifics of this example (I know I could put iris$Species on the x-axis).

iris_ls = split(iris, iris$Species)
geom_ls = c('bar', 'box')

lapply(geom_ls, function(g) {
  lapply(iris_ls, function(x) {
    plot_fun(x, g)
  })
})

My problem is that if I want to create both types of plots, I have to write a nested lapply (which performs badly when parallelized).

So my question is: how can I avoid the nested lapply? Should I build a single list whose length is the length of iris_ls multiplied by the length of the geom_ls vector? I do not know how to approach this. Imagine I have several geom-like parameters in my function.

PS: Using drop = TRUE in split() does not drop the unused factor levels in each element of the list, and I don't know if that is the correct way to do it. I have to use another lapply to drop them.

Upvotes: 1

Views: 233

Answers (1)

Waldi

Reputation: 41260

Use the purrr package:

cross_ls  <- purrr::cross(list(iris = split(iris, iris$Species),
                               geom = list('bar', 'box')))

cross_ls %>% purrr::map(~{plot_fun(.x$iris,.x$geom)})

or in its parallel version:

library(furrr)
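# multisession replaces the multiprocess strategy, which is deprecated in recent versions of future
plan(multisession)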

cross_ls %>% furrr::future_map(~{plot_fun(.x$iris,.x$geom)})
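Note that purrr::cross() is deprecated as of purrr 1.0.0, so a base R version of the same idea may be useful. The sketch below builds one row per (data piece, geom) combination with expand.grid() and then runs a single flat Map() over those rows. job_fun is a hypothetical stand-in for plot_fun that returns a label instead of a plot, so the example runs without ggplot2; the names grid and res are also just for illustration. It applies droplevels() to each piece as well, which is one way to handle the unused factor levels mentioned in the PS.

```r
# Hypothetical stand-in for plot_fun(): returns a label instead of a plot
job_fun <- function(data, geom) {
  paste(as.character(unique(data$Species)), geom, nrow(data), sep = ":")
}

# Split by Species, then drop the unused factor levels in each piece
iris_ls <- lapply(split(iris, iris$Species), droplevels)
geom_ls <- c("bar", "box")

# One row per (data piece, geom) combination; more parameters = more columns
grid <- expand.grid(species = names(iris_ls), geom = geom_ls,
                    stringsAsFactors = FALSE)

# A single flat loop over all combinations instead of a nested lapply
res <- Map(function(s, g) job_fun(iris_ls[[s]], g), grid$species, grid$geom)
names(res) <- paste(grid$species, grid$geom, sep = "_")
```

With plot_fun in place of job_fun, res would be a flat list of six plots named like setosa_bar. Because there is only one loop, the Map() call is the only thing that would need to be swapped for a parallel mapper such as furrr::future_map2().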

Upvotes: 3
