Reputation: 441
I have a data frame on which I would like to perform multiple operations. Here is an example to illustrate it, in this case creating a list of plots:
library(tidyverse)
plot_fun = function(data, geom) {
  plot = ggplot(data, aes(x = factor(0), y = Sepal.Length))
  if (geom == 'bar') {
    plot = plot + geom_col()
  } else if (geom == 'box') {
    plot = plot + geom_boxplot()
  }
  plot +
    labs(x = unique(data$Species)) +
    theme_bw() +
    theme(axis.text.x = element_blank())
}
As you can see, this function takes a data frame and draws one of two types of plot, depending on the geom parameter.
In my real problem, I have to split the data frame by one or more factors and then do the job on each piece. Never mind the specifics of this example (I know I could put iris$Species on the x-axis).
iris_ls = split(iris, iris$Species)
geom_ls = c('bar', 'box')
lapply(geom_ls, function(g) {
  lapply(iris_ls, function(x) {
    plot_fun(x, g)
  })
})
My problem is that if I want to create both types of plots, I have to write a nested lapply (which performs badly when parallelizing).
So my question is: how can I avoid the nested lapply? Should I multiply the length of iris_ls by the length of the geom_ls vector? I do not know how to approach this. Imagine I have several geom-like parameters in my function.
PS: Using drop = TRUE in the split function does not drop the unused factor levels within each element of the list, and I do not know whether that is the correct way to do it. I have to use another lapply for that.
Upvotes: 1
Views: 233
Reputation: 41260
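Use the purrr package:
# Build every (data subset, geom) combination as one flat list of 3 x 2 = 6 elements.
# Note: cross() is deprecated as of purrr 1.0.0.
cross_ls <- purrr::cross(list(iris = split(iris, iris$Species),
                              geom = list('bar', 'box')))
# A single map over the flat list replaces the nested lapply
cross_ls %>% purrr::map(~ plot_fun(.x$iris, .x$geom))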
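or, in its parallel version:
library(furrr)
# The multiprocess plan is deprecated in recent versions of future; multisession is used here instead
plan(multisession)
cross_ls %>% furrr::future_map(~ plot_fun(.x$iris, .x$geom))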
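The same flattening can also be sketched in base R, without cross(). This is only an illustration: it enumerates the combinations with expand.grid() and loops over them once. The names grid and plots are made up for the example, plot_fun is copied from the question so the block runs on its own, and droplevels() is one possible way to handle the PS about unused factor levels.

```r
library(ggplot2)

# plot_fun as defined in the question, repeated so the block is self-contained
plot_fun <- function(data, geom) {
  plot <- ggplot(data, aes(x = factor(0), y = Sepal.Length))
  if (geom == 'bar') {
    plot <- plot + geom_col()
  } else if (geom == 'box') {
    plot <- plot + geom_boxplot()
  }
  plot +
    labs(x = unique(data$Species)) +
    theme_bw() +
    theme(axis.text.x = element_blank())
}

# droplevels() removes the unused Species levels inside each piece
iris_ls <- lapply(split(iris, iris$Species), droplevels)
geom_ls <- c('bar', 'box')

# One row per (species, geom) combination: 3 x 2 = 6 rows
# (grid and plots are hypothetical names chosen for this sketch)
grid <- expand.grid(species = names(iris_ls), geom = geom_ls,
                    stringsAsFactors = FALSE)

# A single flat loop over the rows of the grid instead of a nested lapply
plots <- lapply(seq_len(nrow(grid)), function(i) {
  plot_fun(iris_ls[[grid$species[i]]], grid$geom[i])
})
names(plots) <- paste(grid$species, grid$geom, sep = '_')
```

With more geom-like parameters, each one would become another column in expand.grid(), and the loop itself would stay a single lapply. Because the loop is flat, it should also be possible to swap lapply for a parallel equivalent such as furrr::future_map.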
Upvotes: 3