Reputation: 725
I am trying to translate this basic for loop using the purrr package. The idea is to apply a function using data frame elements as parameters.
Creating the data frame to loop over, using the mpg dataset from ggplot2:
param <- mpg %>% select(manufacturer, year) %>% distinct() %>% rename(man = manufacturer, y = year)
The function to apply:
fcn <- function(man, y) {
  df <- mpg %>% filter(manufacturer == man & year == y)
  mod <- lm(data = df, cty ~ hwy)
  out <- summary(mod)
  return(out)
}
And the loop to apply fcn for each man and y combination:
for (i in 1:nrow(param)) {
  fcn(man = param$man[i],
      y = param$y[i])
}
I am very new to purrr and struggle to understand how the general specification of purrr::map works. Thanks a lot.
EDIT:
I used a very basic example with fcn and param here to understand how to include function parameters (from param) inside the map specification. As a result, I was not particularly interested in nesting beforehand, only in a plain translation of the loop using map that would work for any kind of function with multiple parameters.
Upvotes: 1
Views: 827
Reputation: 725
The following post helped me to achieve the desired outcome, general enough to be applied in many situations and ignoring nesting: https://stackoverflow.com/a/52309113/10580543.
Using pmap:
output <- param %>% pmap(~fcn(.x, .y))
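As a hedged sketch of two equivalent spellings: because the columns of param (man, y) have the same names as the arguments of fcn, pmap() can be given the function directly, and map2() covers the two-argument case. The object names out_pmap and out_map2 are only illustrative, and param is rebuilt here with distinct() so the block runs on its own.

```r
library(dplyr)
library(purrr)

mpg <- ggplot2::mpg

# Same parameter table as in the question: one row per manufacturer/year pair
param <- mpg %>%
  distinct(manufacturer, year) %>%
  rename(man = manufacturer, y = year)

# Same function as in the question
fcn <- function(man, y) {
  df <- mpg %>% filter(manufacturer == man & year == y)
  summary(lm(cty ~ hwy, data = df))
}

# pmap() passes each row of param to fcn, matching columns to arguments by name
out_pmap <- pmap(param, fcn)

# map2() iterates over exactly two vectors in parallel; .x is man, .y is y
out_map2 <- map2(param$man, param$y, ~ fcn(.x, .y))

# Optional: label each result so it can be looked up by combination
names(out_pmap) <- paste(param$man, param$y)
```

Both calls return a list with one summary per row of param. For functions with more than two parameters, the pmap() form extends naturally as long as the column names match the argument names.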
Upvotes: 0
Reputation: 46898
You can nest the data within manufacturer and year first, then map a function over the nested data. Below, instead of a named function, I use a formula with .x, which stands for each element of the list you map over. You can also use tidy() from broom to put the summary() result into a data.frame:
library(purrr)
library(tidyr)
library(dplyr)
library(broom)
mpg = ggplot2::mpg
result = mpg %>%
  select(manufacturer, year, cty, hwy) %>%
  nest(data = c(cty, hwy)) %>%
  mutate(
    model = map(data, ~lm(cty ~ hwy, data = .x)),
    summary = map(model, ~tidy(summary(.x)))
  )
# A tibble: 30 x 5
  manufacturer  year data              model  summary
  <chr>        <int> <list>            <list> <list>
1 audi          1999 <tibble [9 × 2]>  <lm>   <tibble [2 × 5]>
2 audi          2008 <tibble [9 × 2]>  <lm>   <tibble [2 × 5]>
3 chevrolet     2008 <tibble [12 × 2]> <lm>   <tibble [2 × 5]>
4 chevrolet     1999 <tibble [7 × 2]>  <lm>   <tibble [2 × 5]>
5 dodge         1999 <tibble [16 × 2]> <lm>   <tibble [2 × 5]>
6 dodge         2008 <tibble [21 × 2]> <lm>   <tibble [2 × 5]>
If you want to look at the results of summary:
result %>% unnest(summary)
# A tibble: 55 x 9
  manufacturer  year data    model  term   estimate std.error statistic p.value
  <chr>        <int> <list>  <list> <chr>     <dbl>     <dbl>     <dbl>   <dbl>
1 audi          1999 <tibbl… <lm>   (Inte…    -5.85      6.15    -0.951 3.73e-1
2 audi          1999 <tibbl… <lm>   hwy       0.879     0.235      3.74 7.27e-3
3 audi          2008 <tibbl… <lm>   (Inte…     -0.5      3.68    -0.136 8.96e-1
4 audi          2008 <tibbl… <lm>   hwy       0.695     0.137      5.08 1.43e-3
Upvotes: 1
Reputation: 33
If I have understood correctly, you want to model cty based on hwy for each year and manufacturer combination.
library(tidyverse)
library(ggplot2)
library(purrr)
I have changed the definition of your function so that it takes a data frame, which is what map passes to it for each group.
fcn <- function(df) {
  mod <- lm(data = df, cty ~ hwy)
  return(summary(mod))
}
The code below should produce the summary of the model for each year and manufacturer:
mpg %>%
  group_by(manufacturer, year) %>%
  nest() %>%
  mutate(model = map(data, fcn))
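Once the summaries sit in a list column, individual values can be pulled out with a typed map. The following is a minimal sketch, assuming you want R-squared per group; the names models and r_squared are illustrative and not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same one-argument function as above: fit the model on one group's data
fcn <- function(df) summary(lm(cty ~ hwy, data = df))

models <- ggplot2::mpg %>%
  group_by(manufacturer, year) %>%
  nest() %>%
  mutate(
    model = map(data, fcn),
    # map_dbl() returns a plain numeric column instead of a list
    r_squared = map_dbl(model, ~ .x$r.squared)
  ) %>%
  ungroup()
```

Groups with very few observations can yield NaN for r_squared, so it may be worth checking group sizes before interpreting the values.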
Upvotes: 2