tom

Reputation: 725

Translating a 'for loop' to 'purrr::map'

I am trying to translate this basic for loop using the purrr package. The idea is to apply a function using the elements of a data frame as parameters.

Creating the data frame to loop on using the mpg dataset from ggplot2:

library(tidyverse)  # attaches dplyr and ggplot2 (the source of mpg)
param <- mpg %>% select(manufacturer, year) %>% distinct() %>% rename(man = manufacturer, y = year)

The function to apply:

fcn <- function(man, y) {
    df <- mpg %>% filter(manufacturer == man & year == y)
    mod <- lm(data = df, cty ~ hwy)
    out <- summary(mod)
    return(out)
}

And the loop to apply fcn to each man and y combination:

for (i in 1:nrow(param)) {
    fcn(man = param$man[i], 
        y = param$y[i])
}

I am very new to purrr and struggle to understand how the general specification of purrr::map works. Thanks a lot.

EDIT: I used a very basic example here, with fcn and param, to understand how to pass function parameters (from param) inside the map call. As a result, I was not particularly interested in nesting beforehand, only in a plain translation of the loop using map that could work for any kind of function with multiple parameters.

Upvotes: 1

Views: 827

Answers (3)

tom

Reputation: 725

The following post helped me achieve the desired outcome. The approach is general enough to apply in many situations and does not require nesting: https://stackoverflow.com/a/52309113/10580543.

Using pmap:

output <- param %>% pmap(~fcn(.x, .y)) 
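
Since the columns of param are named man and y, the same as the arguments of fcn, pmap can probably also match them by name without a lambda. The following is a minimal sketch, assuming dplyr, purrr and ggplot2 are installed; output_by_name and output_map2 are names introduced here for illustration only:

```r
library(dplyr)
library(purrr)

mpg <- ggplot2::mpg

# Parameter grid from the question: one row per manufacturer/year pair.
param <- mpg %>%
  distinct(manufacturer, year) %>%
  rename(man = manufacturer, y = year)

# Function from the question: lm summary for one manufacturer/year pair.
fcn <- function(man, y) {
  df <- mpg %>% filter(manufacturer == man & year == y)
  summary(lm(cty ~ hwy, data = df))
}

# pmap() passes each row's columns as named arguments,
# so man and y are matched to fcn's parameters by name.
output_by_name <- pmap(param, fcn)

# With exactly two parameters, map2() is an equivalent alternative.
output_map2 <- map2(param$man, param$y, fcn)
```

Matching by name only works when every column of param corresponds to an argument of the function; with extra columns, the lambda form above (or selecting the relevant columns first) is the safer choice.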

Upvotes: 0

StupidWolf

Reputation: 46898

You can nest the data by manufacturer and year first, then map a function over the nested column. Below, instead of a named function, I use a formula with .x, which stands for each element of the list you map over. You can also use tidy() from broom to turn the summary() result into a data frame:

library(purrr)
library(tidyr)
library(dplyr)
library(broom)

mpg <- ggplot2::mpg

result <- mpg %>%
  select(manufacturer, year, cty, hwy) %>%
  nest(data = c(cty, hwy)) %>%
  mutate(
    model = map(data, ~ lm(cty ~ hwy, data = .x)),
    summary = map(model, ~ tidy(summary(.x)))
  )

# A tibble: 30 x 5
   manufacturer  year data              model  summary         
   <chr>        <int> <list>            <list> <list>          
 1 audi          1999 <tibble [9 × 2]>  <lm>   <tibble [2 × 5]>
 2 audi          2008 <tibble [9 × 2]>  <lm>   <tibble [2 × 5]>
 3 chevrolet     2008 <tibble [12 × 2]> <lm>   <tibble [2 × 5]>
 4 chevrolet     1999 <tibble [7 × 2]>  <lm>   <tibble [2 × 5]>
 5 dodge         1999 <tibble [16 × 2]> <lm>   <tibble [2 × 5]>
 6 dodge         2008 <tibble [21 × 2]> <lm>   <tibble [2 × 5]>

If you want to look at the results of summary:

 result %>% unnest(summary)
# A tibble: 55 x 9
   manufacturer  year data    model  term   estimate std.error statistic p.value
   <chr>        <int> <list>  <list> <chr>     <dbl>     <dbl>     <dbl>   <dbl>
 1 audi          1999 <tibbl… <lm>   (Inte…   -5.85     6.15      -0.951 3.73e-1
 2 audi          1999 <tibbl… <lm>   hwy       0.879    0.235      3.74  7.27e-3
 3 audi          2008 <tibbl… <lm>   (Inte…   -0.5      3.68      -0.136 8.96e-1
 4 audi          2008 <tibbl… <lm>   hwy       0.695    0.137      5.08  1.43e-3

Upvotes: 1

Zanboor

Reputation: 33

If I have understood correctly, you want to model cty as a function of hwy for each combination of year and manufacturer.

library(tidyverse)
library(ggplot2)
library(purrr)

I have changed the definition of your function so that it takes a data frame, which is what map passes to it for each nested group.

fcn <- function(df){
  mod <- lm(data = df, cty ~ hwy)
  return(summary(mod))
}

The code below should produce the summary of the model for each year and manufacturer:

mpg %>% group_by(manufacturer, year) %>%
  nest() %>% mutate(model = map(data, fcn))
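
To pull a single number out of each stored summary, a typed map such as map_dbl can be chained on. This is only a sketch, assuming the same fcn as above and that dplyr, tidyr, purrr and ggplot2 are installed; the column name r_squared and the object name result are introduced here for illustration:

```r
library(dplyr)
library(tidyr)
library(purrr)

mpg <- ggplot2::mpg

# Same function as above: takes a data frame, returns the lm summary.
fcn <- function(df) {
  summary(lm(cty ~ hwy, data = df))
}

result <- mpg %>%
  group_by(manufacturer, year) %>%
  nest() %>%
  mutate(
    model = map(data, fcn),
    # Extract the "r.squared" element of each summary as a double.
    r_squared = map_dbl(model, "r.squared")
  ) %>%
  ungroup()
```

Groups with very few rows or a constant hwy may give NaN or uninformative values, so the extracted column is worth inspecting before use.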

Upvotes: 2
