nicholas

Reputation: 983

Iterating over listed data frames within a piped purrr anonymous function call

Using purrr::map and the magrittr pipe, I am trying to generate a new column whose values are a substring of an existing column's name.

I can illustrate what I'm trying to do with the following toy dataset:

library(tidyverse)
library(purrr)

test <- list(tibble(geoid_1970 = c(123, 456), 
                    name_1970 = c("here", "there"), 
                    pop_1970 = c(1, 2)),
             tibble(geoid_1980 = c(234, 567), 
                    name_1980 = c("here", "there"), 
                    pop_1970 = c(3, 4))
)

Within each listed data frame, I want a column equal to the relevant year. Without iterating, the code I have is:

data <- map(test, ~ .x %>% mutate(year = as.integer(str_sub(names(test[[1]][1]), -4))))

Of course, this returns a year of 1970 in both listed data frames, which I don't want. (I want 1970 in the first and 1980 in the second.)

In addition, it's not piped, and my attempt to pipe it throws an error:

data <- test %>% map(~ .x %>% mutate(year = as.integer(str_sub(names(.x[[1]][1]), -4))))
# > Error: Problem with `mutate()` input `year`.
# > x Input `year` can't be recycled to size 2.
# > ℹ Input `year` is `as.integer(str_sub(names(.x[[1]][1]), -4))`.
# > ℹ Input `year` must be size 2 or 1, not 0.

How can I iterate over each listed data frame using the pipe?

Upvotes: 1

Views: 76

Answers (2)

Waldi

Reputation: 41210

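The piped attempt fails because .x[[1]][1] extracts the first column as a bare vector and then takes its first element, which has no names, so names() returns NULL and year ends up with size 0. Index with single brackets instead, so that the column name is kept: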

test %>% map(~.x %>% mutate(year = as.integer(str_sub(names(.x[1]), -4))))

[[1]]
# A tibble: 2 x 4
  geoid_1970 name_1970 pop_1970  year
       <dbl> <chr>        <dbl> <int>
1        123 here             1  1970
2        456 there            2  1970

[[2]]
# A tibble: 2 x 4
  geoid_1980 name_1980 pop_1970  year
       <dbl> <chr>        <dbl> <int>
1        234 here             3  1980
2        567 there            4  1980
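The difference between the two indexing forms can be checked on a single data frame. This is a minimal sketch using the first tibble from the question; the names df, bad, good, and year are introduced here for illustration only.

```r
library(tibble)
library(stringr)

# First data frame from the question's toy list
df <- tibble(geoid_1970 = c(123, 456),
             name_1970 = c("here", "there"),
             pop_1970 = c(1, 2))

# [[1]] pulls the first column out as a plain numeric vector,
# so the element selected by [1] carries no names
bad <- names(df[[1]][1])

# [1] keeps a one-column tibble, so the column name survives
good <- names(df[1])

# The last four characters of the surviving name give the year
year <- as.integer(str_sub(good, -4))
```

Here bad is NULL while good is "geoid_1970", which is why only the single-bracket form gives mutate() a value it can recycle.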

Upvotes: 3

akrun

Reputation: 886948

We can get the 'year' with readr::parse_number, which pulls the first number out of the column name:

library(dplyr)
library(purrr)
map(test, ~ .x %>%
      mutate(year = readr::parse_number(names(.)[1])))

Output:

#[[1]]
# A tibble: 2 x 4
#  geoid_1970 name_1970 pop_1970  year
#       <dbl> <chr>        <dbl> <dbl>
#1        123 here             1  1970
#2        456 there            2  1970

#[[2]]
# A tibble: 2 x 4
#  geoid_1980 name_1980 pop_1970  year
#       <dbl> <chr>        <dbl> <dbl>
#1        234 here             3  1980
#2        567 there            4  1980
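Note that year comes back as <dbl> here, whereas the question converts to integer. If an integer column is wanted, one option is to wrap the call in as.integer and start the chain from test with the pipe, as the question asked. This is a sketch under that assumption, not the only way to do it; data is just the name the question used for the result.

```r
library(dplyr)
library(purrr)
library(readr)
library(tibble)

# Toy list from the question
test <- list(tibble(geoid_1970 = c(123, 456),
                    name_1970 = c("here", "there"),
                    pop_1970 = c(1, 2)),
             tibble(geoid_1980 = c(234, 567),
                    name_1980 = c("here", "there"),
                    pop_1970 = c(3, 4)))

# For each data frame, parse the number in the first column's name
# and store it as an integer; mutate() recycles it to every row
data <- test %>%
  map(~ mutate(.x, year = as.integer(parse_number(names(.x)[1]))))
```

Each element of data then carries an integer year column taken from its own first column name.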

Upvotes: 3
