ohnoplus

Reputation: 1325

Applying purrr::walk2() to a data.frame of data.frames at the end of a pipe

I have an R data frame with one list-column of data frames, each of which I want to write to a file:

df0 <- tibble(x = 1:3, y = rnorm(3))
df1 <- tibble(x = 1:3, y = rnorm(3))
df2 <- tibble(x = 1:3, y = rnorm(3))

animalFrames <- tibble(animals = c('sheep', 'cow', 'horse'),
                       frames = list(df0, df1, df2))

I could do this with a for loop:

for (i in seq_len(nrow(animalFrames))) {
    write.csv(animalFrames$frames[[i]], file = paste0('test_', animalFrames$animals[i], '.csv'))
}

Or with purrr's walk2 function:

walk2(animalFrames$animals, animalFrames$frames,
      ~write.csv(.y, file = paste0('test_', .x, '.csv')))

Is there some way I can put this walk function at the end of a magrittr pipe?

I was thinking something like:

animalFrames %>% do({walk2(.$animals, .$frames, ~write.csv(.y, file = paste0('test_', .x, '.csv')))})

But this gives me an error:

Error: Result must be a data frame, not character
Traceback:

1. animalFrames %>% do({
 .     walk2(.$animals, .$frames, ~write.csv(.y, file = paste0("test_", 
 .         .x, ".csv")))
 . })
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. eval(quote(`_fseq`(`_lhs`)), env, env)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
5. `_fseq`(`_lhs`)
6. freduce(value, `_function_list`)
7. withVisible(function_list[[k]](value))
8. function_list[[k]](value)
9. do(., {
 .     walk2(.$animals, .$frames, ~write.csv(.y, file = paste0("test_", 
 .         .x, ".csv")))
 . })
10. do.data.frame(., {
  .     walk2(.$animals, .$frames, ~write.csv(.y, file = paste0("test_", 
  .         .x, ".csv")))
  . })
11. bad("Result must be a data frame, not {fmt_classes(out)}")
12. glubort(NULL, ..., .envir = parent.frame())
13. .abort(text)

Presumably because write.csv() is returning data frames and do() doesn't handle those or something.

I don't strictly need to put walk at the end of a pipe (I can always work around pipes), but it seems like I am missing something basic, and this is bugging me. Any suggestions?

Upvotes: 4

Views: 1707

Answers (2)

Captain Hat

Reputation: 3237

Use purrr::pwalk()

Very similar to the fantastic answer given by Calum You, but shorter and (in my opinion) slightly more elegant.

pwalk() walks across many list elements in parallel. It's mostly used to do walk() on more than two vectors. But because a tibble is a named list of columns, we can pass the whole tibble to pwalk() and each column becomes an argument which is passed to .f for parallel evaluation.
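To make the argument matching concrete, here is a minimal sketch. The toy tibble and its column n are made up for illustration, and pmap_chr() (the value-returning sibling of pwalk()) is used only so the result is visible:

```r
library(purrr)
library(tibble)

# Hypothetical toy data, not the tibble from the question.
toy <- tibble(animals = c("sheep", "cow"), n = c(3L, 5L))

# Each row is turned into one call: the columns are passed as named
# arguments, so they are matched to the function's parameters by name.
labels <- pmap_chr(toy, function(animals, n) paste0(animals, "_", n))
labels
```

pwalk() matches arguments the same way; it just discards what .f returns and is called for its side effects.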

The shortest solution uses ~ notation, where .x and .y refer to the first and second columns by position, but you can also write a function which accepts arguments with the same names as your columns:

library(purrr)  # provides pwalk(); animalFrames is the tibble from the question

## using column locations (`~` notation) ---------------------
animalFrames |> 
  pwalk(
    .f = ~ write.csv(.y, file = paste0("test_", .x, ".csv"))
  )

## using column names & custom function ----------------------
## (longer, more robust, perhaps more readable) --------------

save_file <- function(animals, frames){
  write.csv(frames, file = paste0("test_", animals, ".csv"))
}

animalFrames |> pwalk(save_file)

Created on 2022-10-11 by the reprex package (v2.0.1)
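One caveat with the named-function variant: because every column is passed as a named argument, a tibble with extra columns would trigger an "unused argument" error unless the function also accepts `...`. A minimal sketch, assuming a hypothetical extra column legs and a hypothetical helper save_file2() that writes into a temporary directory:

```r
library(purrr)
library(tibble)

# Write into a fresh temporary directory so the sketch leaves no files behind
# in the working directory.
out_dir <- tempfile("animal_csv_")
dir.create(out_dir)

# Hypothetical data: the same shape as animalFrames plus one extra column.
frames_with_extra <- tibble(
  animals = c("sheep", "cow"),
  frames  = list(data.frame(x = 1:2), data.frame(x = 3:4)),
  legs    = c(4L, 4L)
)

# `...` absorbs any columns the function does not use (here: legs).
save_file2 <- function(animals, frames, ...) {
  write.csv(frames, file = file.path(out_dir, paste0("test_", animals, ".csv")))
}

pwalk(frames_with_extra, save_file2)
written <- sort(list.files(out_dir))
written
```

The `~` version does not have this issue, since it only ever looks at the first two columns by position.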

Upvotes: 0

Calum You

Reputation: 15072

I don't think you need do at all. Both of the following work for me. The first is the same as yours minus do; the second uses magrittr's convenient %$% operator to expose the column names to walk2 and avoid the .$ prefix. Note that if this is at the end of a pipe, it doesn't matter much whether you use walk2 or map2, since you don't care what's returned after this step.

NB: I also swapped out paste0 and write.csv for their tidyverse equivalents (str_c and write_csv) out of habit, but they're easily put back in.

library(tidyverse)
df0 <- tibble(x = 1:3, y = rnorm(3))
df1 <- tibble(x = 1:3, y = rnorm(3))
df2 <- tibble(x = 1:3, y = rnorm(3))

animalFrames <- tibble(animals = c('sheep', 'cow', 'horse'),
                       frames = list(df0, df1, df2))

animalFrames %>%
  walk2(
    .x = .$animals,
    .y = .$frames,
    .f = ~ write_csv(.y, str_c("test_", .x, ".csv"))
  )

library(magrittr)
#> 
#> Attaching package: 'magrittr'
#> The following object is masked from 'package:purrr':
#> 
#>     set_names
#> The following object is masked from 'package:tidyr':
#> 
#>     extract
animalFrames %$%
  walk2(
    .x = animals,
    .y = frames,
    .f = ~ write_csv(.y, str_c("test_", .x, ".csv"))
  )

Created on 2018-03-13 by the reprex package (v0.2.0).
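As a side note on the error in the question: walk2() invisibly returns its first input, which here is the character vector of animal names, not whatever write.csv() returns. That appears to be why do() complained about getting "character" rather than a data frame. A minimal sketch, using made-up data and a temporary directory rather than the question's objects:

```r
library(purrr)

# Hypothetical stand-ins for animalFrames$animals and animalFrames$frames.
animals <- c("sheep", "cow")
frames  <- list(data.frame(x = 1:2), data.frame(x = 3:4))

out_dir <- tempfile("walk2_demo_")
dir.create(out_dir)

# Capture the (invisible) return value of walk2().
res <- walk2(
  animals, frames,
  ~ write.csv(.y, file = file.path(out_dir, paste0("test_", .x, ".csv")))
)

identical(res, animals)  # the first input comes back unchanged
class(res)
```

So wrapping the call in do() hands do() a character vector, while calling walk2() directly in the pipe sidesteps the data-frame requirement entirely.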

Upvotes: 7
