Raivo Kolde

Reputation: 739

Printing intermediate results without breaking pipeline in tidyverse

Is there a command that can be added to a tidyverse pipeline without breaking the flow, but that produces a side effect, such as printing something out? The use case I have in mind is something like this. Given a pipeline

data %>%
  mutate(new_var = <some time consuming operation>) %>%
  mutate(new_var2 = <some other time consuming operation>) %>%
  ...

I would like to add some command to the pipeline that would not modify the end result, but would print out some progress or the state of things. Maybe something like this:

data %>%
  mutate(new_var = <some time consuming operation>) %>%
  command_x(print("first operation done")) %>%
  mutate(new_var2 = <some other time consuming operation>) %>%
  ...

Does such a command_x already exist?

Upvotes: 13

Views: 4353

Answers (3)

Jonas Lindeløv

Reputation: 5683

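For the specific case of printing an intermediate step in the pipeline, just use %>% print() %>%. This works because print() returns its argument (invisibly), so the data flows on to the next step. E.g.,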

mtcars %>%
  filter(cyl == 4) %>%
  print() %>%
  summarise(mpg = mean(mpg))

For a simple status message, either load tidylog with library(tidylog), which reports what each dplyr verb did, or do it manually:

pipe_message = function(.data, status) {message(status); .data}
mtcars %>%
  filter(cyl == 4) %>%
  pipe_message("first operation done") %>%
  select(cyl)
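If the status should describe the data itself rather than a fixed string, the same pattern extends naturally. The following is only a sketch: pipe_dim is a hypothetical helper name (not part of any package), and base subset() stands in for filter() so that only magrittr is needed.

```r
library(magrittr)  # provides %>%

# Hypothetical helper: report the dimensions of the data via message(),
# then return the data unchanged so the pipeline continues.
pipe_dim <- function(.data, label = "") {
  message(label, ": ", nrow(.data), " rows x ", ncol(.data), " cols")
  .data
}

result <- mtcars %>%
  subset(cyl == 4) %>%
  pipe_dim("after filter") %>%
  nrow()
```

Because message() writes to stderr, the progress output stays separate from anything the pipeline prints to stdout.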

See the answer by @MrFlick for a more general solution for non-print functions.

Upvotes: 16

GitHunter0

Reputation: 574

You can do it on the fly with an anonymous function:

mtcars %>% ( function(x){print(x); return(x)} ) %>% nrow()

Upvotes: 5

MrFlick

Reputation: 206242

You could easily write your own function:

pass_through <- function(data, fun) {fun(data); data}

And use it like this:

mtcars %>% pass_through(. %>% ncol %>% print) %>% nrow

Here we use the . %>% syntax to create an anonymous function. You could also write your own more explicitly with

mtcars %>% pass_through(function(x) print(ncol(x))) %>% nrow
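The same pass-through idea is also available as magrittr's tee operator, %T>%, which runs the right-hand side for its side effect and then forwards the original left-hand side. A minimal sketch, assuming magrittr itself is attached (dplyr re-exports %>% but not %T>%):

```r
library(magrittr)  # provides both %>% and %T>%

# %T>% evaluates the braced expression for its side effect only;
# the value passed on to nrow() is still mtcars, not the result of message().
n <- mtcars %T>%
  { message("columns: ", ncol(.)) } %>%
  nrow()
```

Here n is the row count of mtcars, exactly as it would be without the tee step.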

Upvotes: 9
