hmhensen

Reputation: 3195

Temporarily store variable in series of pipes dplyr

Is there a way to pause a series of pipes to store a temporary variable that can be used later in the pipe sequence?

I found this question but I'm not sure that it was doing the same thing I am looking for.

Here's a sample dataframe:

library(dplyr)
library(tidyr)  # needed for gather()
set.seed(123)
df <- tibble(Grp = c("Apple","Boy","Cat","Dog","Edgar","Apple","Boy","Cat","Dog","Edgar"),
             a = sample(0:9, 10, replace = TRUE),
             b = sample(0:9, 10, replace = TRUE),
             c = sample(0:9, 10, replace = TRUE),
             d = sample(0:9, 10, replace = TRUE),
             e = sample(0:9, 10, replace = TRUE),
             f = sample(0:9, 10, replace = TRUE),
             g = sample(0:9, 10, replace = TRUE))

I am going to convert df to long format, but after doing so I will need to use the number of rows df had before the gather.

This is what my desired output looks like. In this case, storing the number of rows before the pipe begins would look like:

n <- nrow(df)

df %>% 
  gather(var, value, -Grp) %>% 
  mutate(newval = value * n)
# A tibble: 70 x 4
   Grp   var   value newval
   <chr> <chr> <int>  <int>
 1 Apple a         2     20
 2 Boy   a         7     70
 3 Cat   a         4     40
 4 Dog   a         8     80
 5 Edgar a         9     90
 6 Apple a         0      0
 7 Boy   a         5     50
 8 Cat   a         8     80
 9 Dog   a         5     50
10 Edgar a         4     40
# ... with 60 more rows
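
To illustrate why the row count has to be captured before reshaping, here is a minimal sketch on a small stand-in tibble (toy and its values are hypothetical, not the df above). Once gather() has run, nrow() reports the long row count rather than the original one.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for df: 2 rows, 3 value columns
toy <- tibble(Grp = c("Apple", "Boy"), a = 1:2, b = 3:4, c = 5:6)

n_before <- nrow(toy)                      # row count of the wide data
long     <- gather(toy, var, value, -Grp)  # one row per Grp/var pair
n_after  <- nrow(long)                     # 2 rows * 3 gathered columns

n_before  # 2
n_after   # 6
```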

In my real world problem, I have a long chain of pipes and it would be a lot easier if I could perform this action within the pipe structure. I would like to do something that looks like this:

df %>% 
  { "n = nrow(.)" } %>% # temporary variable is created here but df is passed on
  gather(var, value, -Grp) %>% 
  mutate(newval = value * n)

I could do something like the following, but it seems really sloppy.

df %>% 
  mutate(n = nrow(.)) %>% 
  gather(var, value, -Grp, -n) %>% 
  mutate(newval = value * mean(n))

Is there a way to do this or perhaps a good workaround?

Upvotes: 4

Views: 1747

Answers (2)

akrun

Reputation: 887048

Here is an option using the %>>% pipe operator from pipeR. The (~ n = nrow(.)) step assigns n as a side effect and passes df on unchanged:

library(pipeR)
library(dplyr)
library(tidyr)
df %>>% 
  (~ n = nrow(.)) %>% 
  gather(., var, value, -Grp) %>%
  mutate(newval = value * n)
# A tibble: 70 x 4
#   Grp   var   value newval
#   <chr> <chr> <int>  <int>
# 1 Apple a         2     20
# 2 Boy   a         7     70
# 3 Cat   a         4     40
# 4 Dog   a         8     80
# 5 Edgar a         9     90
# 6 Apple a         0      0
# 7 Boy   a         5     50
# 8 Cat   a         8     80
# 9 Dog   a         5     50
#10 Edgar a         4     40
# … with 60 more rows

Upvotes: 2

MrFlick

Reputation: 206197

You could use a code block to hold a local variable. This would look like:

df %>% {
  n <- nrow(.)
  gather(., var, value, -Grp) %>% 
    mutate(newval = value * n)
}

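Notice that we have to pass the . to gather explicitly here, because magrittr does not insert it as the first argument inside a braced block, and that the pipe continues inside the block. You can still chain further steps after the block: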

df %>% {
  n <- nrow(.)
  gather(., var, value, -Grp) %>% 
    mutate(newval = value * n)
} %>% 
  select(newval)
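
A closely related variant, sketched here as an assumption rather than as part of the original answer, is to pipe into a parenthesized anonymous function. The data then has an explicit name (d, chosen for illustration) instead of ., and n stays local to the function body. The toy tibble is hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for df: 2 rows, 2 value columns
toy <- tibble(Grp = c("Apple", "Boy"), a = 1:2, b = 3:4)

res <- toy %>%
  (function(d) {
    n <- nrow(d)  # row count captured before reshaping; local to this function
    d %>%
      gather(var, value, -Grp) %>%
      mutate(newval = value * n)
  }) %>%
  select(newval)

res$newval  # 2 4 6 8
```

The parentheses around the function matter: magrittr evaluates a parenthesized right-hand side and then applies the result to the left-hand side.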

Upvotes: 6
