Reputation: 3195
Is there a way to pause a series of pipes to store a temporary variable that can be used later in the pipe sequence?
I found this question but I'm not sure that it was doing the same thing I am looking for.
Here's a sample dataframe:
library(dplyr)
library(tidyr)  # provides gather(), used below
set.seed(123)
df <- tibble(Grp = c("Apple","Boy","Cat","Dog","Edgar","Apple","Boy","Cat","Dog","Edgar"),
             a = sample(0:9, 10, replace = TRUE),
             b = sample(0:9, 10, replace = TRUE),
             c = sample(0:9, 10, replace = TRUE),
             d = sample(0:9, 10, replace = TRUE),
             e = sample(0:9, 10, replace = TRUE),
             f = sample(0:9, 10, replace = TRUE),
             g = sample(0:9, 10, replace = TRUE))
I am going to convert df to long format but, after having done so, I will need to use the number of rows df had before the gather.
This is what my desired output looks like. In this case, storing the number of rows before the pipe begins would look like:
n <- nrow(df)
df %>%
gather(var, value, -Grp) %>%
mutate(newval = value * n)
# A tibble: 70 x 4
Grp var value newval
<chr> <chr> <int> <int>
1 Apple a 2 20
2 Boy a 7 70
3 Cat a 4 40
4 Dog a 8 80
5 Edgar a 9 90
6 Apple a 0 0
7 Boy a 5 50
8 Cat a 8 80
9 Dog a 5 50
10 Edgar a 4 40
# ... with 60 more rows
In my real world problem, I have a long chain of pipes and it would be a lot easier if I could perform this action within the pipe structure. I would like to do something that looks like this:
df %>%
{ "n = nrow(.)" } %>% # temporary variable is created here but df is passed on
gather(var, value, -Grp) %>%
mutate(newval = value * n)
I could do something like the following, but it seems really sloppy.
df %>%
mutate(n = nrow(.)) %>%
gather(var, value, -Grp, -n) %>%
mutate(newval = value * mean(n))
Is there a way to do this or perhaps a good workaround?
Upvotes: 4
Views: 1747
Reputation: 887048
Here is an option with the %>>% pipe operator from pipeR:
library(pipeR)
library(dplyr)
library(tidyr)
df %>>%
(~ n = nrow(.)) %>%
gather(., var, value, -Grp) %>%
mutate(newval = value * n)
# A tibble: 70 x 4
# Grp var value newval
# <chr> <chr> <int> <int>
# 1 Apple a 2 20
# 2 Boy a 7 70
# 3 Cat a 4 40
# 4 Dog a 8 80
# 5 Edgar a 9 90
# 6 Apple a 0 0
# 7 Boy a 5 50
# 8 Cat a 8 80
# 9 Dog a 5 50
#10 Edgar a 4 40
# … with 60 more rows
Upvotes: 2
Reputation: 206197
You could use a code block for a local variable. This would look like
df %>%
{ n = nrow(.)
gather(., var, value, -Grp) %>%
mutate(newval = value * n)
}
Notice how we have to pass the . to gather as well here, and the pipe continues inside the block. But you could put other parts afterwards:
df %>%
{ n = nrow(.)
gather(., var, value, -Grp) %>%
mutate(newval = value * n)
} %>%
select(newval)
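For anyone on a newer tidyr, here is a minimal sketch of the same code-block pattern using pivot_longer(), which tidyr (1.0.0 and later) recommends over gather(). The three-row df below is made up purely for illustration and is not the data from the question. Note also that pivot_longer() orders the output row by row rather than column by column as gather() does.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the question's df (illustration only)
df <- tibble(Grp = c("Apple", "Boy", "Cat"),
             a = c(2L, 7L, 4L),
             b = c(1L, 0L, 5L))

result <- df %>%
  {
    # Row count captured while the data is still wide (3 rows here)
    n <- nrow(.)
    # The dot must be passed explicitly because braces suppress
    # the automatic first-argument insertion
    pivot_longer(., -Grp, names_to = "var", values_to = "value") %>%
      mutate(newval = value * n)
  } %>%
  select(Grp, var, newval)

print(result)
```

The braces make magrittr treat the whole block as a single step, so n is only needed inside it and the rest of the chain carries on from the closing brace as usual.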
Upvotes: 6