Reputation: 33
I want to store some variable names in a list-column of a tibble. I then want to use that column either to paste together the names of those variables, or to paste together the values of the columns those names refer to. All of this happens inside a function, and this is the only piece of hard-coding left, so I'd really like to find a way to solve it.
library("tidyverse")
myData <- tibble("c1" = c("a", "b", "c"),
                 "c2" = c("1", "2", "3"),
                 "c3" = c("A", "B", "C"),
                 factors = c(list(c("c1", "c2")), list(c("c2", "c3")), list(c("c1", "c2", "c3"))))
myData%>%mutate(factors1=interaction(!!!quos(factors),sep=":",lex.order=TRUE))
# A tibble: 3 x 5
c1 c2 c3 factors factors1
<chr> <chr> <chr> <list> <fct>
1 a 1 A <chr [2]> c1:c2:c1
2 b 2 B <chr [2]> c2:c3:c2
3 c 3 C <chr [3]> c1:c2:c3
This lets me concatenate the names of the variables, but as you can see, when one list element is longer than the others, the shorter ones are recycled (row 1 shows c1:c2:c1 instead of c1:c2).
For the second problem in which I would like to use the $factors column to specifically call the values of other columns, I can hardcode this like so:
myData%>%
mutate(factors2=interaction(!!!syms(c("c1","c2")),sep=":",lex.order=TRUE))
# A tibble: 3 x 5
c1 c2 c3 factors factors2
<chr> <chr> <chr> <list> <fct>
1 a 1 A <chr [2]> a:1
2 b 2 B <chr [2]> b:2
3 c 3 C <chr [3]> c:3
However if I try this:
myData%>%
mutate(factors2=interaction(!!!syms(factors),sep=":",lex.order=TRUE))
Error in lapply(.x, .f, ...) : object 'factors' not found
The same happens if I try to unlist the factors or use other rlang expressions. I have also tried nesting rlang expressions but so far haven't found one that works as I intended.
I feel like this should be possible, but so far I haven't found a Stack Overflow question or a tutorial that shows how, so maybe I'm on a wild goose chase. Thank you all for your time and help.
My code in full:
library("tidyverse")
myData <- tibble("c1" = c("a", "b", "c"),
                 "c2" = c("1", "2", "3"),
                 "c3" = c("A", "B", "C"),
                 factors = c(list(c("c1", "c2")), list(c("c2", "c3")), list(c("c1", "c2", "c3")))) %>%
  mutate(factors1 = interaction(!!!quos(factors), sep = ":", lex.order = TRUE)) %>%
  mutate(factors2 = interaction(!!!syms(factors), sep = ":", lex.order = TRUE))
My desired output is:
# A tibble: 3 x 6
c1 c2 c3 factors factors1 factors2
<chr> <chr> <chr> <list> <fct> <fct>
1 a 1 A <chr [2]> c1:c2 a:1
2 b 2 B <chr [2]> c2:c3 2:B
3 c 3 C <chr [3]> c1:c2:c3 c:3:C
Upvotes: 3
Views: 772
Reputation: 18681
Here is a method using map and imap:
library(tidyverse)
myData %>%
  mutate(factor1 = factors %>% map(~interaction(as.list(.), sep=":", lex.order = TRUE)) %>% unlist(),
         factor2 = factors %>% imap(~interaction(myData[.y, match(.x, names(myData))], sep=":", lex.order = TRUE)) %>% unlist())
For factor1, instead of splicing the arguments into dots, I pass a list into interaction.
For factor2, I match factors in each row with the names in myData and use the column index (match(.x, names(myData))) in combination with the row index (.y from imap) to subset the appropriate elements to feed into interaction.
Both factor1 and factor2 require an unlist because map and imap return lists.
Output:
# A tibble: 3 x 6
c1 c2 c3 factors factor1 factor2
<chr> <chr> <chr> <list> <fct> <fct>
1 a 1 A <chr [2]> c1:c2 a:1
2 b 2 B <chr [2]> c2:c3 2:B
3 c 3 C <chr [3]> c1:c2:c3 c:3:C
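To make the indexing above more concrete, here is a minimal sketch that walks through row 2 by hand. The variables x, y, cols and cell are illustrative names only (they stand in for .x, .y and the intermediate results inside imap), and the tibble is rebuilt so the snippet runs on its own.

```r
library(tibble)

myData <- tibble(
  c1 = c("a", "b", "c"),
  c2 = c("1", "2", "3"),
  c3 = c("A", "B", "C"),
  factors = list(c("c1", "c2"), c("c2", "c3"), c("c1", "c2", "c3"))
)

# Stand-ins for what imap() supplies on the second iteration:
x <- myData$factors[[2]]  # plays the role of .x: c("c2", "c3")
y <- 2L                   # plays the role of .y: the row index

# factor1 idea: a list of single names passed to interaction() as one argument
name_label <- as.character(interaction(as.list(x), sep = ":", lex.order = TRUE))

# factor2 idea: column positions from match(), then a one-row subset
cols <- match(x, names(myData))  # positions of c2 and c3 in myData
cell <- myData[y, cols]          # one-row tibble with the values "2" and "B"
value_label <- as.character(interaction(cell, sep = ":", lex.order = TRUE))

print(name_label)
print(value_label)
```

Because interaction accepts a single list argument (a one-row tibble is a list of columns), no splicing with !!! is needed in either case.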
Upvotes: 1
Reputation: 13691
Your first question can be addressed with the purrr::map and purrr::lift families of functions:
myData %>%
mutate( factors1 = map(factors, lift_dv(interaction, sep=":", lex.order=TRUE)) ) %>%
mutate_at( "factors1", lift(fct_c) )
# # A tibble: 3 x 5
# c1 c2 c3 factors factors1
# <chr> <chr> <chr> <list> <fct>
# 1 a 1 A <chr [2]> c1:c2
# 2 b 2 B <chr [2]> c2:c3
# 3 c 3 C <chr [3]> c1:c2:c3
The second question is trickier, because !!! causes the evaluation of its argument immediately, which can sometimes lead to unintuitive operator precedence inside a dplyr chain. The cleanest way is to define a standalone function that composes your interaction expressions:
f <- function(fct) {expr( interaction(!!!syms(fct), sep=":", lex.order=TRUE) )}
# Example usage
f( myData$factors[[1]] ) # interaction(c1, c2, sep = ":", lex.order = TRUE)
f( myData$factors[[2]] ) # interaction(c2, c3, sep = ":", lex.order = TRUE)
myData %>% mutate( e = map(factors, f) )
# # A tibble: 3 x 5
# c1 c2 c3 factors e
# <chr> <chr> <chr> <list> <list>
# 1 a 1 A <chr [2]> <language>
# 2 b 2 B <chr [2]> <language>
# 3 c 3 C <chr [3]> <language>
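As a small check on the point about !!! being resolved early, the sketch below shows that splicing succeeds when the character vector lives in an ordinary variable. The name fct_names is a hypothetical local variable standing in for one cell of the factors column; it is not part of the original code. Inside f the argument is exactly such a variable, which is presumably why wrapping the call in a function avoids the "object 'factors' not found" error.

```r
library(rlang)

# Hypothetical local variable, standing in for one cell of the factors column
fct_names <- c("c1", "c2")

# !!! is processed while expr() builds the call, so fct_names must already
# exist in the surrounding environment at this point
e <- expr(interaction(!!!syms(fct_names), sep = ":", lex.order = TRUE))

print(e)
```

The result is an unevaluated call; nothing has been looked up in myData yet, so it can be stored in a list-column and evaluated later.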
Unfortunately, we can't evaluate e directly, because it will feed the entire columns c1, c2, and c3 to the expressions, whereas you only want a single value that is in the same row as the expression. For this reason, we need to encapsulate columns c1 through c3 in a row-wise fashion.
X <- myData %>% mutate( e = map(factors, f) ) %>%
  rowwise() %>% mutate( d = list(tibble(c1,c2,c3)) ) %>% ungroup()
# # A tibble: 3 x 6
# c1 c2 c3 factors e d
# <chr> <chr> <chr> <list> <list> <list>
# 1 a 1 A <chr [2]> <language> <tibble [1 × 3]>
# 2 b 2 B <chr [2]> <language> <tibble [1 × 3]>
# 3 c 3 C <chr [3]> <language> <tibble [1 × 3]>
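To illustrate why the row-wise step matters, here is a minimal sketch that evaluates one of the composed expressions twice: once against the whole data and once against a single row. The names e2, whole and single are illustrative only, and the data is rebuilt without the factors column so the snippet is self-contained.

```r
library(rlang)
library(tibble)

myData <- tibble(
  c1 = c("a", "b", "c"),
  c2 = c("1", "2", "3"),
  c3 = c("A", "B", "C")
)

# The expression that belongs to row 2 (its factors cell is c("c2", "c3"))
e2 <- expr(interaction(!!!syms(c("c2", "c3")), sep = ":", lex.order = TRUE))

# Against the full data, c2 and c3 are whole columns: one value per row
whole <- as.character(eval_tidy(e2, data = myData))

# Against row 2 only, the same expression yields the single wanted value
single <- as.character(eval_tidy(e2, data = myData[2, ]))

print(whole)
print(single)
```

Pairing each expression with a one-row tibble, as column d does, gives eval_tidy exactly this single-row data mask.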
Now you have expressions in e that need to be applied to data in d, so it's just a simple map2 traversal from here. Putting everything together and cleaning up, we get:
myData %>%
mutate( factors1 = map(factors, lift_dv(interaction, sep=":", lex.order=TRUE)) ) %>%
mutate( e = map(factors, f) ) %>%
  rowwise() %>% mutate( d = list(tibble(c1,c2,c3)) ) %>% ungroup() %>%
mutate( factors2 = map2( e, d, rlang::eval_tidy ) ) %>%
mutate_at( vars(factors1,factors2), lift(fct_c) ) %>%
select( -e, -d )
# # A tibble: 3 x 6
# c1 c2 c3 factors factors1 factors2
# <chr> <chr> <chr> <list> <fct> <fct>
# 1 a 1 A <chr [2]> c1:c2 a:1
# 2 b 2 B <chr [2]> c2:c3 2:B
# 3 c 3 C <chr [3]> c1:c2:c3 c:3:C
Upvotes: 1