Reputation: 189
I've got a dataset to which I want to apply the following operations:
1 & 2 work fine using purrr and dplyr, but in step 3 when I try to reference one of the columns created in step 2 it gives me an 'object not found' error. It works fine if I reference only columns in the 'data' list column that already existed before step 2. If I inspect the contents of the list column after step 2 everything referenced in step 3 is there so why is the newly created column not recognized?
Reprex
library(tidyverse)
mtcarsT <- mtcars %>% as_tibble() %>%
group_by(cyl, gear, vs) %>%
mutate(cp_flag = rep(c('C', 'P'), length.out = n())) %>%
# step 1, works fine.
nest() %>%
mutate(data = map(data, ~ .x %>%
group_by(mpg, disp, hp, drat, am, carb) %>%
# step 2, also works fine, generates new columns 'wt_C', 'wt_P', 'qsec_C', 'qsec_P' in 'data'
pivot_wider(names_from = cp_flag, values_from = wt:qsec) %>%
ungroup()))
# > mtcarsT[1,]
# A tibble: 1 x 4
# Groups: cyl, vs, gear [1]
# cyl vs gear data
# <dbl> <dbl> <dbl> <list>
# 1 6 0 4 <tibble [1 x 10]>
#
# > mtcarsT$data[[1]]
# A tibble: 1 x 10
# mpg disp hp drat am carb wt_C wt_P qsec_C qsec_P
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 21 160 110 3.9 1 4 2.62 2.88 16.5 17.0
# step 3A: this one works fine when only referencing columns in 'data' that already existed before step 2.
mtcarsT %>%
mutate(data = pmap(.l = list(a = data, b = vs, c = gear),
.f = function(a, b, c) a %>%
dplyr::mutate(vs_gear = carb - b + c)))
# > .Last.value$data[[1]]
# A tibble: 1 x 11
# mpg disp hp drat am carb wt_C wt_P qsec_C qsec_P vs_gear
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 21 160 110 3.9 1 4 2.62 2.88 16.5 17.0 8
# Step 3B: this is what I want to do, use the column 'wt_P' in the 'data' list column
# that was created in step 2 along with other columns 'vs' and 'gear' in the nested tibble,
# but it throws the error 'object wt_P not found'
mtcarsT %>%
mutate(data = pmap(.l = list(a = data, b = vs, c = gear),
.f = function(a, b, c) a %>%
dplyr::mutate(vs_gear = wt_P - b + c)))
# Error: object 'wt_P' not found
# Called from: mutate_impl(.data, dots, caller_env())
'''
I'm using R 3.6.2 x64 with tidyverse 1.3.0 inside RStudio 1.2.5033 on Windows 10.
Upvotes: 0
Views: 1042
Reputation: 389145
The example works as expected but the issue is present in step 1 itself.
To show that the example works you can change wt_P
to wt_C
and it works.
library(tidyverse)
mtcarsT %>%
mutate(data = pmap(.l = list(a = data, b = vs, c = gear),
.f = function(a, b, c)
a %>% dplyr::mutate(vs_gear = wt_C - b + c)))
# cyl vs gear data
# <dbl> <dbl> <dbl> <list>
# 1 6 0 4 <tibble [1 × 11]>
# 2 4 1 4 <tibble [8 × 11]>
# 3 6 1 3 <tibble [2 × 11]>
#....
The issue in step 1 is when you are doing
mtcars %>% as_tibble() %>%
group_by(cyl, gear, vs) %>%
mutate(cp_flag = rep(c('C', 'P'), length.out = n()))
there are certain groups with only 1 observation who do not get P
value at all.
mtcars %>% count(cyl, gear, vs)
# A tibble: 10 x 4
# cyl gear vs n
# <dbl> <dbl> <dbl> <int>
# 1 4 3 1 1
# 2 4 4 1 8
# 3 4 5 0 1
# 4 4 5 1 1
# 5 6 3 1 2
# 6 6 4 0 2
# 7 6 4 1 2
# 8 6 5 0 1
# 9 8 3 0 12
#10 8 5 0 2
Hence, wt_P
is not calculated for them and it returns an error whereas wt_C
does not. If you change order in rep
from c('C', 'P')
to c('P', 'C')
then you will get error for wt_C
and wt_P
would work as expected.
To add the missing column, we can do :
mtcars %>%
group_by(cyl, gear, vs) %>%
mutate(cp_flag = rep(c('C', 'P'), length.out = n())) %>%
nest() %>%
mutate(data = map(data, ~{
temp <- .x %>%
group_by(mpg, disp, hp, drat, am, carb) %>%
pivot_wider(names_from = cp_flag, values_from = wt:qsec,
values_fill = list(wt = NA, qsec = NA)) %>%
ungroup()
temp[setdiff(cols, names(temp))] <- NA;temp
}))
# cyl vs gear data
# <dbl> <dbl> <dbl> <list>
# 1 6 0 4 <tibble [1 × 10]>
# 2 4 1 4 <tibble [8 × 10]>
# 3 6 1 3 <tibble [2 × 10]>
# 4 8 0 3 <tibble [12 × 10]>
# 5 6 1 4 <tibble [2 × 10]>
# 6 4 1 3 <tibble [1 × 10]>
# 7 4 0 5 <tibble [1 × 10]>
# 8 4 1 5 <tibble [1 × 10]>
# 9 8 0 5 <tibble [2 × 10]>
#10 6 0 5 <tibble [1 × 10]>
So all of them have same number of columns.
Upvotes: 1