Reputation: 2015
I have the following tibble:
library(tidyverse)

set.seed(1234)

df <- tibble(
  x1 = letters[1:2],
  y1 = list(
    tibble(
      x2 = letters[3:4],
      y2 = list(
        tibble(
          x3 = seq(1, 100, 1),
          y3 = rnorm(100)
        )
      )
    )
  )
)
I need to access the innermost tibble (the one that contains x3 and y3) and apply a custom function to each of those data frames. For simplicity, let's say I need to apply base::mean() to y3.
My real data is much bigger than this, so I am looking for a clean and efficient way of doing it. Any ideas?
Upvotes: 2
Views: 55
Reputation: 3321
Could you just unnest your way down?
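# Name the list-columns explicitly; calling unnest() without `cols` is deprecated in tidyr >= 1.0
df %>%
  unnest(y1) %>%
  unnest(y2) %>%
  group_by(x2) %>%
  summarise(mean(y3))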
# A tibble: 2 x 2
  x2    `mean(y3)`
  <chr>      <dbl>
1 c         -0.157
2 d         -0.157
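Note that grouping by x2 alone pools the rows that come from different x1 values. If each combination should get its own summary, a minimal sketch (assuming tidyr >= 1.0 for the `cols` argument of unnest() and dplyr >= 1.0 for `.groups`; `by_both` and `y3_mean` are names made up for this example) could group by both keys:

```r
library(tidyverse)

set.seed(1234)

# Same example data as in the question
df <- tibble(
  x1 = letters[1:2],
  y1 = list(tibble(
    x2 = letters[3:4],
    y2 = list(tibble(x3 = seq(1, 100, 1), y3 = rnorm(100)))
  ))
)

# Flatten both levels, then summarise once per (x1, x2) combination
by_both <- df %>%
  unnest(y1) %>%
  unnest(y2) %>%
  group_by(x1, x2) %>%
  summarise(y3_mean = mean(y3), .groups = "drop")
```

With the example data this should give one row per x1/x2 pair, so four rows in total.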
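Not sure how you want your final data frame to look, but here's another suggestion that keeps the innermost tibbles as a list-column:
# map_dbl() returns a plain numeric column, so no second unnest() is needed
df %>%
  unnest(y1) %>%
  mutate(y3.average = map_dbl(y2, ~ mean(.x$y3)))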
# A tibble: 4 x 4
  x1    x2    y2                 y3.average
  <chr> <chr> <list>                  <dbl>
1 a     c     <tibble [100 × 2]>     -0.157
2 a     d     <tibble [100 × 2]>     -0.157
3 b     c     <tibble [100 × 2]>     -0.157
4 b     d     <tibble [100 × 2]>     -0.157
Upvotes: 2
Reputation: 8402
You need to locate the level at which you want to apply the function (which I do through bracket indexing), and then apply the function. Here [[1]] picks the first nested tibble in y1 and [[2]] picks its second column, y2. I hope this is transferable to what you need to do.
df[["y1"]][[1]][[2]] %>% lapply(function(x) mean(x$y3))
[[1]]
[1] 0.04124318
[[2]]
[1] 0.04124318
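The indexing above only reaches the nested tibble in the first row of df. A rough sketch of how the same idea might be extended to every row with two nested purrr maps follows; `my_summary` is a hypothetical stand-in for the custom function, and `means` is just a name picked for this example:

```r
library(tidyverse)

set.seed(1234)

# Same example data as in the question
df <- tibble(
  x1 = letters[1:2],
  y1 = list(tibble(
    x2 = letters[3:4],
    y2 = list(tibble(x3 = seq(1, 100, 1), y3 = rnorm(100)))
  ))
)

# Hypothetical stand-in for the custom function; it receives one innermost tibble
my_summary <- function(d) mean(d$y3)

# Outer map walks the tibbles stored in y1, inner map walks the tibbles stored in y2
means <- map(df$y1, function(inner) {
  set_names(map_dbl(inner$y2, my_summary), inner$x2)
})
names(means) <- df$x1
```

`means` is then a list with one element per x1 value, each a numeric vector named by x2, so `means$a[["c"]]` would be the summary for x1 = "a", x2 = "c".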
Upvotes: 0