Format nested data.frame grouped by ids using tidyverse

Question

I have got a nested datame with two ids per row looking like:

df <- data.frame(
        sample = rep(paste0("s", 1:5), each = 5),
        ID1 = paste0("id1.", 1:5),
        ID2 = paste0("id2.", 1:5),
        counts = rep(1:5, each = 5)) %>%
    arrange(ID1) %>%
    group_by(ID1, ID2) %>% nest

I would like to get a dataframe for further analyses with the first row giving samples and in the next columns giving counts (per ids) with merged ids as colnames:

df3 <- data.frame(
    sample = paste0("s", 1:5),
    "id1.1|id2.1" = 1:5,
    "id1.2|id2.2" = 1:5,
    "id1.3|id2.3" = 1:5,
    "id1.4|id2.4" = 1:5,
    "id1.5|id2.5" = 1:5)

I have already started formating:

df2 <- df %>% 
    mutate(sample = data %>% map(pull, sample)) %>%
    mutate(counts = data %>% map(pull, counts))

However I am not sure what would be an elegant approach to continue.

Darren Tsai · Accepted Answer

A tidyr solution, which unnest the nested column and then pivot to wide format.

library(tidyr)

df %>%
  unite(ID, ID1, ID2) %>%
  unnest(data) %>%
  pivot_wider(names_from = ID, values_from = counts)

# # A tibble: 5 x 6
#   sample id1.1_id2.1 id1.2_id2.2 id1.3_id2.3 id1.4_id2.4 id1.5_id2.5
#                                       
# 1 s1               1           1           1           1           1
# 2 s2               2           2           2           2           2
# 3 s3               3           3           3           3           3
# 4 s4               4           4           4           4           4
# 5 s5               5           5           5           5           5

Or starting with your work

df %>% 
  ungroup() %>% 
  mutate(sample = data %>% map("sample"),
         counts = data %>% map("counts"), .keep = "unused") %>% 
  unite(ID, ID1, ID2) %>%
  unnest(-ID) %>%
  pivot_wider(names_from = ID, values_from = counts)

Note that mutate(sample = data %>% map("sample") is a shortcut of mutate(sample = data %>% map(pull, sample)), which is a feature of map().

Format nested data.frame grouped by ids using tidyverse

Answers (2)

Related Questions