bassen

Reputation: 525

Unnest vector in dataframe but add list indices column

Say I have a tibble such as this:

tibble(x=22:23, y=list(4:6,4:7))

# A tibble: 2 × 2
      x         y
  <int>    <list>
1    22 <int [3]>
2    23 <int [4]>

I would like to convert it into a new larger tibble by unnesting the lists (e.g. with unnest), which would give me a tibble with 7 rows. However, I want a new column added that tells me, for a given y-value in a row after unnesting, what the index of that y-value was when it was in list form. Here's what the above would look like after doing this:

# A tibble: 7 × 3
      x     y    index
  <int> <int>    <int>
1    22     4        1
2    22     5        2
3    22     6        3
4    23     4        1
5    23     5        2
6    23     6        3
7    23     7        4
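
For reference, the answers below operate on an object named df. Assuming that it is simply the example tibble shown above (the name df is not defined anywhere in the question), it can be created like this:

```r
library(tibble)

# Assumption: `df` in the answers is the example tibble from the question.
df <- tibble(x = 22:23, y = list(4:6, 4:7))
```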

Upvotes: 1

Answers (4)

akrun

Reputation: 887501

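Here is another version using lengths. It assumes df is the tibble from the question and turns the length of each list element into a running index with sequence:

library(dplyr)
library(tidyr)
df %>%
    unnest(y) %>%
    # sequence(c(3, 4)) expands to 1:3 followed by 1:4
    mutate(index = sequence(lengths(df$y)))
# A tibble: 7 x 3
#      x     y index
#  <int> <int> <int>
#1    22     4     1
#2    22     5     2
#3    22     6     3
#4    23     4     1
#5    23     5     2
#6    23     6     3
#7    23     7     4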

Upvotes: 3

pe-perry

Reputation: 2621

You can also try rowwise and do.

library(tidyverse)
tibble(x=22:23, y=list(4:6,4:7)) %>% 
    rowwise() %>% 
    do(tibble(x=.$x, y=unlist(.$y), index=seq_along(.$y)))

Upvotes: -1

akuiper

Reputation: 215047

You can map over the y column and bind the index to each element before unnesting:

df %>% 
    mutate(y = map(y, ~ data.frame(y=.x, index=seq_along(.x)))) %>% 
    unnest(y)

# A tibble: 7 x 3
#      x     y index
#  <int> <int> <int>
#1    22     4     1
#2    22     5     2
#3    22     6     3
#4    23     4     1
#5    23     5     2
#6    23     6     3
#7    23     7     4
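
Newer tidyr releases can also record the position directly while unnesting. A minimal sketch, assuming tidyr 1.0.0 or later, where unnest_longer() accepts the indices_include and indices_to arguments; df is rebuilt here from the question's example and res is just an illustrative name:

```r
library(tibble)
library(tidyr)

df <- tibble(x = 22:23, y = list(4:6, 4:7))

# indices_include = TRUE requests an index column even though the vectors
# are unnamed; indices_to sets the name of that column.
res <- unnest_longer(df, y, indices_include = TRUE, indices_to = "index")
```

For unnamed vectors the index column holds the integer position of each value within its original list element.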

Upvotes: 4

BENY

Reputation: 323316

By using unnest and group_by

library(tidyr)
library(dplyr)
df %>%
  unnest(y) %>%
  group_by(x) %>%
  mutate(index = row_number())

# A tibble: 7 x 3
# Groups:   x [2]
      x     y index
  <int> <int> <int>
1    22     4     1
2    22     5     2
3    22     6     3
4    23     4     1
5    23     5     2
6    23     6     3
7    23     7     4
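
Grouping by x numbers the elements correctly only as long as each x value identifies exactly one original row. If x can repeat, one possible workaround is to group by a temporary row identifier instead. This is a sketch on made-up data; df2 and row_id are hypothetical names introduced for the example:

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical data in which both rows share the same x value
df2 <- tibble(x = c(22L, 22L), y = list(4:6, 4:7))

res <- df2 %>%
  mutate(row_id = row_number()) %>%   # tag each original row before unnesting
  unnest(y) %>%
  group_by(row_id) %>%
  mutate(index = row_number()) %>%    # restart the count for each original row
  ungroup() %>%
  select(-row_id)                     # drop the helper column again
```

With group_by(x) on this data the index would run from 1 to 7, whereas grouping by row_id restarts it for each original row.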

Upvotes: 2
