Wasabi
Wasabi

Reputation: 3071

Expand a data.frame/table

Let's say I have a data.frame/table which I want to "expand", such that each row transforms into a sequence of repeat rows (which can be described as a group), except for a new column which will have the same values in all groups.

So, taking the following dataframe, we want to expand it by column id, such that each row is now repeated, but with a new column q which repeats from 1-4 for each group.

The way I've found to do this is to build a second dataframe based on the first one's column id and with the new column values, and then joining it to the original:

library(dplyr)

set.seed(42)

a <- data.frame(id = 1:5,
                value = runif(5))
a
#>   id     value
#> 1  1 0.9148060
#> 2  2 0.9370754
#> 3  3 0.2861395
#> 4  4 0.8304476
#> 5  5 0.6417455

b <- data.frame(id = rep(a$id, each = 4),
                q = 1:4)

left_join(a, b, by = "id")
#>    id     value q
#> 1   1 0.9148060 1
#> 2   1 0.9148060 2
#> 3   1 0.9148060 3
#> 4   1 0.9148060 4
#> 5   2 0.9370754 1
#> 6   2 0.9370754 2
#> 7   2 0.9370754 3
#> 8   2 0.9370754 4
#> 9   3 0.2861395 1
#> 10  3 0.2861395 2
#> 11  3 0.2861395 3
#> 12  3 0.2861395 4
#> 13  4 0.8304476 1
#> 14  4 0.8304476 2
#> 15  4 0.8304476 3
#> 16  4 0.8304476 4
#> 17  5 0.6417455 1
#> 18  5 0.6417455 2
#> 19  5 0.6417455 3
#> 20  5 0.6417455 4

Created on 2020-01-27 by the reprex package (v0.3.0)

Is there a more direct way of doing this? In my example above I've used dplyr, but I can just as easily adopt data.table syntax if that's easier.

Upvotes: 3

Views: 286

Answers (3)

Peter H.
Peter H.

Reputation: 2164

This is easily accomplished with the function crossing from the tidyr package.

library(tidyr)

a <- tibble(
  id = 1:5, value = runif(5)
)

crossing(a, q = 1:4)
#> # A tibble: 20 x 3
#>       id value     q
#>    <int> <dbl> <int>
#>  1     1 0.222     1
#>  2     1 0.222     2
#>  3     1 0.222     3
#>  4     1 0.222     4
#>  5     2 0.262     1
#>  6     2 0.262     2
#>  7     2 0.262     3
#>  8     2 0.262     4
#>  9     3 0.284     1
#> # … with 10 more rows

Created on 2020-01-27 by the reprex package (v0.3.0)

Upvotes: 3

s_baldur
s_baldur

Reputation: 33743

A data.table alternative:

setDT(a)
a[CJ(id = id, q = 1:4), on = .(id)]

Base R:

expand_grid(a, q = 1:4)

Upvotes: 2

akrun
akrun

Reputation: 887951

If we need to replicate the rows, it can be easily done with uncount

library(dplyr)
library(tidyr)
library(data.table)
uncount(a, 4) %>% 
     mutate(q = rowid(id))

Or another option is create a list column of 'q' and then unnest

a %>% 
   mutate(q = list(1:4)) %>%
   unnest(q)

Or in base R

transform(a[rep(seq_len(nrow(a)), 4), ], q = sequence(table(id)))

Or using data.table

library(data.table)
setDT(a)[, .(q = 1:4),.(id, value)]

Or replicate the rows first

setDT(a)[rep(seq_len(.N), .N)][, q := rep(1:4, length.out = .N)][]

Upvotes: 3

Related Questions