Kozolovska

Reputation: 1119

List of tibbles as a column in a data.frame

I want to create a column that is a list of tibbles with different numbers of rows. The straightforward way fails. Example:

> x <- data.frame('a' = 1:2, 
+                 'b' = list(tibble('c' = 1:4, 'd' = 1:4),
+                            tibble('c' = 1:3, 'd' = 1:3)))
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 4, 3

I can avoid the error by wrapping the list in I(). However, when I then try to unnest the column, it stays nested:

> x <- data.frame('a' = 1:2, 
+                 'b' = I(list(tibble('c' = 1:4, 'd' = 1:4),
+                            tibble('c' = 1:3, 'd' = 1:3))))
> x %>% unnest(cols = b) 
# A tibble: 2 x 2
      a b               
  <int> <I<list>>       
1     1 <tibble [4 x 2]>
2     2 <tibble [3 x 2]>

How can I create a column that is a list of tibbles and that I can later unnest?

Upvotes: 1

Views: 961

Answers (2)

jpiversen

Reputation: 3212

It's much easier to create list columns in tibbles than in data.frames (see, e.g., Hadley's note on this here).

You can fix your code by switching from data.frame() to tibble():

library(dplyr)

x <- tibble(
  'a' = 1:2,
  'b' = list(
    tibble('c' = 1:4, 'd' = 1:4),
    tibble('c' = 1:3, 'd' = 1:3)
  )
)

x
#> # A tibble: 2 × 2
#>       a b               
#>   <int> <list>          
#> 1     1 <tibble [4 × 2]>
#> 2     2 <tibble [3 × 2]>

x %>% tidyr::unnest(b)
#> # A tibble: 7 × 3
#>       a     c     d
#>   <int> <int> <int>
#> 1     1     1     1
#> 2     1     2     2
#> 3     1     3     3
#> 4     1     4     4
#> 5     2     1     1
#> 6     2     2     2
#> 7     2     3     3

Created on 2022-03-31 by the reprex package (v2.0.1)
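If the data.frame has already been built with I(), one possible workaround is to strip the wrapper from the column afterwards. This is only a sketch: it assumes that the "AsIs" class added by I() is what stops unnest() from treating b as a list column, which the <I<list>> header in the question's output suggests.

```r
library(tibble)
library(tidyr)

# Same construction as in the question, with the list wrapped in I()
x <- data.frame(a = 1:2,
                b = I(list(tibble(c = 1:4, d = 1:4),
                           tibble(c = 1:3, d = 1:3))))

# Drop the AsIs class so that b is a plain list again
x$b <- unclass(x$b)

# With a plain list column, unnest() can expand the nested tibbles
z <- unnest(x, cols = b)
```

After this, z should have one row per row of the nested tibbles, with columns a, c and d.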

Upvotes: 2

user18309711

Reputation:

You can create the data.frame without the list column first and then add the list:

library(tibble)

x <- data.frame(a = 1:2)
x$b <- list(tibble('c' = 1:4, 'd' = 1:4),
            tibble('c' = 1:3, 'd' = 1:3))

Check the result:

str(x)
# 'data.frame': 2 obs. of  2 variables:
# $ a: int  1 2
# $ b:List of 2
#  ..$ : tibble [4 x 2] (S3: tbl_df/tbl/data.frame)
#  .. ..$ c: int  1 2 3 4
#  .. ..$ d: int  1 2 3 4
#  ..$ : tibble [3 x 2] (S3: tbl_df/tbl/data.frame)
#  .. ..$ c: int  1 2 3
#  .. ..$ d: int  1 2 3
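Since the question also asks about unnesting later, here is a short sketch of that step applied to a data.frame built this way. It assumes tidyr is installed; the name y is only illustrative.

```r
library(tibble)
library(tidyr)

# Build the data.frame first, then attach the list column
x <- data.frame(a = 1:2)
x$b <- list(tibble(c = 1:4, d = 1:4),
            tibble(c = 1:3, d = 1:3))

# unnest() expands each nested tibble and repeats `a` once per nested row
y <- unnest(x, cols = b)

nrow(y)   # 4 + 3 = 7 rows
names(y)  # "a" "c" "d"
```

Because b is a plain list here rather than an I() object, unnest() can expand it directly.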

Upvotes: 1
