Reputation: 1472
I'm not sure how to create a reproducible dataset based on this situation. But below is the output from glimpse
, where glimpse identifies 2 columns with item
column is a data.frame
object.
What I want is to have a rectangular tibble, which if item
was just another tibble
, I can simply unnest
.
> glimpse(df)
Columns: 2
$ ID <chr> "A", "B", "C", "D",
$ item <df[,2]> <data.frame[25 x 2]>
> df
# A tibble: 149,826 x 2
ID item$link $value
<chr> <chr> <chr>
1 A~ https://ex.com/api/now/ 941b821e6f9121004d
2 B~ https://ex.com/api/now/ 6c5bdbd46fa9610080
3 C~ https://ex.com/api/now/ 941b821e6f9121004d
The current workaround I use is by converting the data.frame
column as tibble and bind it back to the rest of the table. But this is not ideal if I have many columns with this data.frame
type. Notice how the important column identifier which is item
is now lost as it only captures link
and value
.
If there're multiple columns with data.frame
type, this will generate an error since these columns will all be called link
and value
.
df <- bind_cols(df %>% dplyr::select(ID), as_tibble(dataset$item))
> df
# A tibble: 149,826 x 3
number link value
<chr> <chr> <chr>
1 A~ https://ex.com/api/now/ 941b821e6f9121004d
2 B~ https://ex.com/api/now/ 6c5bdbd46fa9610080
3 C~ https://ex.com/api/now/ 941b821e6f9121004d
Upvotes: 2
Views: 89
Reputation: 18581
I think the approach below can solve your problem. Please see the inline comments.
library(dplyr)
library(tidyr)
library(stringr)
# a data.frame
dat1 <- data.frame(link = rep("https://ex.com/api/now/", 3),
value = c("941b821e6f9121004d",
"6c5bdbd46fa9610080",
"941b821e6f9121004d"))
# another data.frame
dat2 <- data.frame(link = rep("https://another/link", 3),
value = c("843rq4nrfhvr843938",
"35rth165bft1tsv15d",
"435grg53th1ht51gte"))
# the example data containing two columns which are data.frames
dat <- tibble(id = c("A","B", "C"),
item1 = dat1,
item2 = dat2)
# lets have a look
dat %>% glimpse
#> Rows: 3
#> Columns: 3
#> $ id <chr> "A", "B", "C"
#> $ item1 <df[,2]> <data.frame[3 x 2]>
#> $ item2 <df[,2]> <data.frame[3 x 2]>
# use chop to bring the data.frame columns in a nested format
dat_chop <- dat %>% chop(c(item1, item2))
# if you need one specific colum just unnest it:
dat_chop %>% unnest(item1)
#> # A tibble: 3 x 4
#> id link value item2
#> <chr> <chr> <chr> <list<df[,2]>>
#> 1 A https://ex.com/api/now/ 941b821e6f9121004d [1 × 2]
#> 2 B https://ex.com/api/now/ 6c5bdbd46fa9610080 [1 × 2]
#> 3 C https://ex.com/api/now/ 941b821e6f9121004d [1 × 2]
# if you need all columns, unnest it and specify a `names_repair` strategy:
dat_chop %>%
unnest(c(item1, item2), names_repair = "universal") %>%
# the above will suffice if you don't mind:
# (1) the dots `...` in the column names
# (2) the consecutive numbering of the variables
# both can be changed with the following call to `rename_with`
rename_with(~ str_replace_all(.x, "\\.\\.\\.", "") %>%
str_replace_all(., "\\d$",
function(x) ((as.numeric(x)%/% 2))),
.cols = -id)
# Now each number refers to the item number it originated from!
#> New names:
#> * link -> link...2
#> * value -> value...3
#> * link -> link...4
#> * value -> value...5
#> # A tibble: 3 x 5
#> id link1 value1 link2 value2
#> <chr> <chr> <chr> <chr> <chr>
#> 1 A https://ex.com/api/… 941b821e6f91210… https://another/l… 843rq4nrfhvr84…
#> 2 B https://ex.com/api/… 6c5bdbd46fa9610… https://another/l… 35rth165bft1ts…
#> 3 C https://ex.com/api/… 941b821e6f91210… https://another/l… 435grg53th1ht5…
Created on 2021-01-13 by the reprex package (v0.3.0)
Upvotes: 1