Unnesting a complex dataframe

Question

I'm trying to unpack a dataframe with columns that contain sub dataframes in each row.

The problem is that the sub dataframes in each column have different sizes (e.g. 1x3, 2x3 and 2x2). Moreover, one column in a sub dataframe (Conversions.Value) has different data types from row to row (num and char). During unpacking I get error messages such as 'can't recycle input of size 3 to size 2.' or 'Can't combine ..1$Conversions$Value and ..6$Conversions$Value.' The structure is shown below:

df <- structure(list(
  Conversions = list(
    structure(list(Field = "Volume", Unit = "m3", Value = 338L),
              class = "data.frame", row.names = 1L),
    structure(list(Field = "Volume", Unit = "m3", Value = 450L),
              class = "data.frame", row.names = 1L)),

  Categories = list(
    structure(list(
      CategorySystem = c("Base", NA),
      Title = c("Mineral materials and glass (excluding concrete)",
                "213.7 Kevytbetoni, Aerated concrete"),
      ClassificationType = c(NA, "Talo2000")),
      class = "data.frame", row.names = 1:2),
    structure(list(
      CategorySystem = c("Base", NA),
      Title = c("Mineral materials and glass (excluding concrete)",
                "213.7 Kevytbetoni, Aerated concrete"),
      ClassificationType = c(NA, "Talo2000")),
      class = "data.frame", row.names = 1:2)),

  DataItems.DataValueItems = list(
    structure(list(
      DataModuleCode = c("A1-A3 Conservative", "A1-A3 Typical"),
      Value = c(0.43, 0.36)),
      class = "data.frame", row.names = 1:2),
    structure(list(
      DataModuleCode = c("A1-A3 Conservative", "A1-A3 Typical"),
      Value = c(0.41, 0.34)),
      class = "data.frame", row.names = 1:2)),

  ResourceId = c(7000000995, 7000000996)),
  row.names = 1:2, class = "data.frame")

So far I've tried:

library(dplyr)   # mutate(), across(), %>%
library(tidyr)   # unnest_wider(), unnest_longer()
library(readr)   # type_convert()

# Attempt 1: runs, but multiple observations end up as a list
# in a single row, and the lists have different lengths
unnest_wider(df, col = 1:3, names_repair = "universal")

# Attempt 2
unnest_longer(df, col = 1:3, names_repair = "universal") %>%
  mutate(across(.fns = as.character)) %>%
  type_convert()
# ERROR: Can't combine `..1$Conversions$Value` and `..6$Conversions$Value`.

# Attempt 3: convert Conversions to character first
df$Conversions <- lapply(df$Conversions, FUN = as.character)
unnest_longer(df, col = 1:3, names_repair = "universal") %>%
  mutate(across(.fns = as.character)) %>%
  type_convert()
# ERROR: In row 1, can't recycle input of size 3 to size 2.

Ideally, this is what the outcome would look like.

EDIT: rbindlist worked, but only when applied to each column separately. That way I lose the primary identifier of each row (ResourceId), so the results can no longer be joined back together.

rbindlist(lapply(df$Conversions, as.data.frame.list), fill=TRUE)
rbindlist(lapply(df$Categories, as.data.frame.list), fill=TRUE)
rbindlist(lapply(df$DataItems.DataValueItems, as.data.frame.list), fill=TRUE)

How do I add the ResourceId to the sub dataframes in each column, so that applying rbindlist afterwards gives a result with a column containing the respective ResourceId values?
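One possible way to do this, as a minimal sketch rather than a definitive answer: name each list element by its ResourceId and let rbindlist() write those names into an id column through its idcol argument. The helper bind_with_id is a hypothetical name introduced here, and the small df below is only a stand-in for the structure posted above. The sketch assumes ResourceId is numeric; if Value mixes numbers and text across rows, converting it to character first may still be needed.

```r
library(data.table)

# Stand-in for the posted data: list columns of data frames plus ResourceId.
df <- data.frame(ResourceId = c(7000000995, 7000000996))
df$Conversions <- list(
  data.frame(Field = "Volume", Unit = "m3", Value = 338L),
  data.frame(Field = "Volume", Unit = "m3", Value = 450L)
)
df$DataItems.DataValueItems <- list(
  data.frame(DataModuleCode = c("A1-A3 Conservative", "A1-A3 Typical"),
             Value = c(0.43, 0.36)),
  data.frame(DataModuleCode = c("A1-A3 Conservative", "A1-A3 Typical"),
             Value = c(0.41, 0.34))
)

# Hypothetical helper: bind one list column and keep the row identifier.
bind_with_id <- function(df, col, id = "ResourceId") {
  parts <- df[[col]]
  # Name each sub data frame by its id; rbindlist() copies the names
  # into the column given by `idcol`.
  names(parts) <- format(df[[id]], scientific = FALSE, trim = TRUE)
  out <- rbindlist(parts, fill = TRUE, idcol = id)
  # List names are character, so convert back (assumes a numeric id).
  set(out, j = id, value = as.numeric(out[[id]]))
  out
}

conversions <- bind_with_id(df, "Conversions")
data_items  <- bind_with_id(df, "DataItems.DataValueItems")
```

Each resulting table carries a ResourceId column, so the tables can be joined back to the original rows or to each other with merge(), whatever the size of the individual sub dataframes.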
