jessica

Reputation: 1355

Separating embedded/nested data frames in R

I have a data frame that has another data frame embedded within it:

class(data)
[1] "dfidx_mlogit" "dfidx"        "data.frame"   "mlogit.data"

I am trying to separate the two data frames: one that holds the pertinent data on health and education, and the other, called 'idx', which holds the person's id information.

How do I completely separate the two data frames?

Here is the data:

data <- structure(list(EDUC = c(4L, 4L, 4L, 4L), HEALTH = c(3L, 3L, 3L, 
3L), idx = structure(list(chid = c(1L, 1L, 1L, 1L), unique_id = c(3000175513, 
3000175513, 3000175513, 3000175513), alt = structure(1:4, .Label = c("Bicycle", 
"Car", "Metro", "Walking"), class = "factor")), ids = c(1, 1, 
2), row.names = c(NA, 4L), class = c("idx", "data.frame"))), row.names = c(NA, 
4L), class = c("dfidx_mlogit", "dfidx", "data.frame", "mlogit.data"
), idx = structure(list(chid = c(1L, 1L, 1L, 1L), unique_id = c(3000175513, 
3000175513, 3000175513, 3000175513), alt = structure(1:4, .Label = c("Bicycle", 
"Car", "Metro", "Walking"), class = "factor")), ids = c(1, 1, 
2), row.names = c(NA, 4L), class = c("idx", "data.frame")))

Upvotes: 1

Views: 94

Answers (2)

Ronak Shah

Reputation: 388982

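A general solution would be to separate the columns based on their class: keep the columns that are not data frames in one object and take the nested data frame as the other.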

data1 <- Filter(function(x) all(class(x) != "data.frame"), data)
data2 <- data$idx
#Or maybe we can generalise this as well
#data2 <- Filter(function(x) any(class(x) == "data.frame"), data)
str(data1)

#Classes ‘dfidx_mlogit’, ‘dfidx’, ‘mlogit.data’ and 'data.frame': 4 obs. of  2 variables:
# $ EDUC  : int  4 4 4 4
# $ HEALTH: int  3 3 3 3

str(data2)
#Classes ‘idx’ and 'data.frame':    4 obs. of  3 variables:
# $ chid     : int  1 1 1 1
# $ unique_id: num  3e+09 3e+09 3e+09 3e+09
# $ alt      : Factor w/ 4 levels "Bicycle","Car",..: 1 2 3 4
# - attr(*, "ids")= num [1:3] 1 1 2
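
The same class-based split can also be written with `is.data.frame()` instead of comparing class strings. This is only a minimal sketch on hypothetical toy data (`toy`, `plain`, `nested` and `idx_only` are made-up names, not part of the question), and it assumes the dfidx package is not loaded, so ordinary data frame subsetting applies:

```r
# Hypothetical toy data shaped like the question's object:
# two ordinary columns plus one data-frame column named idx.
toy <- data.frame(EDUC = c(4L, 4L), HEALTH = c(3L, 3L))
toy$idx <- data.frame(chid = c(1L, 1L), alt = factor(c("Car", "Metro")))

# One logical per column: TRUE where the column is itself a data frame
is_df <- vapply(toy, is.data.frame, logical(1))

plain    <- toy[!is_df]    # ordinary columns only
nested   <- toy[is_df]     # still a data frame that wraps idx as a column
idx_only <- toy[["idx"]]   # the nested data frame itself

str(plain)
str(idx_only)
```

Note that `toy[is_df]` (like the commented-out `Filter()` call above) keeps the outer data frame around the nested column, so `[[` or `$` is still needed to get at `idx` itself.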

Upvotes: 1

akrun

Reputation: 887108

If we want to separate the datasets, it is the 'idx' column that holds the nested 'data.frame'. We can pull that column out to create a new object, then drop it (and the matching 'idx' attribute) from the original:

library(dplyr)
data2 <- data %>% 
              pull(idx) 
data1 <- data %>% 
             as_tibble %>% 
             select(-idx)
attr(data1, "idx") <- NULL

Checking the structure:

str(data1)
#tibble [4 × 2] (S3: tbl_df/tbl/data.frame)
# $ EDUC  : int [1:4] 4 4 4 4
# $ HEALTH: int [1:4] 3 3 3 3
str(data2)
#Classes ‘idx’ and 'data.frame':    4 obs. of  3 variables:
# $ chid     : int  1 1 1 1
# $ unique_id: num  3e+09 3e+09 3e+09 3e+09
# $ alt      : Factor w/ 4 levels "Bicycle","Car",..: 1 2 3 4
# - attr(*, "ids")= num [1:3] 1 1 2

Or, in base R:

data2 <- data$idx
class(data2) <- 'data.frame'
data1 <- data[1:2]

Checking the structure:

str(data1)
#Classes ‘dfidx_mlogit’, ‘dfidx’, ‘mlogit.data’ and 'data.frame':   4 obs. of  2 variables:
# $ EDUC  : int  4 4 4 4
# $ HEALTH: int  3 3 3 3



str(data2)
#'data.frame':  4 obs. of  3 variables:
# $ chid     : int  1 1 1 1
# $ unique_id: num  3e+09 3e+09 3e+09 3e+09
# $ alt      : Factor w/ 4 levels "Bicycle","Car",..: 1 2 3 4
# - attr(*, "ids")= num [1:3] 1 1 2
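
In the base R version, `data1` still carries the dfidx-related classes, as the `str()` output shows. If a plain data frame is wanted, the class can be reset in the same way as for `data2`. A rough sketch on hypothetical toy data follows; the `"dfidx"` class is only mimicked here, and the dfidx package is assumed not to be loaded:

```r
# Hypothetical toy object: ordinary columns, a nested idx column,
# and an extra class in front of "data.frame" to mimic the question's data.
toy <- data.frame(EDUC = c(4L, 4L), HEALTH = c(3L, 3L))
toy$idx <- data.frame(chid = c(1L, 1L), alt = factor(c("Car", "Metro")))
class(toy) <- c("dfidx", "data.frame")

# Drop the nested column by name rather than by position
data1 <- toy[setdiff(names(toy), "idx")]

# Reset to a plain data frame and clear any leftover idx attribute
class(data1) <- "data.frame"
attr(data1, "idx") <- NULL

str(data1)
```

Selecting by name with `setdiff()` avoids depending on the nested column being last, which `data[1:2]` quietly assumes.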

Upvotes: 2
