Reputation: 4201
I've seen many similar questions, but couldn't adapt them to my situation. I have data that comes as a nested list, and I want to convert it to a data frame in a certain way.
my_data_object <-
list(my_variables = list(
age = list(
type = "numeric",
originType = "slider",
originSettings = structure(list(), .Names = character(0)),
originIndex = 5L,
title = "what is your age?",
valueDescriptions = NULL
),
med_field = list(
type = "string",
originType = "choice",
originSettings = structure(list(), .Names = character(0)),
originIndex = 6L,
title = "what medical branch are you at?",
valueDescriptions = list(card = "Cardiology", ophth = "Ophthalmology",
derm = "Dermatology")
),
covid_vaccine = list(
type = "string",
originType = "choice",
originSettings = structure(list(), .Names = character(0)),
originIndex = 8L,
title = "when do you plan to get vaccinated?",
valueDescriptions = list(
next_mo = "No later than next month",
within_six_mo = "No later than six months from now",
never = "I will not get vaccinated"
)
)
))
Desired Output:
  var_name      type    originType title
  <chr>         <chr>   <chr>      <chr>
1 age           numeric slider     what is your age?
2 med_field     string  choice     what medical branch are you at?
3 covid_vaccine string  choice     when do you plan to get vaccinated?
library(tibble)
library(tidyr)
my_data_object %>%
enframe() %>%
unnest_longer(value) %>%
unnest(value)
## # A tibble: 18 x 3
## name value value_id
## <chr> <named list> <chr>
## 1 my_variables <chr [1]> age
## 2 my_variables <chr [1]> age
## 3 my_variables <named list [0]> age
## 4 my_variables <int [1]> age
## 5 my_variables <chr [1]> age
## 6 my_variables <NULL> age
## 7 my_variables <chr [1]> med_field
## 8 my_variables <chr [1]> med_field
## 9 my_variables <named list [0]> med_field
## 10 my_variables <int [1]> med_field
## 11 my_variables <chr [1]> med_field
## 12 my_variables <named list [3]> med_field
## 13 my_variables <chr [1]> covid_vaccine
## 14 my_variables <chr [1]> covid_vaccine
## 15 my_variables <named list [0]> covid_vaccine
## 16 my_variables <int [1]> covid_vaccine
## 17 my_variables <chr [1]> covid_vaccine
## 18 my_variables <named list [3]> covid_vaccine
I'm trying to get this using tidyverse functions, but so far it seems I'm not headed in the right direction. I would be grateful for guidance.
Unlike the example data I originally provided, my real data comes in a slightly different hierarchy. I thought this would be simple to generalize once I had the method, but it turns out it's not. Suppose the data comes as follows, where I only care about the my_variables sub-list.
my_data_object_2 <-
list(
other_variables = list(
whatever_var_1 = list(
type = "numeric",
originType = "slider",
originSettings = structure(list(), .Names = character(0)),
originIndex = 5L,
title = "blah question",
valueDescriptions = NULL
)
),
my_variables = list(
age = list(
type = "numeric",
originType = "slider",
originSettings = structure(list(), .Names = character(0)),
originIndex = 5L,
title = "what is your age?",
valueDescriptions = NULL
),
med_field = list(
type = "string",
originType = "choice",
originSettings = structure(list(), .Names = character(0)),
originIndex = 6L,
title = "what medical branch are you at?",
valueDescriptions = list(card = "Cardiology", ophth = "Ophthalmology",
derm = "Dermatology")
),
covid_vaccine = list(
type = "string",
originType = "choice",
originSettings = structure(list(), .Names = character(0)),
originIndex = 8L,
title = "when do you plan to get vaccinated?",
valueDescriptions = list(
next_mo = "No later than next month",
within_six_mo = "No later than six months from now",
never = "I will not get vaccinated"
)
)
)
)
So how could I "zoom in" on (extract) my_variables and only then get the table I specified in "Desired Output" above?
Upvotes: 2
Views: 361
Reputation: 269854
Iterate over my_data_object, tibblifying the indicated columns and putting it all together using map_dfr (or maybe fun(my_data_object$my_variables) is sufficient, depending on what the general case is). There are no missing fields in the example data, but if any of the 3 spec fields can be missing, then add .default = NA as an lcol_chr argument to that field's spec.
library(purrr)
library(tibblify)
spec <- lcols(
lcol_chr("type"),
lcol_chr("originType"),
lcol_chr("title")
)
fun <- function(x) cbind(var_name = names(x), tibblify(x, spec))
map_dfr(my_data_object, fun)
giving:
var_name type originType title
1 age numeric slider what is your age?
2 med_field string choice what medical branch are you at?
3 covid_vaccine string choice when do you plan to get vaccinated?
Depending on what the general case is, this simplification by @mgirlich (which is similar to the alternative in the introduction to this answer) may work. spec is from above.
library(tibblify)
cbind(
var_name = names(my_data_object[[1]]),
tibblify(my_data_object[[1]], spec)
)
Upvotes: 2
Reputation: 389125
You can flatten the object, then use enframe and unnest_wider to create new columns.
library(tidyverse)
my_data_object %>%
flatten() %>%
tibble::enframe() %>%
unnest_wider(value)
#  name          type    originType originIndex title                               valueDescriptions
#  <chr>         <chr>   <chr>            <int> <chr>                               <list>
#1 age           numeric slider               5 what is your age?                   <NULL>
#2 med_field     string  choice               6 what medical branch are you at?     <named list [3]>
#3 covid_vaccine string  choice               8 when do you plan to get vaccinated? <named list [3]>
You can then drop the columns that you don't need.
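As a rough sketch of that last step, the pipeline can end with dplyr's select to keep only the columns from the desired output. The `vars` list below is a cut-down stand-in for my_data_object$my_variables (values copied from the question), and renaming the first column through enframe(name = "var_name") is an assumption about the column name you want.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Cut-down stand-in for my_data_object$my_variables (illustrative only)
vars <- list(
  age = list(
    type = "numeric", originType = "slider", originIndex = 5L,
    title = "what is your age?", valueDescriptions = NULL
  ),
  med_field = list(
    type = "string", originType = "choice", originIndex = 6L,
    title = "what medical branch are you at?",
    valueDescriptions = list(card = "Cardiology", ophth = "Ophthalmology",
                             derm = "Dermatology")
  )
)

result <- vars %>%
  enframe(name = "var_name") %>%   # one row per variable, name stored in var_name
  unnest_wider(value) %>%          # one column per field of each sub-list
  select(var_name, type, originType, title)  # keep only the wanted columns

result
```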
To use only my_data_object_2$my_variables:
my_data_object_2$my_variables %>%
tibble::enframe() %>%
unnest_wider(value)
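If only a few fields are needed, tidyr's hoist may be a possible alternative to unnest_wider, since it pulls out just the named components instead of widening everything. This is a sketch of a different technique than the one shown above, not part of the original answer; the list below is a shortened stand-in for my_data_object_2, and the var_name column name is an assumption.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Shortened stand-in for my_data_object_2 from the question (illustrative only)
my_data_object_2 <- list(
  other_variables = list(
    whatever_var_1 = list(type = "numeric", originType = "slider",
                          originIndex = 5L, title = "blah question")
  ),
  my_variables = list(
    age = list(type = "numeric", originType = "slider",
               originIndex = 5L, title = "what is your age?"),
    med_field = list(type = "string", originType = "choice",
                     originIndex = 6L, title = "what medical branch are you at?")
  )
)

hoisted <- my_data_object_2$my_variables %>%   # "zoom in" on the sub-list first
  enframe(name = "var_name") %>%
  hoist(value, type = "type", originType = "originType", title = "title") %>%
  select(-value)                               # drop the leftover fields

hoisted
```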
Upvotes: 2
Reputation: 73252
Use lapply to select the specific fields from each sub-list, then rbind the results.
res <- do.call(rbind.data.frame,
lapply((my_data_object)[[1]], `[`, c("type", "originType", "title")))
res
# type originType title
# age numeric slider what is your age?
# med_field string choice what medical branch are you at?
# covid_vaccine string choice when do you plan to get vaccinated?
If you want the row names as the first column, do:
`rownames<-`(cbind(var=rownames(res), res), NULL)
# var type originType title
# 1 age numeric slider what is your age?
# 2 med_field string choice what medical branch are you at?
# 3 covid_vaccine string choice when do you plan to get vaccinated?
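For the nested case from the question's edit, the same base R idea might be applied after extracting the sub-list by name. The following is a minimal sketch under that assumption; the list is a shortened stand-in for my_data_object_2, and `wanted`, `vars`, `rows` and `res2` are names introduced here for illustration.

```r
# Shortened stand-in for my_data_object_2 from the question (illustrative only)
my_data_object_2 <- list(
  other_variables = list(
    whatever_var_1 = list(type = "numeric", originType = "slider",
                          title = "blah question")
  ),
  my_variables = list(
    age = list(type = "numeric", originType = "slider",
               title = "what is your age?"),
    med_field = list(type = "string", originType = "choice",
                     title = "what medical branch are you at?")
  )
)

wanted <- c("type", "originType", "title")
vars <- my_data_object_2[["my_variables"]]  # extract the sub-list by name

# One single-row data frame per variable, keeping only the wanted fields
rows <- lapply(vars, function(v) as.data.frame(v[wanted], stringsAsFactors = FALSE))
res2 <- do.call(rbind, rows)

# Move the row names (the variable names) into a first column
res2 <- cbind(var_name = rownames(res2), res2, stringsAsFactors = FALSE)
rownames(res2) <- NULL

res2
```

Extracting by name rather than by position ([[1]]) keeps the code working if the order of the top-level elements changes.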
Upvotes: 2