How to convert a list of lists containing NULL values to a data frame

Question

I have a list created from a JSON object returned by an e-commerce API; a minimal example is below. I'm trying to convert it to a df, but without much luck.

my_ls <- list(list(id = 406962L, user_id = 132786L, user_name = "Visitor Account", 
      organization_id = NULL, checkout_at = NULL, currency = "USD", 
      bulk_discount = NULL, coupon_codes = NULL, items = list(list(
        id = 505296L, quantity = 1L, unit_cost = 1295, used = 0L, 
        item_id = 6165L, item_type = "Path", item_name = "Product_2", 
        discount_type = "Percent", discount = NULL, coupon_id = NULL), 
        list(id = 505297L, quantity = 1L, unit_cost = 1295, used = 0L, 
             item_id = 6163L, item_type = "Path", item_name = "Product_1", 
             discount_type = "Percent", discount = NULL, coupon_id = NULL))), 
 list(id = 407178L, user_id = 132786L, user_name = "Visitor Account", 
      organization_id = "00001", checkout_at = NULL, currency = "USD", 
      bulk_discount = NULL, coupon_codes = NULL, items = list(
        list(id = 505744L, quantity = 1L, unit_cost = 1295, 
             used = 0L, item_id = 6163L, item_type = "Path", 
             item_name = "Product_1", 
             discount_type = "Percent", discount = NULL, coupon_id = NULL))))

I've tried some short solutions, such as the one in "Converting a list of lists to a dataframe in R: The Tidyverse-way",

... and combinations of flatten, map and map_dfr from purrr.

There are two problems I keep running into and when I solve one, I run into the other:

1. There are NULL values in the data for certain entries. If I try to convert the sub-lists to a tibble I get an error: Error: All columns in a tibble must be vectors. x Column `organization_id` is NULL
2. Under the items sublists there is a named element called id, and there is already a named element called id in the higher-level list. The former are product ids and the latter are order ids. I can't seem to rename either one reliably: with one method, converting to a df deletes the lower-level ids.
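
To illustrate the first problem, here is a minimal base-R sketch on a single flat record. The record `rec` is a trimmed, illustrative subset of the fields above, and the NA replacement is just one possible workaround, assuming a missing value is an acceptable stand-in for NULL.

```r
# One flat order record, trimmed from the example above (illustrative subset).
rec <- list(id = 406962L, organization_id = NULL, currency = "USD")

# NULL has length 0, so it cannot fill a cell in a one-row column.
field_lengths <- lengths(rec)

# Replace the zero-length fields with NA so that every field has length 1.
rec[field_lengths == 0] <- NA

# The record now converts to a one-row data frame with all three columns.
rec_df <- as.data.frame(rec, stringsAsFactors = FALSE)
```

This only handles one level; the nested items sublists need the same treatment applied recursively.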

Each entry in the items sublist is a cart item, so in the final df each cart item should be one row that also carries the columns of the order it belongs to. If an order has two cart items, the order-level values such as organization_id and user_name are repeated on both rows. I also want to keep the columns that contain NULL values: some of them, such as checkout_at, do have values in the larger data set.

Thanks.

tmfmnk · Accepted Answer

One option involving dplyr, tidyr and purrr could be:

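library(dplyr)
library(tidyr)
library(purrr)

# Replace NULL with NA at the order level, then again inside each cart item
map_depth(.x = my_ls, 2, ~ replace(.x, is.null(.x), NA), .ragged = TRUE) %>%
 bind_rows() %>%
 mutate(items = map_depth(items, 2, ~ replace(.x, is.null(.x), NA))) %>%
 rename(`original_id` = id) %>%   # order id; the item-level id keeps the name id
 unnest_wider(items)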

  original_id user_id user_name organization_id checkout_at currency bulk_discount
        <int>   <int> <chr>     <chr>           <lgl>       <chr>    <lgl>
1      406962  132786 Visitor … <NA>            NA          USD      NA
2      406962  132786 Visitor … <NA>            NA          USD      NA
3      407178  132786 Visitor … 00001           NA          USD      NA
# … with 11 more variables: coupon_codes <lgl>, id <int>, quantity <int>, unit_cost <dbl>,
#   used <int>, item_id <int>, item_type <chr>, item_name <chr>, discount_type <chr>,
#   discount <lgl>, coupon_id <lgl>

Or an option using rrapply, dplyr and tidyr:

library(rrapply)
library(dplyr)
library(tidyr)

rrapply(my_ls, f = function(x) if(is.null(x)) NA else x, how = "replace") %>%
 bind_rows() %>%
 rename(`original_id` = id) %>%
 unnest_wider(items)
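
For comparison, the same three steps (replace NULL with NA, rename the order-level id, repeat the order fields once per cart item) can be sketched in base R. This is not part of the answer above, only an illustration: `orders` is a trimmed sample in the shape of my_ls, and `null_to_na` and `flatten_order` are hypothetical helper names.

```r
# Trimmed sample in the same shape as my_ls (illustrative subset of fields).
orders <- list(
  list(id = 406962L, user_name = "Visitor Account", organization_id = NULL,
       items = list(
         list(id = 505296L, item_name = "Product_2", discount = NULL),
         list(id = 505297L, item_name = "Product_1", discount = NULL))),
  list(id = 407178L, user_name = "Visitor Account", organization_id = "00001",
       items = list(
         list(id = 505744L, item_name = "Product_1", discount = NULL))))

# Hypothetical helper: recursively replace every NULL with NA.
null_to_na <- function(x) {
  if (is.null(x)) return(NA)
  if (is.list(x)) return(lapply(x, null_to_na))
  x
}

# Hypothetical helper: build one data frame row per cart item of one order.
flatten_order <- function(order) {
  order <- null_to_na(order)
  # Order-level fields, with id renamed so it cannot clash with the item id.
  top <- order[names(order) != "items"]
  names(top)[names(top) == "id"] <- "original_id"
  # Repeat the order-level fields next to each cart item.
  rows <- lapply(order$items, function(item) {
    as.data.frame(c(top, item), stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

orders_df <- do.call(rbind, lapply(orders, flatten_order))
```

The sketch assumes every order has at least one cart item and that all orders share the same fields; orders with an empty items list are not handled here.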
