David Pepper

Reputation: 593

Turning lists of unequal length lists into data frames

I have a list of lists of unequal length, obtained from JSON, and I want to munge it into a single data frame or a series of data frames. As an example, say this is the result of calling fromJSON on my original JSON:

mylist <- list(
  list(
    volume = array(1:6, dim=c(3,2)),
    price = array(c(1,2,3,7,8,9), dim=c(3,2)),
    name = 'A'
  ),
  list(
    volume = array(1:10, dim=c(5,2)),
    price = array(c(1:5,12:16), dim=c(5,2)),
    name = 'B'
  ),
  list(
    volume = array(1:14, dim=c(7,2)),
    price = array(c(1:7,21:27), dim=c(7,2)),
    name = 'C'
  )
)

The price and volume arrays have different numbers of rows, and I'd like to process the data on the assumption that the n observations of a given variable are the last n entries in the series. So for each array, I essentially want to throw away the first column and bottom-align the data. One way to represent the price data would be as follows:

tribble(
  ~Day, ~PriceA, ~PriceB, ~PriceC,
  #---|--------|--------|---------
    1L,      NA,      NA,      21,
    2L,      NA,      NA,      22,
    3L,      NA,      12,      23,
    4L,      NA,      13,      24,
    5L,       7,      14,      25,
    6L,       8,      15,      26,
    7L,       9,      16,      27
)

If done this way, I'd need to create a separate table for volume. I'm open to other ways of representing the final data set, for instance using nested columns in a data frame.

Does anyone have an idea how to do this gracefully? I find it especially confusing to use purrr's map to operate on the second-level lists.

Upvotes: 2

Views: 755

Answers (1)

akrun

Reputation: 887541

Here is an option using the tidyverse:

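library(tidyverse)
# Regroup by field (volume, price, name), then handle each field in turn
out <- mylist %>%
  transpose() %>%
  map(~ if (all(lengths(.x) == 1)) {
    unlist(.x)  # scalar fields such as name become a plain vector
  } else {
    map(.x, as_tibble, .name_repair = ~ paste0("V", seq_along(.x))) %>%
      reduce(full_join, by = "V1") %>%  # align the matrices on their first column
      # move the NAs in each column to the top, i.e. bottom-align the data
      mutate(across(everything(), ~ .x[order(!is.na(.x))]))
  })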
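The bottom-alignment comes from indexing each column by `order(!is.na(...))`. As a small illustration on a made-up vector (`x` is hypothetical and not part of the question's data): `!is.na(x)` is FALSE for missing values, and `order()` sorts FALSE before TRUE while keeping ties in their original order, so the NAs should move to the top and the observed values keep their sequence.

```r
# Hypothetical column as it might look after the full join:
# observed values first, padding NAs last
x <- c(7, 8, 9, NA, NA, NA, NA)

# FALSE (missing) sorts before TRUE (observed); ties keep their original order
idx <- order(!is.na(x))

# Reindexing puts the NAs on top and leaves the observed values in sequence
aligned <- x[idx]
print(aligned)
```

Note that this only reorders within a column, so it relies on the observed values already being in the right order after the join.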

Now we can extract the list elements:

out$price %>%
      set_names(c("Day", paste0("Price", LETTERS[1:3])))
# A tibble: 7 x 4
#    Day PriceA PriceB PriceC
#  <dbl>  <dbl>  <int>  <int>
#1  1.00  NA        NA     21
#2  2.00  NA        NA     22
#3  3.00  NA        12     23
#4  4.00  NA        13     24
#5  5.00   7.00     14     25
#6  6.00   8.00     15     26
#7  7.00   9.00     16     27
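Since the question also needs a separate table for volume, here is a minimal base R sketch of the same idea that builds both tables. It is an alternative to the join-based approach, not a restatement of it: it assumes, as the question describes, that the first column of each array can simply be discarded and the second column padded on top with NAs. The helpers `bottom_align` and `wide_table` are hypothetical names introduced for illustration.

```r
# Sample data copied from the question
mylist <- list(
  list(volume = array(1:6, dim = c(3, 2)),
       price  = array(c(1, 2, 3, 7, 8, 9), dim = c(3, 2)),
       name   = "A"),
  list(volume = array(1:10, dim = c(5, 2)),
       price  = array(c(1:5, 12:16), dim = c(5, 2)),
       name   = "B"),
  list(volume = array(1:14, dim = c(7, 2)),
       price  = array(c(1:7, 21:27), dim = c(7, 2)),
       name   = "C")
)

# Hypothetical helper: keep the second column of matrix `m` and pad it
# on top with NAs so that the result has `n` entries
bottom_align <- function(m, n) {
  values <- m[, 2]
  c(rep(NA, n - length(values)), values)
}

# Hypothetical helper: build one wide table for a given field,
# with one bottom-aligned column per list element
wide_table <- function(lst, field, prefix) {
  n <- max(vapply(lst, function(el) nrow(el[[field]]), integer(1)))
  cols <- lapply(lst, function(el) bottom_align(el[[field]], n))
  names(cols) <- paste0(prefix, vapply(lst, function(el) el$name, character(1)))
  data.frame(Day = seq_len(n), cols)
}

price  <- wide_table(mylist, "price", "Price")
volume <- wide_table(mylist, "volume", "Volume")
```

Here `price` should match the table requested in the question, and `volume` follows the same layout with columns VolumeA, VolumeB and VolumeC.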

Upvotes: 2
