Long table to nested list to JSON using R and tidyverse

Question

I am trying to convert a long table (e.g. the example below) into a nested list and then into JSON, using R/tidyverse and one of the JSON packages. I suspect this can be done with split/purrr in some way, but I haven't been able to figure out exactly how.

Take the following example input:

library(tidyverse)
library(stringi)

df = data.frame(patient   = c(rep("A",4), rep("B",4)),
                sample    = rep(c("P","R"),4),
                file      = stri_rand_strings(8, 6, '[A-Z]'))

which looks, for example, like this:

  patient sample   file
1       A      P ZZEVYQ
2       A      R KIUXRU
3       A      P XRYBUE
4       A      R ZCHBKN
5       B      P WZYAPM
6       B      R EKDFYT
7       B      P CYEJCK
8       B      R XFAYXX

I would like to output something similar to this (note: manually typed).

[
  {
    "patient" : "A",
    "samples" : [
      {
        "sample" : "P",
        "files" : [
          {
            "file" : "ZZEVYQ"
          },
          {
            "file" : "XRYBUE"
          }
        ] 
      },
      {
        "sample" : "R",
        "files" : [
          {
            "file" : "KIUXRU"
          },
          {
            "file" : "ZCHBKN"
          }
        ]
      }
    ]
  },
  {
    "patient" : "B",
    "samples" : [
      {
        "sample" : "P",
        "files" : [
          {
            "file" : "WZYAPM"
          },
          {
            "file" : "CYEJCK"
          }
        ]
      },
      {
        "sample" : "R",
        "files" : [
          {
            "file" : "EKDFYT"
          },
          {
            "file" : "XFAYXX"
          }
        ]
      }
    ]
  }
]

Any suggestions on how to do this?

Many thanks!

MrFlick · Accepted Answer

For most JSON output, if you want nested levels, you need to nest your data first. Here's one way to achieve the nesting required to get the JSON you are after:

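# One row per (patient, sample): wrap each file as its own list(file = ...)
dfout <- df %>% group_by(patient, sample) %>% 
  summarize(files=list(map(file, ~list(file=.x)))) %>% 
  # One row per patient: pair each sample with its list of files
  summarize(samples=list(map2(sample, files, ~list(sample=.x, files=.y))))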

jsonlite::toJSON(dfout, auto_unbox = TRUE)
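To check the result against the target structure, here is a self-contained sketch of the same approach. It swaps the random file names for the fixed values from the example table so the output is reproducible. The `.groups = "drop_last"` argument (which assumes dplyr 1.0.0 or later) and `pretty = TRUE` are additions for illustration, not part of the answer above.

```r
library(dplyr)
library(purrr)
library(jsonlite)

# Fixed file names taken from the example table, instead of stri_rand_strings()
df <- data.frame(
  patient = c(rep("A", 4), rep("B", 4)),
  sample  = rep(c("P", "R"), 4),
  file    = c("ZZEVYQ", "KIUXRU", "XRYBUE", "ZCHBKN",
              "WZYAPM", "EKDFYT", "CYEJCK", "XFAYXX")
)

nested <- df %>%
  group_by(patient, sample) %>%
  # Inner level: one list(file = ...) per file, collected per patient/sample
  summarize(files = list(map(file, ~ list(file = .x))),
            .groups = "drop_last") %>%
  # Outer level: still grouped by patient, so this collapses to one row each
  summarize(samples = list(map2(sample, files,
                                ~ list(sample = .x, files = .y))))

json <- toJSON(nested, auto_unbox = TRUE, pretty = TRUE)
print(json)
```

Each `summarize()` call peels off one grouping level, so the number of calls matches the nesting depth of the JSON. `auto_unbox = TRUE` is what turns length-one vectors such as `"A"` into plain strings rather than one-element arrays.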
