Floris

Reputation: 677

Long table to nested list to JSON using R and tidyverse

I am trying to convert a long table (e.g. the example below) into a nested list and then into JSON, using R/tidyverse and one of the JSON packages. I suspect this can be done with split/purrr in some way, but I haven't been able to figure out exactly how.

Take the following example input:

library(tidyverse)
library(stringi)

df = data.frame(patient   = c(rep("A",4), rep("B",4)),
                sample    = rep(c("P","R"),4),
                file      = stri_rand_strings(8, 6, '[A-Z]'))

Which looks, for example, like this (the file names are random):

  patient sample   file
1       A      P ZZEVYQ
2       A      R KIUXRU
3       A      P XRYBUE
4       A      R ZCHBKN
5       B      P WZYAPM
6       B      R EKDFYT
7       B      P CYEJCK
8       B      R XFAYXX

I would like to output something similar to this (note: manually typed).

[
  {
    "patient" : "A",
    "samples" : [
      {
        "sample" : "P",
        "files" : [
          {
            "file" : "ZZEVYQ"
          },
          {
            "file" : "XRYBUE"
          }
        ]
      },
      {
        "sample" : "R",
        "files" : [
          {
            "file" : "KIUXRU"
          },
          {
            "file" : "ZCHBKN"
          }
        ]
      }
    ]
  },
  {
    "patient" : "B",
    "samples" : [
      {
        "sample" : "P",
        "files" : [
          {
            "file" : "WZYAPM"
          },
          {
            "file" : "CYEJCK"
          }
        ]
      },
      {
        "sample" : "R",
        "files" : [
          {
            "file" : "EKDFYT"
          },
          {
            "file" : "XFAYXX"
          }
        ]
      }
    ]
  }
]

Any suggestions on how to do this?

Many thanks!

Upvotes: 0

Views: 359

Answers (1)

MrFlick

Reputation: 206411

For most JSON output, if you want nested levels, you need to nest your data first. Here's one way to achieve the nesting required to get the JSON you are after:

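dfout <- df %>% group_by(patient, sample) %>% 
  # one list(file = ...) entry per row within each patient/sample group
  summarize(files=list(map(file, ~list(file=.x)))) %>% 
  # now grouped by patient only: one list(sample, files) entry per sample
  summarize(samples=list(map2(sample, files, ~list(sample=.x, files=.y))))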

jsonlite::toJSON(dfout, auto_unbox = TRUE)
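Since the question mentions split/purrr, here is a minimal sketch of that route as an alternative. It is not the approach used above: it builds a plain nested list with base `split()` and `purrr::imap()` and then hands it to jsonlite. The file names are hard-coded from the example printout so the result is reproducible, and the helper names (`by_patient`, `by_sample`, `nested`) are just illustrative.

```r
library(purrr)
library(jsonlite)

# Fixed copy of the example table (file names taken from the printout above)
df <- data.frame(
  patient = c(rep("A", 4), rep("B", 4)),
  sample  = rep(c("P", "R"), 4),
  file    = c("ZZEVYQ", "KIUXRU", "XRYBUE", "ZCHBKN",
              "WZYAPM", "EKDFYT", "CYEJCK", "XFAYXX"),
  stringsAsFactors = FALSE
)

# Split the rows by patient, then split each patient's files by sample
by_patient <- split(df, df$patient)

nested <- imap(by_patient, function(p_rows, p_name) {
  by_sample <- split(p_rows$file, p_rows$sample)
  list(
    patient = p_name,
    # unname() so jsonlite writes a JSON array rather than an object
    samples = unname(imap(by_sample, function(files, s_name) {
      list(sample = s_name, files = map(files, ~ list(file = .x)))
    }))
  )
})
nested <- unname(nested)

json <- toJSON(nested, auto_unbox = TRUE, pretty = TRUE)
json
```

The `unname()` calls matter here: jsonlite serialises a named list as a JSON object and an unnamed list as an array, so dropping the split names is what produces the `[ {...}, {...} ]` shape from the question.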

Upvotes: 2
