Kate FitzGerald
Kate FitzGerald

Reputation: 11

Convert JSONL file to CSV in R

I am very new to R and am trying to convert a JSONL file I have been given to a CSV for further analysis. I can find plenty of code for converting JSON to CSV but not JSONL.

Even if someone was able to assist with converting JSONL to JSON, I could probably go from there!

I have attempted a few codes in R but receive error messages. The most recent was:

jq -n 'FILENAME.jsonl' <in.jsonl >out.json

Error: unexpected string constant in "jq -n 'FILENAME.jsonl

If anyone could point me in the right direction I would really appreciate it.

Thanks!

K

Upvotes: 1

Views: 633

Answers (1)

JBGruber
JBGruber

Reputation: 12440

I'm using a simple example of a JSONL file. Ideally you would provide one with your question to make it easier for others to help:

example <- c(
  '{"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]}',
  '{"name": "Alexa", "wins": [["two pair", "4♠"], ["two pair", "9♠"]]}',
  '{"name": "May", "wins": "NA"}',
  '{"name": "Deloise", "wins": [["three of a kind", "5♣"]]}'
)

You can easily read this into R using the jsonlite package:

df <- jsonlite::stream_in(textConnection(example)) # you don't need textConnection if you work with a file
#>  Found 4 records... Imported 4 records. Simplifying...
tibble::as_tibble(df) # tibble used for nicer printing
#> # A tibble: 4 x 2
#>   name    wins             
#>   <chr>   <list>           
#> 1 Gilbert <chr[,2] [2 × 2]>
#> 2 Alexa   <chr[,2] [2 × 2]>
#> 3 May     <chr [1]>        
#> 4 Deloise <chr[,2] [1 × 2]>

One problem with this example (and most json files found in the wild) is that values are nested in list columns. This can't be easily exported to csv files. So I use the tidyverse (specifically tidyr in this case) to unnest all list columns:


library(tidyverse)
df %>% 
  unnest(cols = where(is.list))
#> # A tibble: 6 x 2
#>   name    wins[,1]        [,2] 
#>   <chr>   <chr>           <chr>
#> 1 Gilbert straight        7♣  
#> 2 Gilbert one pair        10♥ 
#> 3 Alexa   two pair        4♠  
#> 4 Alexa   two pair        9♠  
#> 5 May     NA              NA   
#> 6 Deloise three of a kind 5♣

After that, you can export the data to csv. My personal favourite is to use the rio package, which chooses some sensible defaults for many export formats:

rio::export(df, "out.csv")

Created on 2021-04-09 by the reprex package (v2.0.0)

Upvotes: 1

Related Questions