Reputation: 11
I am very new to R and am trying to convert a JSONL file I have been given to a CSV for further analysis. I can find plenty of code for converting JSON to CSV but not JSONL.
Even if someone was able to assist with converting JSONL to JSON, I could probably go from there!
I have attempted a few codes in R but receive error messages. The most recent was:
jq -n 'FILENAME.jsonl' <in.jsonl >out.json
Error: unexpected string constant in "jq -n 'FILENAME.jsonl
If anyone could point me in the right direction I would really appreciate it.
Thanks!
K
Upvotes: 1
Views: 633
Reputation: 12440
I'm using a simple example of a JSONL file. Ideally you would provide one with your question to make it easier for others to help:
example <- c(
'{"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]}',
'{"name": "Alexa", "wins": [["two pair", "4♠"], ["two pair", "9♠"]]}',
'{"name": "May", "wins": "NA"}',
'{"name": "Deloise", "wins": [["three of a kind", "5♣"]]}'
)
You can easily read this into R
using the jsonlite
package:
df <- jsonlite::stream_in(textConnection(example)) # you don't need textConnection if you work with a file
#> Found 4 records... Imported 4 records. Simplifying...
tibble::as_tibble(df) # tibble used for nicer printing
#> # A tibble: 4 x 2
#> name wins
#> <chr> <list>
#> 1 Gilbert <chr[,2] [2 × 2]>
#> 2 Alexa <chr[,2] [2 × 2]>
#> 3 May <chr [1]>
#> 4 Deloise <chr[,2] [1 × 2]>
One problem with this example (and most json files found in the wild) is that values are nested in list columns. This can't be easily exported to csv files. So I use the tidyverse
(specifically tidyr
in this case) to unnest
all list columns:
library(tidyverse)
df %>%
unnest(cols = where(is.list))
#> # A tibble: 6 x 2
#> name wins[,1] [,2]
#> <chr> <chr> <chr>
#> 1 Gilbert straight 7♣
#> 2 Gilbert one pair 10♥
#> 3 Alexa two pair 4♠
#> 4 Alexa two pair 9♠
#> 5 May NA NA
#> 6 Deloise three of a kind 5♣
After that, you can export the data to csv. My personal favourite is to use the rio
package, which chooses some sensible defaults for many export formats:
rio::export(df, "out.csv")
Created on 2021-04-09 by the reprex package (v2.0.0)
Upvotes: 1