user1471980
user1471980

Reputation: 10626

convert json into data frame in R?

I need to convert json file into a data frame. Each line in the json file, may have different number of entries. For example

{"timestamp":"2016-12-13T04:04:06.394-0500",
"test101":"2016-12-13T04:04:06.382-0500",
"error":"false","from":"xon","event":"DAT","BT":"work","cd":"E","id":"IBM",
"key":"20161213040330617511","begin_work":"2016-12-13T04:04:06.383-0500"","@version:"1","@timestamp":"2016-12-14T20:04:29.502Z"}

{"timestamp":"2016-12-13T04:04:05.318-0500","test101":"2016-12-13T04:03:46.074-0500","error":"false","from":"de","event":"cp","BT":"work","cd":"dsh","id":"appl",
"key":"142314089",
"begin_work":"2016-12-13T04:03:46.074-0500",
"refresh":"2016-12-13T04:03:45.920-0500",
"co_refresh":"2016-12-13T04:03:45.769-0500",
"test104":"2016-12-13T04:03:45.832-0500",
"test104":"2016-12-13T04:03:45.832-0500",
"test105":"2016-12-13T04:03:46.031-0500",
"test7":"2016-12-13T04:03:46.032-0500",
"t-test9":"2016-12-13T04:03:45.704-0500",
"test10_StartDateTimeStamp":"2016-12-13T04:03:45.704-0500",
"stop":"2016-12-13T04:03:50.772-0500",
"stop_again":"2016-12-13T04:03:46.091-0500",
"@version":"1","@timestamp":"2016-12-14T20:04:29.503Z"}
{"timestamp":"2016-12-13T04:04:07.113-0500","test101":"2016-12-13T04:04:07.068-0500","error":"false","from":"xon","event":"DAT","BT":"work","cd":"E","id":"3YPS","key":"20161213040318326935","begin_work":"2016-12-13T04:04:07.069-0500","@version":"1","@timestamp":"2016-12-14T20:04:29.505Z"}

I need to start parsing the file form a keyword called "key" until a keyword called @version.

Data frame need to look something like this:

key group time
20161213040330617511 begin_work  2016-12-13T04:04:06.383-0500
142314089 begin_work 2016-12-13T04:03:46.074-0500
142314089 refresh 2016-12-13T04:03:45.920-0500
142314089 co_refresh 2016-12-13T04:03:45.769-0500
142314089 test104 2016-12-13T04:03:45.832-0500

etc

I have tried something like this:

library(jsonlite)
library(data.table) 

setwd("C:/file/")

filenames <- list.files("system", pattern="*json*", full.names=TRUE)

dflist <- lapply(filenames, function(i) {
  jsonlite::fromJSON(
    paste0("[",
           paste0(readLines(i),collapse=","),
           "]"),flatten=TRUE
  )
})

d<-rbindlist(dflist, use.names=TRUE, fill=TRUE)

I need to put key value pairs into a 3 column data frame

I am getting field names after key as columns and NA as the values. Any ideas how could I convert json to df frame in R?

Upvotes: 0

Views: 1651

Answers (1)

denrou
denrou

Reputation: 640

This is something you can try, a combination of dplyr and tidyr :

library(dplyr)
library(tidyr)
library(jsonlite)

data <- jsonlite::fromJSON("data.json")
lapply(data, function(d) as_data_frame(d)) %>% 
  bind_rows() %>% 
  gather(groups, val, -timestamp, -key) %>% 
  select(key, group, timestamp)

BTW I had to change your json example a little bit. Here's the json file I use:

{"x":{"timestamp":"2016-12-13T04:04:06.394-0500",
"test101":"2016-12-13T04:04:06.382-0500",
"error":"false","from":"xon","event":"DAT","BT":"work","cd":"E","id":"IBM",
"key":"20161213040330617511","begin_work":"2016-12-13T04:04:06.383-0500","@version":"1","@timestamp":"2016-12-14T20:04:29.502Z"},
"y":{"timestamp":"2016-12-13T04:04:05.318-0500","test101":"2016-12-13T04:03:46.074-0500","error":"false","from":"de","event":"cp","BT":"work","cd":"dsh","id":"appl",
"key":"142314089",
"begin_work":"2016-12-13T04:03:46.074-0500",
"refresh":"2016-12-13T04:03:45.920-0500",
"co_refresh":"2016-12-13T04:03:45.769-0500",
"test104":"2016-12-13T04:03:45.832-0500",
"test105":"2016-12-13T04:03:46.031-0500",
"test7":"2016-12-13T04:03:46.032-0500",
"t-test9":"2016-12-13T04:03:45.704-0500",
"test10_StartDateTimeStamp":"2016-12-13T04:03:45.704-0500",
"stop":"2016-12-13T04:03:50.772-0500",
"stop_again":"2016-12-13T04:03:46.091-0500",
"@version":"1","@timestamp":"2016-12-14T20:04:29.503Z"},
"z":{"timestamp":"2016-12-13T04:04:07.113-0500","test101":"2016-12-13T04:04:07.068-0500","error":"false","from":"xon","event":"DAT","BT":"work","cd":"E","id":"3YPS","key":"20161213040318326935","begin_work":"2016-12-13T04:04:07.069-0500","@version":"1","@timestamp":"2016-12-14T20:04:29.505Z"}}

Upvotes: 1

Related Questions