Reputation: 77
I am trying to convert a JSON response file using the JSONlite package, but am running into a few issues in the way the data is being presented in a data frame.
Using this code:
library(dplyr)
library(jsonlite)
json_data <- fromJSON("File Path")
df <-data.frame(unlist(json_data))
JSON Response looks like:
{
"totalPages": 1,
"firstPage": true,
"lastPage": true,
"numberOfElements": 1757,
"number": 0,
"totalElements": 1757,
"columns": {
"dimension": {
"id": "variables/evar4",
"type": "string"
},
"columnIds": [
"1",
"2",
"3",
"4"
]
},
"rows": [
{
"itemId": "0",
"value": "Unspecified",
"data": [
220,
1.0002442201269182,
0.29499089253187616,
2.890909090909091
]
},
{
"itemId": "118293442",
"value": "PNIwTjWWjphkqfu",
"data": [
5,
1.0002442201269182,
57.879999999999995,
30.8
]
},
{
"itemId": "1810135314",
"value": "PNIFBOIKLplumdb",
"data": [
3,
1.0002442201269182,
1243.0277777777778,
545.3333333333334
]
}
]
}
I receive a table df that has columns and rows all in one list. Is there a way to have the column ids going across the top, with the corresponding rowIDs matched to the proper column?
So that the data frame looks like:
1 2 3 4
Unspecified 220 1.00 0.294 2.89
PNIwTjWWjphkqfu 5 1.00 57.87 30.8
PNIFBOIKLplumdb 3 1.00 1243.0 545.33
Any help on this would be greatly appreciated
Upvotes: 0
Views: 3424
Reputation: 50678
Please note that your JSON file as posted is incomplete and missing closing brackets.
One option to reproduce your expected output is to unnest
data in the rows
element of the fromJSON
output.
library(jsonlite)
json_data <- fromJSON("sample.json")
library(tidyverse)
json_data$rows %>%
select(-itemId) %>%
unnest() %>%
group_by(value) %>%
mutate(n = 1:n()) %>%
spread(n, data) %>%
ungroup()
## A tibble: 3 x 5
# value `1` `2` `3` `4`
# <chr> <dbl> <dbl> <dbl> <dbl>
#1 PNIFBOIKLplumdb 3 1.00 1243. 545.
#2 PNIwTjWWjphkqfu 5 1.00 57.9 30.8
#3 Unspecified 220 1.00 0.295 2.89
Explanation: I recommend going through the code line-by-line to understand what each command does. In short: json_data$rows
is a data.frame
; we select all columns except itemId
, and unnest
the data in list
column data
. This results in your data being in long format, so in order to reproduce your expected we need to reshape the data from long to wide. To do so, we group entries by value
, add a counter and use spread
to give the final output.
"sample.json"
{
"totalPages": 1,
"firstPage": true,
"lastPage": true,
"numberOfElements": 1757,
"number": 0,
"totalElements": 1757,
"columns": {
"dimension": {
"id": "variables/evar4",
"type": "string"
},
"columnIds": [
"1",
"2",
"3",
"4"
]
},
"rows": [
{
"itemId": "0",
"value": "Unspecified",
"data": [
220,
1.0002442201269182,
0.29499089253187616,
2.890909090909091
]
},
{
"itemId": "118293442",
"value": "PNIwTjWWjphkqfu",
"data": [
5,
1.0002442201269182,
57.879999999999995,
30.8
]
},
{
"itemId": "1810135314",
"value": "PNIFBOIKLplumdb",
"data": [
3,
1.0002442201269182,
1243.0277777777778,
545.3333333333334
]
}
]
}
Upvotes: 1