Reputation: 49
I have a quite specific question about how to convert JSON data in R. I am dealing with data from a reaction time test. The data contains some basic information on id, gender, and age and is in CSV format. However, the data for the reaction task is delivered as a JSON array with the following structure: ["stimulus 1", "stimulus 2", answer chosen, reaction time].
Here is an example of what the data looks like, just to give you a basic idea (the JSON array is in fact much longer in the original data):
id gender age reaction_task
HU3 male 34 [["prime1", "target2", 1, 1560], ["prime7", "target6", 2, 1302], ["prime4", "target5", 2, 996]]
I am quite a novice in R and looking for a method to convert this JSON-array into multiple R columns - for instance like this:
trial1_stimulus1 trial1_stimulus2 trial1_answer trial1_time trial2_stimulus1 trial2_stimulus2 etc
prime1 target2 1 1560 prime7 target6
I found out how to separate the values from one another using the following command:
df <- cbind(df, read.table(text = as.character(df$reaction_task), sep = ",", fill=TRUE) )
It worked, but turned out to be quite laborious, as I still had to remove the [] from the data manually. So I was wondering whether there is a smoother way to deal with this task?
I was trying the following code as well, but got an error message:
purrr::map_dfr(sosci$A101oRAW, jsonlite::fromJSON)
Error: parse error: premature EOF
(right here) ------^
Thanks for your help!
Edit: Thanks a lot to Maydin for the answer provided! It works well for the example data, but when the data frame contains more than one person, I get almost the same error as before:
id <- c("HU3", "AB0", "IO9")
gender <- c("male", "female", "male")
age <-c(34, 87, 23)
task <- c("[[\"prime1\", \"target2\", 2, 1529], [\"prime7\", \"target6\", 2, 829], [\"prime4\", \"target5\", 1, 1872]]", "[[\"prime1\", \"target2\", 1, 1560], [\"prime7\", \"target6\", 2, 1302], [\"prime4\", \"target5\", 2, 996]]","[[\"prime1\", \"target2\", 1, 679], [\"prime7\", \"target6\", 1, 2090], [\"prime4\", \"target5\", 1, 528]]")
df <- data.frame(id, gender, age, task)
library(jsonlite)
library(dplyr)
df2 <- data.frame(df[,1:3],fromJSON(as.character(df[,"task"])))
parse error: trailing garbage
rime4", "target5", 1, 1872]] [["prime1", "target2", 1, 1560]
(right here) ------^
Upvotes: 1
Views: 1047
Reputation: 3755
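library(jsonlite)
# Parse each row's JSON string separately; id, gender and age are repeated for every trial
df2 <- lapply(1:nrow(df), function(x) {
  data.frame(df[x, 1:3], fromJSON(as.character(df[x, "task"])),
             row.names = NULL) })
# Stack the per-participant data frames into one long data frame
df2 <- do.call(rbind, df2)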
df2
id gender age X1 X2 X3 X4
1 HU3 male 34 prime1 target2 2 1529
2 HU3 male 34 prime7 target6 2 829
3 HU3 male 34 prime4 target5 1 1872
4 AB0 female 87 prime1 target2 1 1560
5 AB0 female 87 prime7 target6 2 1302
6 AB0 female 87 prime4 target5 2 996
7 IO9 male 23 prime1 target2 1 679
8 IO9 male 23 prime7 target6 1 2090
9 IO9 male 23 prime4 target5 1 528
I think the output above (one row per trial) is the nicer format, but if you would like to spread it into columns, you can use pivot_wider() from tidyr:
library(tidyr)
pivot_wider(data = df2,
id_cols = c("id","gender","age"),
names_from = c("X1","X2","X3","X4"),
values_from =c("X1","X2","X3","X4")) %>% as.data.frame()
You can rename the columns afterwards if you want, for example with colnames().
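If you want column names exactly like the ones in the question (trial1_stimulus1, trial1_stimulus2, ...), here is a minimal base R sketch of one possible way to get there. The field names in `fields` and the helper names `wide_rows` and `wide` are my own choices, not anything from your data, and the sketch assumes every participant has the same number of trials (otherwise the final rbind would fail).

```r
library(jsonlite)

# Three-participant example taken from the question's edit
df <- data.frame(
  id = c("HU3", "AB0", "IO9"),
  gender = c("male", "female", "male"),
  age = c(34, 87, 23),
  task = c(
    '[["prime1", "target2", 2, 1529], ["prime7", "target6", 2, 829], ["prime4", "target5", 1, 1872]]',
    '[["prime1", "target2", 1, 1560], ["prime7", "target6", 2, 1302], ["prime4", "target5", 2, 996]]',
    '[["prime1", "target2", 1, 679], ["prime7", "target6", 1, 2090], ["prime4", "target5", 1, 528]]'
  ),
  stringsAsFactors = FALSE
)

# Assumed labels for the four entries of each trial
fields <- c("stimulus1", "stimulus2", "answer", "time")

wide_rows <- lapply(df$task, function(txt) {
  trials <- fromJSON(txt)        # character matrix, one row per trial
  flat <- as.vector(t(trials))   # trial 1 fields, then trial 2 fields, ...
  names(flat) <- paste0(
    "trial", rep(seq_len(nrow(trials)), each = ncol(trials)), "_", fields
  )
  as.data.frame(as.list(flat), stringsAsFactors = FALSE)
})

# One row per participant, with the trial columns appended
wide <- cbind(df[, c("id", "gender", "age")], do.call(rbind, wide_rows))

# Everything parsed from the JSON is character at this point;
# let R guess numeric columns for answers and reaction times
wide <- type.convert(wide, as.is = TRUE)
```

With the full-length JSON arrays this gives a very wide data frame, so the long format is probably still easier to analyse.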
Data:
df <- structure(list(id = structure(1L, .Label = "HU3", class = "factor"),
gender = structure(1L, .Label = "male", class = "factor"),
age = structure(1L, .Label = "34", class = "factor"), reaction_task = structure(1L, .Label = "[[\"prime1\", \"target2\", 1, 1560], [\"prime7\", \"target6\", 2, 1302], [\"prime4\", \"target5\", 2, 996]]", class = "factor")), class = "data.frame", row.names = c(NA,
-1L))
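A note on why the code goes through the rows one at a time: as far as I can tell from the "trailing garbage" message in the question, fromJSON() treats a character vector with several elements as one single JSON document, so the second participant's array ends up glued to the first one. The toy strings below are made up for illustration; parsing each element separately avoids the problem.

```r
library(jsonlite)

# Two small, made-up JSON arrays standing in for two participants
x <- c("[[1, 2]]", "[[3, 4]]")

# Passing the whole vector at once is expected to fail;
# the error message is captured here instead of stopping the script
res <- tryCatch(fromJSON(x), error = function(e) conditionMessage(e))

# Parsing element by element gives one matrix per participant
parsed <- lapply(x, fromJSON)
```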
Upvotes: 3