Reputation: 141
I have as input a file named in.json. You can find the content of this file here.
Following this answer, I am trying to convert the JSON to CSV with this code:
# rjson alone is enough: loading RJSONIO as well only masks fromJSON
library(rjson)

filename2 <- "C:/Users/Desktop/in.json"
json_data <- fromJSON(file = filename2)
# replace NULLs with NA, then flatten each top-level element
json_data <- lapply(json_data, function(x) {
  x[sapply(x, is.null)] <- NA
  unlist(x)
})
json <- do.call("rbind", json_data)
df <- json
write.csv(df, file = "C:/Users/Desktop/final.csv", row.names = FALSE)
However, when I type nrow(df) I see only 2 rows, whereas I expect one row per project id, so there should be many more rows.
Upvotes: 0
Views: 863
Reputation: 4534
The JSON you provide as an example does indeed contain only two objects in its top-level array. A call to str shows the structure faithfully:
> str(json_data,max.level=2)
List of 2
$ :List of 3
..$ projects :List of 1
..$ total_hits: num 12596
..$ seed : chr "776766"
$ :List of 3
..$ projects :List of 16
..$ total_hits: num 12596
..$ seed : chr "776766"
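To illustrate why the original approach ends up with two rows, here is a minimal sketch on a made-up list (the `toy` object and its contents are assumptions, not the real in.json). Flattening each top-level element and binding the results gives one row per top-level element, regardless of how many projects each one holds:

```r
# Made-up stand-in for json_data: two top-level objects,
# holding one and two projects respectively
toy <- list(
  list(projects = list(list(id = 1)), total_hits = 3),
  list(projects = list(list(id = 2), list(id = 3)), total_hits = 3)
)

# Same idea as the question's code: flatten each top-level element
flattened <- lapply(toy, unlist)
lengths(flattened)  # the two vectors have different lengths

# rbind recycles the shorter vector (with a warning) and
# still produces one row per top-level element
combined <- suppressWarnings(do.call("rbind", flattened))
nrow(combined)  # 2
```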
Assuming that you want one row per project id, and that you don't mind losing "total_hits" and "seed", you simply need to unlist the first two levels of the JSON:
unlisted <- unlist(unlist(json_data,recursive=FALSE),recursive=FALSE)
And then select the items whose names start with "projects":
projects <- unlisted[grep("^projects", names(unlisted))]
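As a small illustration of these two steps, the sketch below applies them to a made-up list shaped like the str output above (the `toy` data and the names `flat` and `toy_projects` are hypothetical). The point is that unlist with recursive = FALSE strips only one level at a time, so the individual projects stay intact as lists:

```r
# Made-up stand-in for json_data (field values are invented)
toy <- list(
  list(projects = list(list(id = 1, title = "a")),
       total_hits = 3, seed = "776766"),
  list(projects = list(list(id = 2, title = "b"),
                       list(id = 3)),
       total_hits = 3, seed = "776766")
)

# Strip the two outer levels without flattening the projects themselves
flat <- unlist(unlist(toy, recursive = FALSE), recursive = FALSE)
names(flat)  # project entries get names beginning with "projects"

# Keep only the project entries, dropping total_hits and seed
toy_projects <- flat[grep("^projects", names(flat))]
length(toy_projects)  # 3, one element per project
```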
You can then flatten each project into a named vector using:
data <- lapply(projects,unlist)
Rbinding is trickier: the projects do not all have exactly the same fields filled in, so you need to rely on the names. The following is one of many possible solutions, and probably not the optimal one:
# list all the names in all projects
allNames <- unique(unlist(lapply(data, names)))
# have a model row
modelRow <- rep(NA, length(allNames))
names(modelRow) <- allNames
# the function to change your list into a row following modelRow structure
rowSettingFn <- function(project) {
  row <- modelRow
  for (iItem in seq_along(project)) {
    row[names(project)[iItem]] <- project[[iItem]]
  }
  return(row)
}
# change your data into a matrix with one row per project
# (sapply returns one column per project, hence the transpose)
dataMat <- t(sapply(data, rowSettingFn))
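As a possible alternative to the loop, here is a minimal sketch that relies on the fact that indexing a named vector by a name it lacks yields NA. The `data` list below is made up for illustration, and `rows`, `df` and `out` are hypothetical names; the temporary file stands in for the real output path:

```r
# Made-up flattened projects with different sets of fields
data <- list(
  projects  = c(id = "1", title = "a"),
  projects1 = c(id = "2", title = "b", country = "FR"),
  projects2 = c(id = "3")
)

allNames <- unique(unlist(lapply(data, names)))

# Pad every project to the full set of columns; missing fields become NA
rows <- lapply(data, function(project) {
  row <- project[allNames]
  names(row) <- allNames
  row
})

# One row per project, one column per field
df <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
nrow(df)  # 3

# Write to a temporary file here; replace with the real path as needed
out <- tempfile(fileext = ".csv")
write.csv(df, file = out, row.names = FALSE)
```

Because everything passes through unlist, all columns end up as character; convert individual columns afterwards (for example with as.numeric) if needed.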
Upvotes: 2