Reputation: 135
I'm trying to import data from a JSON file into R to experiment with natural language processing. The data was parsed and extracted from a blog written in Markdown. The problem is that R imports it as a nested list rather than a table, and I can't figure out how to get it into a data frame. Is the problem with my JSON file or with my import process?
Sample Data:
{
"2017-11-17-blog-post-01": {
"title": "Blog Post 01",
"layout": "post",
"categories": [
"Category1",
"Category2"
],
"comments": true,
"published": true,
"permalink": "/blog-post-01.html",
"basename": "2017-11-17-blog-post-01"
},
"2017-11-30-blog-post-02": {
"title": "Blog Post 2",
"layout": "post",
"categories": [
"Category2",
"Category3"
],
"comments": true,
"published": true,
"permalink": "/2017-11-30-blog-post-02.html",
"basename": "2017-11-30-blog-post-02"
}
}
Command:
library(jsonlite)
import <- fromJSON("test-import.json", flatten=TRUE)
Results:
$`2017-11-17-blog-post-01`
$`2017-11-17-blog-post-01`$title
[1] "Blog Post 01"
$`2017-11-17-blog-post-01`$layout
[1] "post"
$`2017-11-17-blog-post-01`$categories
[1] "Category1" "Category2"
$`2017-11-17-blog-post-01`$comments
[1] TRUE
$`2017-11-17-blog-post-01`$published
[1] TRUE
$`2017-11-17-blog-post-01`$permalink
[1] "/blog-post-01.html"
$`2017-11-17-blog-post-01`$basename
[1] "2017-11-17-blog-post-01"
$`2017-11-30-blog-post-02`
$`2017-11-30-blog-post-02`$title
[1] "Blog Post 2"
$`2017-11-30-blog-post-02`$layout
[1] "post"
$`2017-11-30-blog-post-02`$categories
[1] "Category2" "Category3"
$`2017-11-30-blog-post-02`$comments
[1] TRUE
$`2017-11-30-blog-post-02`$published
[1] TRUE
$`2017-11-30-blog-post-02`$permalink
[1] "/2017-11-30-blog-post-02.html"
$`2017-11-30-blog-post-02`$basename
[1] "2017-11-30-blog-post-02"
Upvotes: 0
Views: 215
Reputation: 78832
library(purrr)
Your data:
jsonlite::fromJSON('{
"2017-11-17-blog-post-01": {
"title": "Blog Post 01",
"layout": "post",
"categories": [
"Category1",
"Category2"
],
"comments": true,
"published": true,
"permalink": "/blog-post-01.html",
"basename": "2017-11-17-blog-post-01"
},
"2017-11-30-blog-post-02": {
"title": "Blog Post 2",
"layout": "post",
"categories": [
"Category2",
"Category3"
],
"comments": true,
"published": true,
"permalink": "/2017-11-30-blog-post-02.html",
"basename": "2017-11-30-blog-post-02"
}
}', flatten=TRUE) -> jsdat
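flatten=TRUE works much of the time, but it only flattens nested data frames. Here the top level of your JSON is an object keyed by post name rather than an array of records, so fromJSON() hands back a named list instead of a data frame. The multi-valued categories field also keeps each post from collapsing into a single row, so we can give it a hand by wrapping it in a list: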
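# Wrap each post's categories vector in a list so it fills a single cell,
# then row-bind the posts, storing each post's name in an "id" column.
# Note: map_df() needs the dplyr package to be installed.
map_df(jsdat, ~{
  .x$categories <- list(.x$categories)
  .x
}, .id = "id")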
## # A tibble: 2 x 8
## id title layout categories comments published permalink basename
## <chr> <chr> <chr> <list> <lgl> <lgl> <chr> <chr>
## 1 2017-11-17-blog-post-01 Blog Post 01 post <chr [2]> TRUE TRUE /blog-post-01.html 2017-11-17-blog-post-01
## 2 2017-11-30-blog-post-02 Blog Post 2 post <chr [2]> TRUE TRUE /2017-11-30-blog-post-02.html 2017-11-30-blog-post-02
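If you would rather not depend on purrr and dplyr, roughly the same result can be built with base R. This is only a sketch under a few assumptions: the JSON string below is a trimmed stand-in for your file (made-up post names, only three of the fields), and the names json_txt, posts, rows, and df are placeholders of mine. It builds one row per post from the scalar fields, then attaches categories as a list column.

```r
library(jsonlite)

# Trimmed stand-in for the real file (hypothetical post names, fewer fields)
json_txt <- '{
  "post-a": {"title": "A", "categories": ["x", "y"], "published": true},
  "post-b": {"title": "B", "categories": ["y", "z"], "published": true}
}'

# A top-level JSON object comes back as a named list, one element per post
posts <- fromJSON(json_txt)

# Build a one-row data frame per post from the scalar fields only
rows <- lapply(names(posts), function(id) {
  p <- posts[[id]]
  data.frame(
    id = id,
    title = p$title,
    published = p$published,
    stringsAsFactors = FALSE
  )
})

# Stack the rows, then add categories as a list column (one vector per row)
df <- do.call(rbind, rows)
df$categories <- unname(lapply(posts, `[[`, "categories"))

str(df)
```

The scalar fields have to be listed by hand here, which is the price of skipping map_df(); for a file with many fields the purrr version above is less typing.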
Upvotes: 1