TSpinde

Reputation: 35

Reading multiple JSON objects from one txt file into R

I am very new to JSON files. I scraped a txt file containing several million JSON objects such as:

{
    "created_at":"Mon Oct 14 21:04:25 +0000 2013",
    "default_profile":true,
    "default_profile_image":true,
    "description":"...",
    "followers_count":5,
    "friends_count":560,
    "geo_enabled":true,
    "id":1961287134,
    "lang":"de",
    "name":"Peter Schmitz",
    "profile_background_color":"C0DEED",
    "profile_background_image_url":"http://abs.twimg.com/images/themes", 
    "utc_offset":-28800,
    ...
}
{
    "created_at":"Fri Oct 17 20:04:25 +0000 2015",
    ...
}

I want to extract the columns into a data frame in R:

Variable          Value
created_at          X     
default_profile     Y     

 …

In general, I want something similar to what is done in Python here (multiple Json objects in one file extract by python). If anyone has an idea or a suggestion, help would be much appreciated! Thank you!

Upvotes: 2

Views: 1239

Answers (1)

Florian

Reputation: 25395

Here is an example of how you could approach it with two objects. I assume you were able to read the JSON from a file; otherwise see here.
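If the file has not been read into R yet, the following is a minimal sketch of one way to get its contents into a single string. The path is hypothetical (a temporary file stands in for the scraped txt file), and reading everything at once assumes the file fits in memory, which may not hold for several million objects.

```r
# Hypothetical input: a temporary file standing in for the scraped txt file.
path <- tempfile(fileext = ".txt")
writeLines(c('{', '    "id":1,', '    "lang":"de"', '}',
             '{', '    "id":2,', '    "lang":"en"', '}'), path)

# Read all lines and join them back into one string.
lines <- readLines(path, warn = FALSE)
# A trailing newline is appended so the string has the same shape as the
# literal myjson used in the example (which also ends with a line break).
myjson <- paste0(paste(lines, collapse = "\n"), "\n")
```

The resulting myjson is a length-one character vector holding all objects back to back.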

myjson = '{"created_at": "Mon Oct 14 21:04:25 +0000 2013", "default_profile": true, 
  "default_profile_image": true, "description": "...", "followers_count": 
    5, "friends_count": 560, "geo_enabled": true, "id": 1961287134, "lang":  
    "de", "name": "Peter Schmitz", "profile_background_color": "C0DEED",  
  "profile_background_image_url": "http://abs.twimg.com/images/themes", "utc_offset": -28800}
{"created_at": "Mon Oct 15 21:04:25 +0000 2013", "default_profile": true, 
  "default_profile_image": true, "description": "...", "followers_count": 
    5, "friends_count": 560, "geo_enabled": true, "id": 1961287134, "lang":  
    "de", "name": "Peter Schmitz", "profile_background_color": "C0DEED",  
  "profile_background_image_url": "http://abs.twimg.com/images/themes", "utc_offset": -28800}
'

library("rjson")

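# Split the text into one string per JSON object by inserting a marker after every
# closing brace and splitting on it. The marker '!x!x!' is arbitrary. There may be
# better ways of keeping the brackets while splitting. Note that this only works for
# flat objects: nested objects or string values containing '}' would be split too.
pieces <- strsplit(gsub('\\}', '}!x!x!', myjson), '!x!x!')[[1]]
# Drop whitespace-only leftovers such as the line break after the last object
my_json_objects <- pieces[nzchar(trimws(pieces))]
# Parse each string as a JSON object
json_data <- lapply(my_json_objects, fromJSON)
# Transform each parsed object into a data frame (unlist() coerces all values to character)
json_data <- lapply(json_data, function(x) data.frame(val = unlist(x)))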

Output:

[[1]]
                                                            val
created_at                       Mon Oct 14 21:04:25 +0000 2013
default_profile                                            TRUE
default_profile_image                                      TRUE
description                                                 ...
followers_count                                               5
friends_count                                               560
geo_enabled                                                TRUE
id                                                   1961287134
lang                                                         de
name                                              Peter Schmitz
profile_background_color                                 C0DEED
profile_background_image_url http://abs.twimg.com/images/themes
utc_offset                                               -28800

[[2]]
                                                            val
created_at                       Mon Oct 15 21:04:25 +0000 2013
default_profile                                            TRUE
default_profile_image                                      TRUE
description                                                 ...
followers_count                                               5
friends_count                                               560
geo_enabled                                                TRUE
id                                                   1961287134
lang                                                         de
name                                              Peter Schmitz
profile_background_color                                 C0DEED
profile_background_image_url http://abs.twimg.com/images/themes
utc_offset                                               -28800
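Splitting on every '}' only works while the objects are flat, as in the example. If the real data contains nested objects or string values with a '}' in them, one alternative is to track brace depth and cut only at top-level closing braces. The following is a rough sketch in base R; split_json_objects is a hypothetical helper name, not part of rjson, and walking the text character by character will be slow for a file with millions of objects.

```r
# Hypothetical helper: split concatenated JSON objects at top-level braces.
split_json_objects <- function(txt) {
  chars <- strsplit(txt, "", fixed = TRUE)[[1]]
  objects <- character(0)
  depth <- 0L          # current brace nesting level
  in_string <- FALSE   # TRUE while inside a JSON string
  escaped <- FALSE     # TRUE right after a backslash inside a string
  start <- NA_integer_ # index where the current top-level object began
  for (i in seq_along(chars)) {
    ch <- chars[i]
    if (in_string) {
      if (escaped) {
        escaped <- FALSE
      } else if (ch == "\\") {
        escaped <- TRUE
      } else if (ch == "\"") {
        in_string <- FALSE
      }
    } else if (ch == "\"") {
      in_string <- TRUE
    } else if (ch == "{") {
      if (depth == 0L) start <- i
      depth <- depth + 1L
    } else if (ch == "}") {
      depth <- depth - 1L
      if (depth == 0L) {
        # A top-level object just closed: keep it, braces included.
        objects <- c(objects, paste(chars[start:i], collapse = ""))
      }
    }
  }
  objects
}

# Made-up input with a nested object and a '}' inside a string value.
example <- '{"name": "a}b", "user": {"id": 1}}
{"name": "c", "user": {"id": 2}}
'
pieces <- split_json_objects(example)
```

Each element of pieces can then be passed to fromJSON in the same way as my_json_objects above.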

Hope this helps!
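If a single data frame with one row per object and one column per field is wanted instead of a list of data frames, a possible sketch is shown below. It assumes every object has the same flat set of fields and no null values; the two parsed objects are written out by hand as named lists so that the snippet runs on its own, and users is just an illustrative name.

```r
# Two parsed objects written by hand as named lists; in practice these would
# come from lapply(my_json_objects, fromJSON).
parsed <- list(
  list(created_at = "Mon Oct 14 21:04:25 +0000 2013", default_profile = TRUE,
       followers_count = 5, lang = "de"),
  list(created_at = "Mon Oct 15 21:04:25 +0000 2013", default_profile = TRUE,
       followers_count = 5, lang = "de")
)

# Build one single-row data frame per object, then stack the rows.
rows <- lapply(parsed, function(x) as.data.frame(x, stringsAsFactors = FALSE))
users <- do.call(rbind, rows)
```

Unlike unlist(), this keeps the original types per column, so default_profile stays logical and followers_count stays numeric.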

Upvotes: 2
