Reputation: 21
I'm trying to get a data frame but keep running into various error messages depending on the arguments I specify in read.json after I specify my file.
I've run through many of the arguments in the pandas.read_json documentation, but haven't been able to identify a solution.
import pandas
json_file = "https://gis.fema.gov/arcgis/rest/services/NSS/OpenShelters/MapServer/0/query?where=1%3D1&outFields=*&returnGeometry=false&outSR=4326&f=json"
pandas.read_json(json_file)
I'm trying to get a data frame but keep running into various error messages depending on the arguments I specify in read.json after I specify my file.
Upvotes: 1
Views: 393
Reputation: 93161
Because the JSON is not directly convertible to DataFrame
. read_json
works only with a few formats defined by the orient
parameter. Your JSON doesn't follow any of the allowed formats so you need to manipulate the JSON before converting it to a data frame.
Let's take a high level look at your JSON:
{
"displayFieldName": ...,
"fieldAliases": {...},
"fields": {...},
"features": [...]
}
I'm gonna fathom a guess and assume the features
node is what you want. Let's div deeper into features
:
"features": [
{
"attributes": {
"OBJECTID": 1,
"SHELTER_ID": 223259,
...
}
},
{
"attributes": {
"OBJECTID": 2,
"SHELTER_ID": 223331,
...
}
},
...
]
features
contains a list of objects, each having an attributes
node. The data contained in the attributes
node is what you actually want.
Here's the code
import pandas as pd
import json
from urllib.request import urlopen
json_file = "https://gis.fema.gov/arcgis/rest/services/NSS/OpenShelters/MapServer/0/query?where=1%3D1&outFields=*&returnGeometry=false&outSR=4326&f=json"
data = urlopen(json_file).read()
raw_json = json.loads(data)
formatted_json = [feature['attributes'] for feature in raw_json['features']]
formatted_json
is now a list of dictionaries containing the data we are after. It is no longer JSON. To create the data frame:
df = pd.DataFrame(formatted_json)
Upvotes: 1