Flattening nested JSON API dictionaries in Python

Question

I am receiving a JSON response for a distance matrix, which I request with the following code:

import requests
import json

payload = {
    "origins": [{"latitude": 54.6565153, "longitude": -1.6802816}, {"latitude": 54.6365153, "longitude": -1.6202816}], #surgery
    "destinations": [{"latitude": 54.6856522, "longitude": -1.2183634}, {"latitude": 54.5393295, "longitude": -1.2623914}, {"latitude": 54.5393295, "longitude": -1.2623914}], #oa - up to 625 entries
    "travelMode": "driving",
    "startTime": "2014-04-01T11:59:59+01:00",
    "timeUnit": "second"
}
headers = {"Content-Type": "application/json"}  # requests sets Content-Length itself
paramtr = {"key": "INSERT_KEY_HERE"}
r = requests.post('https://dev.virtualearth.net/REST/v1/Routes/DistanceMatrix', data=json.dumps(payload), params=paramtr, headers=headers)
data = r.json()["resourceSets"][0]["resources"][0]

and am attempting to flatten:

destinations.latitude, destinations.longitude, origins.latitude, origins.longitude, departureTime, destinationIndex, originIndex, totalWalkDuration, travelDistance, travelDuration

from:

{'__type': 'DistanceMatrix:http://schemas.microsoft.com/search/local/ws/rest/v1',
 'destinations': [{'latitude': 54.6856522, 'longitude': -1.2183634},
  {'latitude': 54.5393295, 'longitude': -1.2623914},
  {'latitude': 54.5393295, 'longitude': -1.2623914}],
 'errorMessage': 'Request completed.',
 'origins': [{'latitude': 54.6565153, 'longitude': -1.6802816},
  {'latitude': 54.6365153, 'longitude': -1.6202816}],
 'results': [{'departureTime': '/Date(1396349159000-0700)/',
   'destinationIndex': 0,
   'originIndex': 0,
   'totalWalkDuration': 0,
   'travelDistance': 38.209,
   'travelDuration': 3082},
  {'departureTime': '/Date(1396349159000-0700)/',
   'destinationIndex': 1,
   'originIndex': 0,
   'totalWalkDuration': 0,
   'travelDistance': 40.247,
   'travelDuration': 2708},
  {'departureTime': '/Date(1396349159000-0700)/',
   'destinationIndex': 2,
   'originIndex': 0,
   'totalWalkDuration': 0,
   'travelDistance': 40.247,
   'travelDuration': 2708},
  {'departureTime': '/Date(1396349159000-0700)/',
   'destinationIndex': 0,
   'originIndex': 1,
   'totalWalkDuration': 0,
   'travelDistance': 34.857,
   'travelDuration': 2745},
  {'departureTime': '/Date(1396349159000-0700)/',
   'destinationIndex': 1,
   'originIndex': 1,
   'totalWalkDuration': 0,
   'travelDistance': 36.895,
   'travelDuration': 2377},
  {'departureTime': '/Date(1396349159000-0700)/',
   'destinationIndex': 2,
   'originIndex': 1,
   'totalWalkDuration': 0,
   'travelDistance': 36.895,
   'travelDuration': 2377}]}

The best I have currently achieved is:

json_normalize(data, record_path="results", meta="origins")

However, the result still contains the nested origins, and I cannot get the destinations to append. I also tried dropping __type to see if it made a difference, and explored max_level= and record_prefix='_', but to no avail.

Trenton McKinney · Accepted Answer

I don't think this is an appropriate question for flatten_json; however, that package can be useful for JSON objects that are less thoughtfully constructed.
- See "How to flatten nested JSON recursively, with flatten_json?" for those cases.
The list in destinations corresponds, position by position, to the list in results, so once both are normalized they share the same index.
Because the indices correspond, the dataframes can be concatenated correctly.

import pandas as pd

# create a dataframe for results and origins
res_or = pd.json_normalize(data, record_path=['results'], meta=[['origins']])

# create a dataframe for destinations
dest = pd.json_normalize(data, record_path=['destinations'], record_prefix='dest_')

# normalize the origins column in res_or
orig = pd.json_normalize(res_or.origins).rename(columns={'latitude': 'origin_lat', 'longitude': 'origin_long'})

# concat the dataframes
df = pd.concat([res_or, orig, dest], axis=1).drop(columns=['origins'])

# display(df)
                departureTime  destinationIndex  originIndex  totalWalkDuration  travelDistance  travelDuration  origin_lat  origin_long  dest_latitude  dest_longitude
0  /Date(1396349159000-0700)/                 0            0                  0          38.209            3082   54.656515    -1.680282      54.685652       -1.218363
1  /Date(1396349159000-0700)/                 1            0                  0          40.247            2708   54.656515    -1.680282      54.539330       -1.262391
2  /Date(1396349159000-0700)/                 2            0                  0          40.247            2708   54.656515    -1.680282      54.539330       -1.262391
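
The concat approach above depends on results and destinations lining up row for row. As a rough sketch of what happens when they do not, the snippet below uses a trimmed, made-up stand-in for a response with two origins and three destinations (the data dict and the names side_by_side and missing are illustrative assumptions, not part of the original answer). pd.concat with axis=1 aligns on index labels, so the extra result rows get NaN in the dest_ columns.

```python
import pandas as pd

# Hypothetical, trimmed stand-in: 2 origins x 3 destinations = 6 results.
data = {
    "destinations": [
        {"latitude": 54.6856522, "longitude": -1.2183634},
        {"latitude": 54.5393295, "longitude": -1.2623914},
        {"latitude": 54.5393295, "longitude": -1.2623914},
    ],
    "results": [
        {"destinationIndex": d, "originIndex": o}
        for o in range(2)
        for d in range(3)
    ],
}

results = pd.json_normalize(data, record_path=["results"])
dest = pd.json_normalize(data, record_path=["destinations"], record_prefix="dest_")

# concat(axis=1) aligns rows on index labels, not on destinationIndex
side_by_side = pd.concat([results, dest], axis=1)
print(side_by_side)

# dest only has labels 0-2, so rows 3-5 have no destination coordinates
missing = int(side_by_side["dest_latitude"].isna().sum())
print(missing)
```

In that situation a join keyed on destinationIndex and originIndex is the safer choice.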

Update for new example data

Each record in results contains the index of its destination and its origin, so it's easy to create a separate dataframe for each key and then .merge the dataframes.
- The index of orig and dest corresponds to originIndex and destinationIndex in results.

# create three separate dataframes
results = pd.json_normalize(data, record_path=['results'])
dest = pd.json_normalize(data, record_path=['destinations'], record_prefix='dest_')
orig = pd.json_normalize(data, record_path=['origins'], record_prefix='orig_')

# merge them at the appropriate location
df = pd.merge(results, dest, left_on='destinationIndex', right_index=True)
df = pd.merge(df, orig, left_on='originIndex', right_index=True)

# display(df)
                departureTime  destinationIndex  originIndex  totalWalkDuration  travelDistance  travelDuration  dest_latitude  dest_longitude  orig_latitude  orig_longitude
0  /Date(1396349159000-0700)/                 0            0                  0          38.209            3082      54.685652       -1.218363      54.656515       -1.680282
1  /Date(1396349159000-0700)/                 1            0                  0          40.247            2708      54.539330       -1.262391      54.656515       -1.680282
2  /Date(1396349159000-0700)/                 2            0                  0          40.247            2708      54.539330       -1.262391      54.656515       -1.680282
3  /Date(1396349159000-0700)/                 0            1                  0          34.857            2745      54.685652       -1.218363      54.636515       -1.620282
4  /Date(1396349159000-0700)/                 1            1                  0          36.895            2377      54.539330       -1.262391      54.636515       -1.620282
5  /Date(1396349159000-0700)/                 2            1                  0          36.895            2377      54.539330       -1.262391      54.636515       -1.620282
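
To make the index lookup behind the two merges explicit, here is a minimal plain-Python sketch of the same idea. It assumes a trimmed copy of the response (only two of the six results are kept), and flatten_matrix is a hypothetical helper name introduced for illustration, not something from pandas or the API.

```python
# Trimmed copy of the response from the question (two of the six results kept).
data = {
    "origins": [
        {"latitude": 54.6565153, "longitude": -1.6802816},
        {"latitude": 54.6365153, "longitude": -1.6202816},
    ],
    "destinations": [
        {"latitude": 54.6856522, "longitude": -1.2183634},
        {"latitude": 54.5393295, "longitude": -1.2623914},
        {"latitude": 54.5393295, "longitude": -1.2623914},
    ],
    "results": [
        {"destinationIndex": 0, "originIndex": 0, "travelDistance": 38.209, "travelDuration": 3082},
        {"destinationIndex": 2, "originIndex": 1, "travelDistance": 36.895, "travelDuration": 2377},
    ],
}


def flatten_matrix(resource):
    """Return one flat dict per result, with coordinates looked up by index."""
    rows = []
    for result in resource["results"]:
        # destinationIndex / originIndex are positions in the two coordinate lists
        dest = resource["destinations"][result["destinationIndex"]]
        orig = resource["origins"][result["originIndex"]]
        row = dict(result)
        row.update({f"dest_{key}": value for key, value in dest.items()})
        row.update({f"orig_{key}": value for key, value in orig.items()})
        rows.append(row)
    return rows


rows = flatten_matrix(data)
for row in rows:
    print(row)
```

Passing rows to pd.DataFrame should give a frame with the same dest_ and orig_ columns as the merged result, which can be a handy cross-check on the merge keys.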
