How to concatenate Pandas dataframes built from multiple requests/JSON responses?

Question

I have a script that I use to send a request to an endpoint and receive a JSON response that looks as follows:

{'SalesInfoStatus': {'next_offset': 15,
  'sales': [{'saSalesID': '761S-1666031549-adhoc-0:0',
    'identifier': '761',
    'origin': 'New York',
    'destination': 'London',
    'filed_time': {'epoch': 1666033349,
     'time': '15:02',
     'date': '10/19/2022',
     'localtime': 1666018949},
     'actual_time': {'epoch': 1666033349,
     'time': '15:02',
     'date': '10/19/2022',
     'localtime': 1666018949},
    'inbound_saSalesID': '761S-1666029229-adhoc-0:0'},
   {'saSalesID': '762S-1666031549-adhoc-0:0',
    'identifier': '762',
    'origin': 'New York',
    'destination': 'London',
    'filed_time': {'epoch': 1666033349,
     'time': '15:02',
     'date': '10/17/2022',
     'localtime': 1666018949},
     'actual_time': {'epoch': 1666033349,
     'time': '15:02',
     'date': '10/17/2022',
     'localtime': 1666018949},
    'inbound_saSalesID': '762S-1666029229-adhoc-0:0'},

   {'saSalesID': '765S-1666031549-adhoc-0:0',
    'identifier': '765',
    'origin': 'Paris',
    'destination': 'Tokyo',
    'filed_time': {'epoch': 1666033349,
     'time': '15:02',
     'date': '10/15/2022',
     'localtime': 1666018949},
     'actual_time': {'epoch': 1666033349,
     'time': '15:02',
     'date': '10/15/2022',
     'localtime': 1666018949},
    'inbound_saSalesID': '765S-1666029229-adhoc-0:0'},

   {'saSalesID': '767S-1666031549-adhoc-0:0',
    'identifier': '767',
    'origin': 'Los Angeles',
    'destination': 'Sydney',
    'filed_time': {'epoch': 1666033349,
     'time': '15:02',
     'date': '10/13/2022',
     'localtime': 1666018949},
     'actual_time': {'epoch': 1666033349,
     'time': '15:02',
     'date': '10/13/2022',
     'localtime': 1666018949},
    'inbound_saSalesID': '767S-1666029229-adhoc-0:0'}

]}}

I then build a Pandas dataframe from the response.

Here is the code:

identifier = ['S1234']

url = https://example.com/json/ident={identifier}

response = requests.get(url, auth=(user_name, api_key)).json()

try:
    df = pd.json_normalize(response['SalesInfoStatus']['sales'])[['actual_departure_time.date', 'origin', 'destination', 'ident']]

except KeyError:
     print('Identifier not found.')

I'd like to to iterate through a list of about a handful of identifiers, build a Pandas dataframe for each, then build one unified dataframe consisting of each of the individual dataframes (i.e. concatenate each dataframe to build only one Pandas dataframe).

Let's assume the list of identifiers is:

identifiers = ['S1234', 'S2345', 'S3456', 'S4567', 'S5678']

The only wrinkle is that ['SalesInfoStatus']['sales'] could be empty.

How would I do this?

Thanks!

Gerd · Accepted Answer

Try this:

Convert the entire structure to a data frame:

df = pd.json_normalize(response['SalesInfoStatus']['sales'])

Check if the structure is empty:

if df.empty:
    print('dataframe is empty')

If the structure is not empty, extract the relevant columns to create the unified data frame and create a list of individual data frames based on the identifier list:

else:
    df = df[['actual_time.date', 'origin', 'destination', 'identifier']]
    print(df)
    df_list = []
    identifiers = ['761', '765']
    for identifier in identifiers:
        df_list.append (df[df['identifier'] == identifier])
    for d in df_list:
        print(d)

df will be your unified data frame, df_list will contain the individual data frames for each identifier.

Note that the data field names slightly differ between your json example and your code (actual_time/actual_departure_time, identifier/ident).

How to concatenate Pandas dataframes built from multiple requests/JSON responses?

Answers (1)

Related Questions