SteelReyn
SteelReyn

Reputation: 53

How can I load a Plaid banking API response to a pandas dataframe in python?

I'm using Plaid's API to return balances on banking accounts. Their documentation indicates that all responses come in standard JSON. I have experience loading JSON responses from the request module, but I'm not able to directly load Plaid's response to a pandas dataframe. Here's what happens when I try:

request = AccountsBalanceGetRequest(access_token=token)
response = client.accounts_balance_get(request)
df = pd.json_normalize(response, record_path=['accounts'])

ERROR:
File "C:\Users\<me>\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\json\_normalize.py", line 423, in _json_normalize
    raise NotImplementedError

For reference, print(response['accounts']) correctly accesses the relevant part of the response. Here's the section of _normalize from the error, though I don't understand how to apply this to solve the problem:

    if isinstance(data, list) and not data:
        return DataFrame()
    elif isinstance(data, dict):
        # A bit of a hackjob
        data = [data]
    elif isinstance(data, abc.Iterable) and not isinstance(data, str):
        # GH35923 Fix pd.json_normalize to not skip the first element of a
        # generator input
        data = list(data)
    else:
        raise NotImplementedError

If I print the response, it looks like this:

{'accounts': [{'account_id': 'account_1',
               'balances': {'available': 300.0,
                            'current': 300.0,
                            'iso_currency_code': 'USD',
                            'limit': None,
                            'unofficial_currency_code': None},
               'mask': 'xxx1',
               'name': 'SAVINGS',
               'official_name': 'Bank Savings',
               'subtype': 'savings',
               'type': 'depository'},
              {'account_id': 'account_2',
               'balances': {'available': 500.00,
                            'current': 600.0,
                            'iso_currency_code': 'USD',
                            'limit': None,
                            'unofficial_currency_code': None},
               'mask': 'xxx2',
               'name': 'CHECKING',
               'official_name': 'Bank Checking',
               'subtype': 'checking',
               'type': 'depository'},
              {'account_id': 'account_3',
               'balances': {'available': 2000.00,
                            'current': 2000.00,
                            'iso_currency_code': 'USD',
                            'limit': None,
                            'unofficial_currency_code': None},
               'mask': 'xxx3',
               'name': 'BUSINESS CHECKING',
               'official_name': 'Bank Business Checking',
               'subtype': 'checking',
               'type': 'depository'}],
 'item': {'available_products': ['balance'],
          'billed_products': ['auth', 'transactions'],
          'consent_expiration_time': None,
          'error': None,
          'institution_id': 'ins_123xyz',
          'item_id': 'item_123xyz',
          'update_type': 'background',
          'webhook': ''},
 'request_id': 'request_123xyz'}

I assume if Plaid's response is standard JSON, the single quotes are only there because Python's print converted them from double quotes. If I take this string as a base and replace single with double quotes, and replace None with "None", I can load to a dataframe:

data = json.loads(responseString.replace("'", '"').replace('None', '"None"'))
df = pd.json_normalize(data, record_path=['accounts'])
print(df)

Applying this directly to Plaid's response also works:

data = str(response)
data = data.replace("'", '"').replace('None', '"None"')
data = json.loads(data)
df = pd.json_normalize(data, record_path=['accounts'])

What I have seems to be a temporary workaround, but not a robust or intended solution. Is there a more preferred way to get there?

UPDATE 1: Expected output from the first block of code in this post would produce the dataframe below, rather than an error:

 account_id  mask               name           official_name   subtype  ... balances.available  balances.current  balances.iso_currency_code balances.limit balances.unofficial_currency_code
0  account_1  xxx1            SAVINGS            Bank Savings   savings  ...              300.0             300.0                         USD           None                              None
1  account_2  xxx2           CHECKING           Bank Checking  checking  ...              500.0             600.0                         USD           None                              None
2  account_3  xxx3  BUSINESS CHECKING  Bank Business Checking  checking  ...             2000.0            2000.0                         USD           None                              None

I can get the same output with the workaround, but don't understand why it's necessary, and it doesn't seem like a great way to do get the result with relying on replacing single quotes with double quotes.

UPDATE 2: I installed the plaid components on 10/15/2021 using the non-docker instructions and npm.

print(plaid.__version__)
8.2.0
$ py --version
Python 3.9.6

UPDATE 3: Adding full solution based on Stephen's suggested answer. The response needs to be explicitly converted to a dict first, then processed from there. What worked:

json_string = json.loads(json.dumps(response.to_dict()))
df = pd.json_normalize(json_string, record_path=['accounts'])

This allowed me to cut out all the workarounds needed after converting to a string and basically load straight to a dataframe.

Upvotes: 3

Views: 955

Answers (1)

Stephen Jayakar
Stephen Jayakar

Reputation: 46

So I think the solution is something like this

json_string = json.dumps(response.to_dict())
# which you can then input into a df

Basically, we moved from returning dictionaries from the API to returning Python models. So we need to travel from model -> dictionary -> json. to_dict is a method on every model that outputs a dictionary, and then json.dumps takes in a dictionary and converts it to valid JSON.

LMK if this works for you :)

Upvotes: 2

Related Questions