Reputation: 644
I have a serverless Aurora DB on AWS RDS (with data api enabled) that I would like to query using the database resource and secret ARNs. The code to do this is shown below.
rds_data = boto3.client('rds-data', region_name='us-west-1')
response = rds_data.execute_statement(
resourceArn='DATABASE_ARN',
secretArn='DATABASE_SECRET',
database='db',
sql=query)
rds_data.close()
The response contains the returned records, but the column names are not shown and the format is interesting (see below).
'records': [[{'stringValue': 'name'}, {'stringValue': '5'}, {'stringValue': '3'}, {'stringValue': '1'}, {'stringValue': '2'}, {'stringValue': '4'}]]
I want to query the aurora database and then create a pandas dataframe. Is there a better way to do this? I looked at just using psycopg2 and creating a database connection, but I believe you need a user name and password. I would like to not go that route and instead authenticate using IAM.
Upvotes: 1
Views: 1025
Reputation: 644
The answer was actually pretty simple. There is a parameter called "formatRecordsAs" in the "execute_statement" function. If you set this to json, you can get back better records.
response = rds_data.execute_statement(
resourceArn='DATABASE_ARN',
secretArn='DATABASE_SECRET',
database='db',
formatRecordsAs='JSON',
sql=query)
This still gives you back a string, so then you just need to convert that string representation to a list of dictionaries.
list_of_dicts = ast.literal_eval(response['formattedRecords'])
That can then be changed to a pandas dataframe
df = pd.DataFrame(list_of_dicts)
Upvotes: 1