Issues with Python parsing dictionary output of Amazon RDS instances

Question

I am trying to parse what I guess is a dict output from AWS's boto interface in Python, which should give me information about all of my Amazon RDS (database) instances. I am new to all of the tools I am using here, so please forgive my ignorance.

I am trying to treat it like an array, which may be a bad premise to start with, since it's a dict type. Strangely, the code below processes exactly two (2) instances, and it does so repeatably. Printing the dict shows all four (4) instances I have running in the Region, but the loop never reaches the last two. When I call len() on the output, it returns 2.

Can someone please help me understand what I need to do to this code to make it actually parse out all of the instances returned? (And then the functionality is to look for a rebootAllowed tag, which if 1, stops the instance... this appears to work fine, but again, only for the first two results returned.)

import json
import boto3

region = 'us-east-1'
rds = boto3.client('rds')

def lambda_handler(event, context):
    # Ingest all RDS instances
    dbinstances = rds.describe_db_instances() # Filtering by instance status is not supported
    
    print('* * *')
    
    # Set db instance counter for below loop   
    dbi = 0

    # Loop through each running RDS instance, checking tags and shutting down if tags match
    for dbinstance in dbinstances:
        # Set default values for tags we'll parse
        rebootAllowed = 0
        # We'll need this later
        try:
            dbinstanceId = dbinstances.get('DBInstances')[dbi]['DBInstanceIdentifier']
            dbinstanceArn = dbinstances.get('DBInstances')[dbi]['DBInstanceArn']
            rdstags = rds.list_tags_for_resource(ResourceName=dbinstanceArn)
            # Attempt to look into tags for EC2s that have them. If they don't, we'll get an exception
            try:
                # Does the instance have the rebootAllowed tag? Great, what's its value?
                if 'rebootAllowed' in rdstags['TagList'][dbi]['Key']:
                    rebootAllowed = rdstags['TagList'][dbi]['Value']
                # Let's log what we're doing.
                print('Examining RDS instance ' + dbinstanceId + '...')
                # Attempt to stop instance
                try:
                    if rebootAllowed == '1':
                        message = '-- This instance CAN be stopped. Doing that now.'
                        rdsid = [dbinstanceId]
                        rds.stop_db_instance(DBInstanceIdentifier=dbinstanceId)
                    elif rebootAllowed == '0':
                        message = '-- This instance is BLOCKED from being stopped. Doing nothing.'
                    else:
                        message = '-- This instance does not have the tags used by this script. Skipping.'
                except Exception as err:
                    message = 'Error with RDS instance: instanceId: ' + str(err)
                    raise err
                print (message)
                print ('* * *')
                dbi += 1
            except:
                print('Examining RDS instance ' + dbinstanceId + '...')
                print('-- An EXCEPTION occurred while analyzing this instance. This could be because the instance has no tags at all.')
                print('* * *')
                dbi += 1
        except:
            print('End of list. Script complete.')

Ben · Accepted Answer

It's a little tough to tell exactly what's going on, especially since we don't know the exact shape of the dict you're iterating over, but you do seem to have a sense of the underlying problem: you're not handling this iteration very Pythonically.

I went and looked up the output, and found this sample:

{
    "DBInstances": [
        {
            "DBInstanceIdentifier": "mydbinstancecf",
            "DBInstanceClass": "db.t3.small",
            "Engine": "mysql",
            "DBInstanceStatus": "available",
            "MasterUsername": "masterawsuser",
            "Endpoint": {
                "Address": "mydbinstancecf.abcexample.us-east-1.rds.amazonaws.com",
                "Port": 3306,
                "HostedZoneId": "Z2R2ITUGPM61AM"
            },
            ...some output truncated...
        }
    ]
}

So right off the bat, you can simplify your loop by cutting out the extraneous data in dbinstances:

# dbinstances = rds.describe_db_instances() becomes
dbinstances = rds.describe_db_instances()["DBInstances"]
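
The extra top-level keys are likely also why len() returned 2 in the question. Here is a minimal sketch, assuming the response carries the usual top-level keys DBInstances and ResponseMetadata. The data is a made-up stand-in, not real output:

```python
# Hypothetical, trimmed-down stand-in for the describe_db_instances() response.
response = {
    "DBInstances": [
        {"DBInstanceIdentifier": "db-one"},
        {"DBInstanceIdentifier": "db-two"},
        {"DBInstanceIdentifier": "db-three"},
        {"DBInstanceIdentifier": "db-four"},
    ],
    "ResponseMetadata": {"HTTPStatusCode": 200},
}

# Iterating a dict walks its keys, not the instances nested inside it.
top_level_keys = [key for key in response]
print(top_level_keys)   # ['DBInstances', 'ResponseMetadata']
print(len(response))    # 2

# Pulling out the list first gives one item per instance.
instance_ids = [db["DBInstanceIdentifier"] for db in response["DBInstances"]]
print(instance_ids)
```

With four instances in the list, a loop over the dict itself still runs only twice, once per top-level key.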

Now you're dealing with a list of db instances. In a Python for loop, each element of the iterable (here, the list) is assigned to the loop variable in turn, so there's no need to maintain that dbi counter at all. If you do want to count the elements, you can write for i, e in enumerate(my_list): where i is the index and e is the element.
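
As a small illustration of enumerate, here is a sketch using a made-up list in place of the real "DBInstances" list:

```python
# Hypothetical list standing in for the "DBInstances" list.
dbinstances = [
    {"DBInstanceIdentifier": "db-one"},
    {"DBInstanceIdentifier": "db-two"},
]

lines = []
for i, instance in enumerate(dbinstances):
    # i is the position in the list, instance is the element itself.
    lines.append(f"{i}: {instance['DBInstanceIdentifier']}")

print("\n".join(lines))
```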

So your loop becomes more like this:

for instance in dbinstances:
    # Set default values for tags we'll parse
    rebootAllowed = 0
    # We'll need this later
    try:
        dbinstanceId = instance['DBInstanceIdentifier']
        dbinstanceArn = instance['DBInstanceArn']
        rdstags = rds.list_tags_for_resource(ResourceName=dbinstanceArn)
        # ... the rest is left as an exercise for Josh
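
For reference, one possible way to finish the loop is sketched below. This is not the original author's code. The function name stop_tagged_instances is hypothetical, and it takes the client as a parameter so it can be tried against a stand-in object. It also looks the tag up by key instead of indexing TagList with the instance counter, as the question's code does:

```python
def stop_tagged_instances(rds):
    """Stop every RDS instance whose rebootAllowed tag is '1'.

    `rds` is assumed to behave like boto3.client('rds').
    Returns the identifiers of the instances it asked to stop.
    """
    stopped = []
    for instance in rds.describe_db_instances()["DBInstances"]:
        instance_id = instance["DBInstanceIdentifier"]
        tag_list = rds.list_tags_for_resource(
            ResourceName=instance["DBInstanceArn"]
        )["TagList"]
        # Turn [{'Key': k, 'Value': v}, ...] into {k: v} for lookup by name.
        tags = {tag["Key"]: tag.get("Value", "") for tag in tag_list}
        reboot_allowed = tags.get("rebootAllowed")

        print(f"Examining RDS instance {instance_id}...")
        if reboot_allowed == "1":
            print("-- This instance CAN be stopped. Doing that now.")
            rds.stop_db_instance(DBInstanceIdentifier=instance_id)
            stopped.append(instance_id)
        elif reboot_allowed == "0":
            print("-- This instance is BLOCKED from being stopped. Doing nothing.")
        else:
            print("-- This instance does not have the tags used by this script. Skipping.")
        print("* * *")
    return stopped
```

Inside the Lambda handler this could be called as stop_tagged_instances(boto3.client('rds')). Because an instance with no tags simply yields an empty dict, the nested try/except blocks from the question are no longer needed for that case.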

As a general suggestion, when designing this kind of algorithm in Python, make heavy use of the REPL. You can poke around and try things out to quickly get a sense of the actual structure of the data.
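
A sketch of what that poking around might look like, using a made-up response fragment. It assumes, as is the case for fields like InstanceCreateTime, that the real response contains datetime values, which json.dumps cannot serialize without a fallback:

```python
import json
from datetime import datetime, timezone

# Hypothetical response fragment with one datetime field.
response = {
    "DBInstances": [
        {
            "DBInstanceIdentifier": "db-one",
            "InstanceCreateTime": datetime(2020, 1, 1, tzinfo=timezone.utc),
        }
    ]
}

print(type(response))                 # the outer object is a dict
print(list(response.keys()))          # its top-level keys
print(type(response["DBInstances"]))  # the value under "DBInstances" is a list

# default=str falls back to str() for values JSON cannot represent (e.g. datetime).
pretty = json.dumps(response, indent=2, default=str)
print(pretty)
```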
