indiangolfer

Reputation: 311

Python Lambda Function Parsing DynamoDB's JSON Format

A Python Lambda function that is invoked for a DynamoDB stream receives JSON in DynamoDB format (the data types are embedded in the JSON). I would like to convert DynamoDB JSON to standard JSON. PHP and Node.js have a Marshaler that can do this. Please let me know if there are similar or other options for Python.

DynamoDB_format = `{"feas":
    {"M": {
        "fea": {
            "L": [
                {
                    "M": {
                        "pre": {
                            "N": "1"
                        },
                        "Li": {
                            "N": "1"
                        },
                        "Fa": {
                            "N": "0"
                        },
                        "Mo": {
                            "N": "1"
                        },
                        "Ti": {
                            "S": "20160618184156529"
                        },
                        "Fr": {
                            "N": "4088682"
                        }
                    }
                }
                ]
            }   
        }
    }
}`

Upvotes: 17

Views: 23506

Answers (6)

Zoro_77

Reputation: 415

This worked for me. I made a minor modification to @vekerdyb's answer.

def _unmarshalValue(ddbValue):
    # ddbValue is a one-entry dict such as {"S": "text"} or {"N": "1"}
    for key, value in ddbValue.items():
        if key.lower() == "s":
            return value
        elif key.lower() == "n":
            # int() alone raises ValueError on decimals such as "1.5"
            return float(value) if "." in value else int(value)
        elif key.lower() == "bool":
            return value
        elif key.lower() == "m":
            data = {}
            for mKey, mValue in value.items():
                data[mKey] = _unmarshalValue(mValue)
            return data
        elif key.lower() == "l":
            data = []
            for item in value:
                data.append(_unmarshalValue(item))
            return data


def unmarshalDynamoDBJson(ddbItem):
    # Top-level items map attribute names to typed values
    result = {}
    for key, value in ddbItem.items():
        result[key] = _unmarshalValue(value)

    return result

Upvotes: 0

Junior Osho

Reputation: 173

import json
import boto3
import base64


def lambda_handler(event, context):
    print(event)
    # Build the list per invocation so records do not leak between warm starts
    output = []
    for record in event['records']:
        payload = base64.b64decode(record['data']).decode('utf-8')
        print('payload:', payload)

        row_w_newline = payload + "\n"
        print('row_w_newline type:', type(row_w_newline))
        # Decode back to str so the returned dict is JSON serializable
        row_w_newline = base64.b64encode(row_w_newline.encode('utf-8')).decode('utf-8')

        output_record = {
            'recordId': record['recordId'],
            'result': 'Ok',
            'data': row_w_newline
        }
        output.append(output_record)

    print('Processed {} records.'.format(len(event['records'])))

    return {'records': output}

Upvotes: -3

alukach

Reputation: 6288

As taken from this blog, the following seems to be the simplest solution:

from boto3.dynamodb.types import TypeDeserializer, TypeSerializer


def unmarshall(dynamo_obj: dict) -> dict:
    """Convert a DynamoDB dict into a standard dict."""
    deserializer = TypeDeserializer()
    return {k: deserializer.deserialize(v) for k, v in dynamo_obj.items()}


def marshall(python_obj: dict) -> dict:
    """Convert a standard dict into a DynamoDB dict."""
    serializer = TypeSerializer()
    return {k: serializer.serialize(v) for k, v in python_obj.items()}

Upvotes: 4

Kenton Blacutt

Reputation: 190

To easily convert to and from the DynamoDB JSON I recommend using the boto3 dynamodb types serializer and deserializer.

import boto3
from boto3.dynamodb.types import TypeSerializer, TypeDeserializer

ts = TypeSerializer()
td = TypeDeserializer()

data = {"id": "5000"}
serialized_data = ts.serialize(data)
print(serialized_data)
# {'M': {'id': {'S': '5000'}}}
deserialized_data = td.deserialize(serialized_data)
print(deserialized_data)
# {'id': '5000'}

For more details check out the boto3.dynamodb.types classes.
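
One thing to keep in mind with this approach: TypeDeserializer returns numbers as decimal.Decimal and set types as Python sets, and json.dumps rejects both. If the goal is a standard JSON string, a fallback along these lines might help. This is only a sketch; json_default is a hypothetical helper name, and converting a Decimal to float can lose precision.

```python
import json
from decimal import Decimal


def json_default(obj):
    """Fallback for json.dumps (hypothetical helper, not part of boto3)."""
    if isinstance(obj, Decimal):
        # Whole numbers become int; everything else becomes float,
        # which may lose precision for very long decimals.
        return int(obj) if obj == obj.to_integral_value() else float(obj)
    if isinstance(obj, (set, frozenset)):
        # JSON has no set type, so emit a list (order is not guaranteed).
        return list(obj)
    raise TypeError("Cannot serialize " + type(obj).__name__)


# A dict shaped like the output of TypeDeserializer.deserialize
item = {"pre": Decimal("1"), "score": Decimal("2.5"), "tags": {"a"}}
print(json.dumps(item, default=json_default, sort_keys=True))
```

The same function can be passed as default= when dumping the deserialized stream image.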

Upvotes: 10

vekerdyb

Reputation: 1263

Update: There is a library now: https://pypi.org/project/dynamodb-json/


Here is an improved version of indiangolfer's answer. While @indiangolfer's solution works for the question, this improved version might be more useful for others who stumble upon this thread.

def unmarshal_dynamodb_json(node):
    data = dict({})
    data['M'] = node
    return _unmarshal_value(data)


def _unmarshal_value(node):
    if type(node) is not dict:
        return node

    for key, value in node.items():
        # S – String - return string
        # N – Number - return int or float (if includes '.')
        # B – Binary - not handled
        # BOOL – Boolean - return Bool
        # NULL – Null - return None
        # M – Map - return a dict
        # L – List - return a list
        # SS – String Set - not handled
        # NS – Number Set - not handled
        # BS – Binary Set - not handled
        key = key.lower()
        if key == 'bool':
            return value
        if key == 'null':
            return None
        if key == 's':
            return value
        if key == 'n':
            if '.' in str(value):
                return float(value)
            return int(value)
        if key in ['m', 'l']:
            if key == 'm':
                data = {}
                for key1, value1 in value.items():
                    if key1.lower() == 'l':
                        data = [_unmarshal_value(n) for n in value1]
                    else:
                        if type(value1) is not dict:
                            return _unmarshal_value(value)
                        data[key1] = _unmarshal_value(value1)
                return data
            data = []
            for item in value:
                data.append(_unmarshal_value(item))
            return data

It is improved in the following ways:

  • handles more data types, including lists, which were not handled correctly previously

  • handles lowercase and uppercase keys

Edit: fix recursive object bug
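
The type list in the comments above marks B, SS, NS and BS as not handled. A minimal sketch of how they could be covered is shown here. The names unmarshal_extra and _number are hypothetical, the number rule is copied from the code above (float if the string contains '.'), and binary values are assumed to arrive base64-encoded, as they do in the JSON form of a stream event. Sets are returned as lists so the result stays JSON serializable.

```python
import base64


def _number(text):
    # Same rule as the answer above: float if there is a '.', else int.
    return float(text) if "." in text else int(text)


def unmarshal_extra(type_key, value):
    """Convert the B, SS, NS and BS descriptors (hypothetical helper)."""
    if type_key == "B":
        return base64.b64decode(value)
    if type_key == "SS":
        return list(value)
    if type_key == "NS":
        return [_number(n) for n in value]
    if type_key == "BS":
        return [base64.b64decode(b) for b in value]
    raise ValueError("unsupported type descriptor: " + type_key)


print(unmarshal_extra("NS", ["1", "2.5"]))
print(unmarshal_extra("B", "aGk="))
```

A call to this helper could be added as a final branch in _unmarshal_value, using the original (not lowercased) key.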

Upvotes: 20

indiangolfer

Reputation: 311

I couldn't find anything out in the wild. So, I decided to port the PHP implementation of dynamodb json to standard json that was published here. I tested this in a python lambda function processing DynamoDB stream. If there is a better way to do this, please let me know.

(PS: This is not a complete port of PHP Marshaler)

The JSON in the question gets transformed to:

{  
   "feas":{  
      "fea":[  
         {  
            "pre":"1",
            "Mo":"1",
            "Ti":"20160618184156529",
            "Fa":"0",
            "Li":"1",
            "Fr":"4088682"
         }
      ]
   }
}

def unmarshalJson(node):
    data = {}
    data["M"] = node
    return unmarshalValue(data, True)


def unmarshalValue(node, mapAsObject):
    for key, value in node.items():
        if(key == "S" or key == "N"):
            return value
        if(key == "M" or key == "L"):
            if(key == "M"):
                if(mapAsObject):
                    data = {}
                    for key1, value1 in value.items():
                        data[key1] = unmarshalValue(value1, mapAsObject)
                    return data
            data = []
            for item in value:
                data.append(unmarshalValue(item, mapAsObject))
            return data
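
For context, here is a rough sketch of how a converter like this might be wired into a Lambda function that processes a DynamoDB stream. The helper name new_images and its unmarshal parameter are hypothetical; the sketch assumes the usual stream event layout, where each entry in event["Records"] has an eventName and a dynamodb section that may contain a NewImage.

```python
def new_images(event, unmarshal):
    """Yield (event_name, plain_dict) for each record that has a NewImage.

    `unmarshal` is any function that converts DynamoDB JSON to a plain
    dict, for example unmarshalJson from this answer.
    """
    for record in event.get("Records", []):
        image = record.get("dynamodb", {}).get("NewImage")
        if image is None:
            # REMOVE events and KEYS_ONLY streams carry no NewImage
            continue
        yield record.get("eventName"), unmarshal(image)


def make_handler(unmarshal):
    """Build a lambda_handler that prints every converted image."""
    def lambda_handler(event, context):
        count = 0
        for event_name, item in new_images(event, unmarshal):
            print(event_name, item)
            count += 1
        return {"processed": count}
    return lambda_handler
```

With the function above, the handler could be created as lambda_handler = make_handler(unmarshalJson).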

Upvotes: 11
