Reputation: 311
A Python Lambda function that is invoked for a DynamoDB stream receives JSON in DynamoDB format (the JSON contains the data type descriptors). I would like to convert DynamoDB JSON to standard JSON. PHP and Node.js have a Marshaler that can do this. Please let me know if there are similar or other options for Python.
DynamoDB_format = `{"feas":
    {"M": {
        "fea": {
            "L": [
                {
                    "M": {
                        "pre": {
                            "N": "1"
                        },
                        "Li": {
                            "N": "1"
                        },
                        "Fa": {
                            "N": "0"
                        },
                        "Mo": {
                            "N": "1"
                        },
                        "Ti": {
                            "S": "20160618184156529"
                        },
                        "Fr": {
                            "N": "4088682"
                        }
                    }
                }
            ]
        }
    }
    }
}`
Upvotes: 17
Views: 23506
Reputation: 415
This worked for me. I made a minor modification to @vekerdyb's answer:
def _unmarshalValue(ddbValue):
    # A DynamoDB value is a one-entry dict: {type descriptor: payload}.
    for key, value in ddbValue.items():
        if key.lower() == "s":
            return value
        elif key.lower() == "n":
            # Numbers arrive as strings; int() alone fails on values like "1.5".
            return float(value) if "." in value else int(value)
        elif key.lower() == "bool":
            return value
        elif key.lower() == "m":
            data = {}
            for mKey, mValue in value.items():
                data[mKey] = _unmarshalValue(mValue)
            return data
        elif key.lower() == "l":
            data = []
            for item in value:
                data.append(_unmarshalValue(item))
            return data

def unmarshalDynamoDBJson(ddbItem):
    result = {}
    for key, value in ddbItem.items():
        result[key] = _unmarshalValue(value)
    return result
Upvotes: 0
Reputation: 173
import json
import boto3
import base64

def lambda_handler(event, context):
    print(event)
    # Build the list per invocation; a module-level list would keep growing
    # across warm invocations of the same Lambda container.
    output = []
    for record in event['records']:
        payload = base64.b64decode(record['data']).decode('utf-8')
        print('payload:', payload)
        row_w_newline = payload + "\n"
        print('row_w_newline type:', type(row_w_newline))
        # b64encode returns bytes; decode so the response is JSON serializable.
        row_w_newline = base64.b64encode(row_w_newline.encode('utf-8')).decode('utf-8')
        output_record = {
            'recordId': record['recordId'],
            'result': 'Ok',
            'data': row_w_newline
        }
        output.append(output_record)
    print('Processed {} records.'.format(len(event['records'])))
    return {'records': output}
Upvotes: -3
Reputation: 6288
As taken from this blog, the following seems to be the simplest solution:
from boto3.dynamodb.types import TypeDeserializer, TypeSerializer
def unmarshall(dynamo_obj: dict) -> dict:
    """Convert a DynamoDB dict into a standard dict."""
    deserializer = TypeDeserializer()
    return {k: deserializer.deserialize(v) for k, v in dynamo_obj.items()}

def marshall(python_obj: dict) -> dict:
    """Convert a standard dict into a DynamoDB dict."""
    serializer = TypeSerializer()
    return {k: serializer.serialize(v) for k, v in python_obj.items()}
Upvotes: 4
Reputation: 190
To easily convert to and from the DynamoDB JSON I recommend using the boto3 dynamodb types serializer and deserializer.
import boto3
from boto3.dynamodb.types import TypeSerializer, TypeDeserializer
ts = TypeSerializer()
td = TypeDeserializer()
data = {"id": "5000"}
serialized_data = ts.serialize(data)
print(serialized_data)
# {'M': {'id': {'S': '5000'}}}
deserialized_data = td.deserialize(serialized_data)
print(deserialized_data)
# {'id': '5000'}
For more details check out the boto3.dynamodb.types classes.
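One thing to watch for with the boto3 deserializer: it returns `Decimal` for `N` values and Python sets for the set types, and `json.dumps` cannot encode either by default. Below is a minimal sketch of a fallback encoder, assuming you are happy to turn whole numbers into `int`, other numbers into `float`, and sets into lists. The name `json_default` and the sample dict are made up for illustration; the dict is a hand-written stand-in for deserializer output so the snippet runs without boto3.

```python
import json
from decimal import Decimal


def json_default(obj):
    """Fallback used by json.dumps for types it cannot encode itself."""
    if isinstance(obj, Decimal):
        # Whole numbers become int, everything else float (may lose precision).
        return int(obj) if obj == obj.to_integral_value() else float(obj)
    if isinstance(obj, (set, frozenset)):
        # JSON has no set type, so emit a list instead.
        return list(obj)
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


# Hand-written stand-in for what the deserializer would hand back.
deserialized = {
    "feas": {
        "fea": [
            {
                "pre": Decimal("1"),
                "Ti": "20160618184156529",
                "Fr": Decimal("4088682"),
            }
        ]
    }
}

as_json = json.dumps(deserialized, default=json_default)
print(as_json)
```

If exact decimal precision matters, returning `str(obj)` for `Decimal` instead of `float(obj)` is a safer choice.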
Upvotes: 10
Reputation: 1263
Update: There is a library now: https://pypi.org/project/dynamodb-json/
Here is an improved version of indiangolfer's answer. While @indiangolfer's solution works for the question, this improved version might be more useful for others who stumble upon this thread.
def unmarshal_dynamodb_json(node):
    data = dict({})
    data['M'] = node
    return _unmarshal_value(data)

def _unmarshal_value(node):
    if type(node) is not dict:
        return node
    for key, value in node.items():
        # S – String - return string
        # N – Number - return int or float (if includes '.')
        # B – Binary - not handled
        # BOOL – Boolean - return Bool
        # NULL – Null - return None
        # M – Map - return a dict
        # L – List - return a list
        # SS – String Set - not handled
        # NS – Number Set - not handled
        # BS – Binary Set - not handled
        key = key.lower()
        if key == 'bool':
            return value
        if key == 'null':
            return None
        if key == 's':
            return value
        if key == 'n':
            if '.' in str(value):
                return float(value)
            return int(value)
        if key in ['m', 'l']:
            if key == 'm':
                data = {}
                for key1, value1 in value.items():
                    if key1.lower() == 'l':
                        data = [_unmarshal_value(n) for n in value1]
                    else:
                        if type(value1) is not dict:
                            return _unmarshal_value(value)
                        data[key1] = _unmarshal_value(value1)
                return data
            data = []
            for item in value:
                data.append(_unmarshal_value(item))
            return data
It is improved in the following ways:
- handles more data types, including lists, which were not handled correctly previously
- handles lowercase and uppercase keys
Edit: fix recursive object bug
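For anyone who also needs the types marked "not handled" above, here is a rough sketch that covers every DynamoDB type descriptor. It is not part of the original answer: the names `from_dynamodb`, `item_from_dynamodb` and `_to_number` are made up, and it assumes binary values arrive base64-encoded as strings, which is how they appear in the JSON of a stream event. It only interprets the descriptor at the typed-value level, so an attribute that happens to be named "L" or "M" is treated as an ordinary key.

```python
import base64


def _to_number(text):
    """DynamoDB sends numbers as strings; prefer int, fall back to float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def from_dynamodb(value):
    """Convert one typed value such as {"N": "1"} into a plain Python value."""
    if not isinstance(value, dict) or len(value) != 1:
        raise ValueError(f"expected a single-key dict, got {value!r}")
    (tag, inner), = value.items()
    if tag == "NULL":
        return None
    if tag in ("S", "BOOL"):
        return inner
    if tag == "N":
        return _to_number(inner)
    if tag == "B":
        return base64.b64decode(inner)  # assumption: base64 text in the event
    if tag == "M":
        return {k: from_dynamodb(v) for k, v in inner.items()}
    if tag == "L":
        return [from_dynamodb(v) for v in inner]
    if tag == "SS":
        return set(inner)
    if tag == "NS":
        return {_to_number(n) for n in inner}
    if tag == "BS":
        return {base64.b64decode(b) for b in inner}
    raise ValueError(f"unknown type descriptor: {tag}")


def item_from_dynamodb(item):
    """Convert a whole item (attribute name -> typed value)."""
    return {name: from_dynamodb(value) for name, value in item.items()}


sample = {
    "Ti": {"S": "20160618184156529"},
    "Fr": {"N": "4088682"},
    "ratio": {"N": "1.5"},
    "tags": {"SS": ["a", "b"]},
    "L": {"L": [{"N": "0"}, {"NULL": True}]},
}
print(item_from_dynamodb(sample))
```

Note that sets and bytes are not JSON serializable, so they still need converting (for example to lists and base64 text) before calling `json.dumps`.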
Upvotes: 20
Reputation: 311
I couldn't find anything out in the wild, so I decided to port the PHP implementation that converts DynamoDB JSON to standard JSON, which was published here. I tested this in a Python Lambda function processing a DynamoDB stream. If there is a better way to do this, please let me know.
(PS: This is not a complete port of PHP Marshaler)
The JSON in the question gets transformed to:
{
    "feas":{
        "fea":[
            {
                "pre":"1",
                "Mo":"1",
                "Ti":"20160618184156529",
                "Fa":"0",
                "Li":"1",
                "Fr":"4088682"
            }
        ]
    }
}
def unmarshalJson(node):
    data = {}
    data["M"] = node
    return unmarshalValue(data, True)

def unmarshalValue(node, mapAsObject):
    for key, value in node.items():
        if(key == "S" or key == "N"):
            return value
        if(key == "M" or key == "L"):
            if(key == "M"):
                if(mapAsObject):
                    data = {}
                    for key1, value1 in value.items():
                        data[key1] = unmarshalValue(value1, mapAsObject)
                    return data
            data = []
            for item in value:
                data.append(unmarshalValue(item, mapAsObject))
            return data
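Since this was tested inside a stream-triggered Lambda, here is a minimal sketch of how the handler side might look. In a DynamoDB stream event the item sits under `Records[i]["dynamodb"]["NewImage"]`, and REMOVE events have no `NewImage`. The helper `plain` is a hypothetical, cut-down converter that only covers S, N, M and L, and unlike the port above it turns numbers into `int` or `float` rather than leaving them as strings.

```python
def plain(value):
    """Convert one DynamoDB-typed value; covers only S, N, M and L."""
    (tag, inner), = value.items()
    if tag == "S":
        return inner
    if tag == "N":
        try:
            return int(inner)
        except ValueError:
            return float(inner)
    if tag == "M":
        return {k: plain(v) for k, v in inner.items()}
    if tag == "L":
        return [plain(v) for v in inner]
    raise ValueError(f"unsupported type descriptor: {tag}")


def lambda_handler(event, context):
    """Return the new image of every record as a plain dict."""
    items = []
    for record in event.get("Records", []):
        image = record.get("dynamodb", {}).get("NewImage")
        if image is None:
            # REMOVE events carry only Keys/OldImage, so skip them.
            continue
        items.append({name: plain(value) for name, value in image.items()})
    return items


# Trimmed-down event for local testing; real events carry more fields.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "NewImage": {
                    "feas": {"M": {"fea": {"L": [{"M": {
                        "pre": {"N": "1"},
                        "Ti": {"S": "20160618184156529"},
                    }}]}}}
                }
            },
        },
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"id": {"S": "1"}}}},
    ]
}
print(lambda_handler(sample_event, None))
```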
Upvotes: 11