Reputation: 9
I'm working on migrating data from a CSV file stored in S3 to a table in DynamoDB. The code seems to work, but only the last data point ends up in DynamoDB. The primary partition key (serial) is the same for all data points. I'm not sure if I'm doing something wrong here; any help is greatly appreciated.
import boto3

s3_client = boto3.client("s3")
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('scan_records')

def lambda_handler(event, context):
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    s3_file_name = event['Records'][0]['s3']['object']['key']
    resp = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)
    data = resp['Body'].read().decode("utf-8")
    scan_time = data.split("\n")
    for scan in scan_time:
        print(scan)
        scan_data = scan.split(",")
        # Add it to DynamoDB
        try:
            table.put_item(
                Item={
                    'serial': scan_data[0],
                    'time': scan_data[1],
                }
            )
        except Exception as e:
            print("End of File")
Upvotes: 0
Views: 1426
Reputation: 9
After making the two changes below, the issue was resolved and the code works fine.

1. Check for a unique entry using the combination of partition key and sort key, for example with a condition expression:

ConditionExpression='attribute_not_exists(serial) AND attribute_not_exists(time)',

2. Add a loop to go line by line through the CSV file and ingest the data into DynamoDB.

Happy to share the code if anyone finds it useful.
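For reference, here is a minimal sketch of what those two changes can look like, assuming the scan_records table was recreated with serial as the partition key and time as the sort key. The loop and the condition expression follow the description above; the reserved-word handling and the duplicate check are illustrative additions, not taken from the original post.

import boto3
from botocore.exceptions import ClientError

s3_client = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")
# Assumes scan_records now has 'serial' as partition key and 'time' as sort key
table = dynamodb.Table("scan_records")


def lambda_handler(event, context):
    bucket_name = event["Records"][0]["s3"]["bucket"]["name"]
    s3_file_name = event["Records"][0]["s3"]["object"]["key"]
    resp = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)
    data = resp["Body"].read().decode("utf-8")

    # Go through the CSV line by line and write one item per row
    for line in data.splitlines():
        if not line.strip():
            continue  # skip blank lines such as a trailing newline
        scan_data = line.split(",")
        try:
            table.put_item(
                Item={
                    "serial": scan_data[0],
                    "time": scan_data[1],
                },
                # Only insert if this (serial, time) combination is not already present.
                # 'time' is a reserved word in DynamoDB expressions, so it is
                # referenced here through an expression attribute name.
                ConditionExpression="attribute_not_exists(serial) AND attribute_not_exists(#t)",
                ExpressionAttributeNames={"#t": "time"},
            )
        except ClientError as e:
            if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
                print(f"Skipping duplicate entry: {scan_data[0]}, {scan_data[1]}")
            else:
                raise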
Upvotes: 0
Reputation: 184
In your DynamoDB table, the primary key must be unique for each element. If your primary key consists only of a partition key, and that key is the same for all your data points, every write overwrites the same item, so you end up with a single element.

* You could add a sort key built from another field, so that the (partition key, sort key) pair composing the primary key is unique; the data would then be appended to your table instead of overwritten.
* If you can't build a unique primary key from your data points, you can always add a UUID to the primary key to make it unique, as in the sketch below.
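For the second option, a minimal sketch of the idea, assuming Python with boto3 and a table named scan_records (the put_scan helper and the serial#uuid key layout are illustrative, not from the original post):

import uuid

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("scan_records")


def put_scan(serial, scan_time):
    # Suffix the partition key with a UUID so every item is unique,
    # even when the same serial appears on every CSV row.
    table.put_item(
        Item={
            "serial": f"{serial}#{uuid.uuid4()}",
            "time": scan_time,
        }
    )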
Upvotes: 2