j panton

Reputation: 232

How to populate DynamoDB tables

Firstly, I'm very new to DynamoDB and AWS services in general, so I'm finding it hard when bombarded with all the details.

My problem is that I have an Excel file with my data in CSV format, and I'm looking to add that data to a DynamoDB table, for easy access for the Alexa function I'm looking to build. The format of the table is as follows:

ID, Name, Email, Number, Room
1534234, Dr Neesh Patel, [email protected], +44 (0)3424 111111, HW101

Some of the rows have empty fields.

But everywhere I look online, there doesn't appear to be an easy way to actually achieve this, and I can't find any official means either. So, with my limited knowledge of this area, I'm questioning whether I'm going about this entirely the wrong way. Firstly, am I thinking about this wrong? Should I be looking at a completely different solution for a backend database? I would have thought this would be a common task, but given the lack of support or easy solutions, am I wrong?

Secondly, if I am going about this the right way, how can it be done? I understand that DynamoDB requires a specific JSON format, and again there doesn't appear to be a straightforward way to convert my CSV into that format.

Thanks, guys.

Upvotes: 3

Views: 3870

Answers (2)

Kannaiyan

Reputation: 13035

I had the same problem when I started using DynamoDB. With distributed, big-data systems you really need to plan how to move data across systems. That is the place to start.

This is clearly documented here:

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SampleData.LoadData.html

Here are some more details to help you understand the process.

Step 1: Convert your CSV to a JSON file.

If you have a small amount of data, you can use an online tool:

http://www.convertcsv.com/csv-to-json.htm

 {
   "ID": 1534234,
   "Name": "Dr Neesh Patel",
   "Email": "[email protected]",
   "Number": "+44 (0)3424 111111",
   "Room": "HW101"
 }

You can see how nicely it formats the data, removes extra spaces, etc. Choose the right options and perform your conversion.

If your data is huge, then you will need big-data tools to process and convert it in parallel.

Step 2: Upload using the CLI (for small, one-time uploads).

aws dynamodb batch-write-item --request-items file://data.json
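Note that batch-write-item does not take the plain JSON from step 1 directly; as the linked documentation shows, it expects DynamoDB's request format, where each item is wrapped in a PutRequest and every attribute value is tagged with its type ("N" for numbers, "S" for strings). A sketch of data.json, assuming a table named Employees (substitute your own table name):

 {
   "Employees": [
     {
       "PutRequest": {
         "Item": {
           "ID": {"N": "1534234"},
           "Name": {"S": "Dr Neesh Patel"},
           "Email": {"S": "[email protected]"},
           "Number": {"S": "+44 (0)3424 111111"},
           "Room": {"S": "HW101"}
         }
       }
     }
   ]
 }

Also note that a single batch-write-item call accepts at most 25 items, so a larger file has to be split into batches.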

If you want to upload the file regularly, you need to create a data pipeline or a different process.
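For a scripted approach, here is a minimal sketch using Python and boto3 that reads the CSV and writes the rows in batches. The table name Employees and the file name data.csv are assumptions; batch_writer handles the 25-item batching for you:

    import csv
    import boto3

    # Assumed table name; change to match your table.
    table = boto3.resource("dynamodb").Table("Employees")

    with open("data.csv", newline="") as f, table.batch_writer() as batch:
        for row in csv.DictReader(f):
            # Drop empty fields: DynamoDB rejects empty strings as values.
            item = {k: v for k, v in row.items() if v}
            item["ID"] = int(item["ID"])  # assumed numeric partition key
            batch.put_item(Item=item)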

Hope it helps.

Upvotes: 4

Zaxxon

Reputation: 667

DynamoDB is cool. However, before you use it, you have to know your data usage patterns. For your case, if you're only ever going to query the DynamoDB table by ID, then it is great. If you need to query by any one column, or a combination of columns, then there are solutions for that (a sketch of the difference follows this list):

  • Elasticsearch in conjunction with DynamoDB (which can be expensive),
  • secondary indexes on the DynamoDB table (understand that each secondary index creates a full copy of your DynamoDB table with the columns you choose to project into the index),
  • ElastiCache in conjunction with DynamoDB (for tying searches back to the ID column),
  • RDS instead of DynamoDB (because a SQL-ish DB is better when you don't know your data usage patterns and you just don't want to think about it),
  • etc.
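As a rough illustration of the ID-only trade-off, a boto3 sketch (the Employees table name is an assumption, and ID is assumed to be the numeric partition key):

    import boto3
    from boto3.dynamodb.conditions import Attr

    table = boto3.resource("dynamodb").Table("Employees")

    # Lookup by partition key: fast, cheap, single-item read.
    person = table.get_item(Key={"ID": 1534234}).get("Item")

    # Lookup by a non-key attribute: without a secondary index this is a
    # full-table scan, which gets slow and expensive as the table grows.
    matches = table.scan(
        FilterExpression=Attr("Email").eq("[email protected]")
    )["Items"]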

It really depends: how much data you have and how you'll query it should define your architecture. For me, it would come down to weighing the cost and performance of each of the available options.

In terms of getting the data into your DynamoDB or RDS table:

  • AWS Glue may be able to work for you
  • AWS Lambda to programmatically get the data into your data store(s); a minimal sketch follows this list
  • perhaps others
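For the Lambda option, a minimal sketch of a function triggered by an S3 upload that parses the CSV and writes the rows to DynamoDB. The Employees table name and the numeric ID key are assumptions, and the S3 trigger has to be configured separately:

    import csv
    import io
    import boto3

    s3 = boto3.client("s3")
    table = boto3.resource("dynamodb").Table("Employees")  # assumed name

    def handler(event, context):
        # The S3 PUT event identifies the uploaded CSV file.
        record = event["Records"][0]["s3"]
        obj = s3.get_object(Bucket=record["bucket"]["name"],
                            Key=record["object"]["key"])
        rows = csv.DictReader(io.StringIO(obj["Body"].read().decode("utf-8")))
        with table.batch_writer() as batch:
            for row in rows:
                item = {k: v for k, v in row.items() if v}  # drop empty fields
                item["ID"] = int(item["ID"])  # assumed numeric partition key
                batch.put_item(Item=item)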

Upvotes: 2
